[Solved] Total size of all files in MongoDB GridFS

MongoDB crifan 2620 views 0 comments

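While working on:

[Solved] Storing local audio, subtitle and other data in a local MongoDB database

I had already used PyMongo to drive GridFS and saved all 171 files into it.

There are 300 chunks in total.

The original files take up a little over 6 GB on the file system.

Now I want to find out how large the data is in total once it has been stored in GridFS.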

gridfs total size

[mongodb-user] Get the total size of files stored in GridFS from the PHP driver – Grokbase

You have to add up every file yourself, which feels clumsy.
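A minimal sketch of that per-file summing in Python, assuming PyMongo and the default GridFS bucket name fs. Each document in fs.files carries a length field (the file size in bytes), so a $group aggregation can add them up on the server. The function name total_gridfs_length is an illustrative assumption, not something taken from the linked thread.

```python
def total_gridfs_length(files_collection):
    """Sum the `length` field (file size in bytes) over all GridFS file documents.

    `files_collection` is expected to behave like a PyMongo Collection for the
    bucket's files collection, e.g. db["fs.files"] (assumed default bucket).
    Returns a (total_bytes, file_count) tuple.
    """
    pipeline = [
        # Put every document into a single group and add up the lengths.
        {"$group": {"_id": None, "total": {"$sum": "$length"}, "count": {"$sum": 1}}},
    ]
    docs = list(files_collection.aggregate(pipeline))
    if not docs:
        # An empty collection produces no group document at all.
        return 0, 0
    return docs[0]["total"], docs[0]["count"]
```

With a live server this could be called as total_gridfs_length(db["fs.files"]), where db is a PyMongo Database object. The result is the logical size of the stored files, independent of how much disk space the server has allocated.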

Get the total size of files stored in GridFS from the PHP driver – Google Groups

chunks.stats()

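Let me check whether the API documentation has something like this:

I could not find any API documentation for it.

There is only the official manual: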

GridFS — MongoDB Manual 3.6

I don't see stats there.

PyCharm's code completion does not show a matching method either.

What’s the maximum size for GridFS on MongoDB? – Stack Overflow

db.collection.totalSize() — MongoDB Manual 3.6

It is right there: db.collection.totalSize

Collection Methods — MongoDB Manual 3.6

And there really is:

db.collection.stats()

Reports on the state of a collection. Provides a wrapper around the collStats.

These all belong to:

mongo Shell Methods — MongoDB Manual 3.6

That is:

These interfaces only exist in the mongo shell.

-> The gridfs driver for Python used here does not seem to implement them.

Trying it out, this one does not look like what I want:

<code>> db.fs.stats()
{
    "ns" : "gridfs.fs",
    "ok" : 0,
    "errmsg" : "Collection [gridfs.fs] not found."
}
</code>

This one looks like what I want:

<code>> db.fs.files.stats()
{
    "ns" : "gridfs.fs.files",
    "size" : 155368,
    "count" : 171,
    "avgObjSize" : 908,
    "storageSize" : 139264,
    "capped" : false,
    "wiredTiger" : {
        "metadata" : {
            "formatVersion" : 1
        },
        "creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
        "type" : "file",
        "uri" : "statistics:table:collection-11-6711652102439670599",
        "LSM" : {
            "bloom filter false positives" : 0,
            "bloom filter hits" : 0,
            "bloom filter misses" : 0,
            "bloom filter pages evicted from cache" : 0,
            "bloom filter pages read into cache" : 0,
            "bloom filters in the LSM tree" : 0,
            "chunks in the LSM tree" : 0,
            "highest merge generation in the LSM tree" : 0,
            "queries that could have benefited from a Bloom filter that did not exist" : 0,
            "sleep for LSM checkpoint throttle" : 0,
            "sleep for LSM merge throttle" : 0,
            "total size of bloom filters" : 0
        },
        "block-manager" : {
            "allocations requiring file extension" : 45,
            "blocks allocated" : 186,
            "blocks freed" : 79,
            "checkpoint size" : 57344,
            "file allocation unit size" : 4096,
            "file bytes available for reuse" : 65536,
            "file magic number" : 120897,
            "file major version number" : 1,
            "file size in bytes" : 139264,
            "minor version number" : 0
        },
        "btree" : {
            "btree checkpoint generation" : 1463,
            "column-store fixed-size leaf pages" : 0,
            "column-store internal pages" : 0,
            "column-store variable-size RLE encoded values" : 0,
            "column-store variable-size deleted values" : 0,
            "column-store variable-size leaf pages" : 0,
            "fixed-record size" : 0,
            "maximum internal page key size" : 368,
            "maximum internal page size" : 4096,
            "maximum leaf page key size" : 2867,
            "maximum leaf page size" : 32768,
            "maximum leaf page value size" : 67108864,
            "maximum tree depth" : 3,
            "number of key/value pairs" : 0,
            "overflow pages" : 0,
            "pages rewritten by compaction" : 0,
            "row-store internal pages" : 0,
            "row-store leaf pages" : 0
        },
        "cache" : {
            "bytes currently in the cache" : 178660,
            "bytes read into cache" : 72411,
            "bytes written from cache" : 1635289,
            "checkpoint blocked page eviction" : 0,
            "data source pages selected for eviction unable to be evicted" : 0,
            "eviction walk passes of a file" : 78,
            "eviction walk target pages histogram - 0-9" : 57,
            "eviction walk target pages histogram - 10-31" : 21,
            "eviction walk target pages histogram - 128 and higher" : 0,
            "eviction walk target pages histogram - 32-63" : 0,
            "eviction walk target pages histogram - 64-128" : 0,
            "eviction walks abandoned" : 0,
            "eviction walks gave up because they restarted their walk twice" : 71,
            "eviction walks gave up because they saw too many pages and found no candidates" : 0,
            "eviction walks gave up because they saw too many pages and found too few candidates" : 0,
            "eviction walks reached end of tree" : 148,
            "eviction walks started from root of tree" : 76,
            "eviction walks started from saved location in tree" : 2,
            "hazard pointer blocked page eviction" : 0,
            "in-memory page passed criteria to be split" : 0,
            "in-memory page splits" : 0,
            "internal pages evicted" : 0,
            "internal pages split during eviction" : 0,
            "leaf pages split during eviction" : 13,
            "modified pages evicted" : 48,
            "overflow pages read into cache" : 0,
            "page split during eviction deepened the tree" : 0,
            "page written requiring lookaside records" : 0,
            "pages read into cache" : 3,
            "pages read into cache requiring lookaside entries" : 0,
            "pages requested from the cache" : 6854,
            "pages seen by eviction walk" : 270,
            "pages written from cache" : 118,
            "pages written requiring in-memory restoration" : 4,
            "tracked dirty bytes in the cache" : 0,
            "unmodified pages evicted" : 0
        },
        "cache_walk" : {
            "Average difference between current eviction generation when the page was last considered" : 0,
            "Average on-disk page image size seen" : 0,
            "Average time in cache for pages that have been visited by the eviction server" : 0,
            "Average time in cache for pages that have not been visited by the eviction server" : 0,
            "Clean pages currently in cache" : 0,
            "Current eviction generation" : 0,
            "Dirty pages currently in cache" : 0,
            "Entries in the root page" : 0,
            "Internal pages currently in cache" : 0,
            "Leaf pages currently in cache" : 0,
            "Maximum difference between current eviction generation when the page was last considered" : 0,
            "Maximum page size seen" : 0,
            "Minimum on-disk page image size seen" : 0,
            "Number of pages never visited by eviction server" : 0,
            "On-disk page image sizes smaller than a single allocation unit" : 0,
            "Pages created in memory and never written" : 0,
            "Pages currently queued for eviction" : 0,
            "Pages that could not be queued for eviction" : 0,
            "Refs skipped during cache traversal" : 0,
            "Size of the root page" : 0,
            "Total number of pages currently in cache" : 0
        },
        "compression" : {
            "compressed pages read" : 3,
            "compressed pages written" : 65,
            "page written failed to compress" : 0,
            "page written was too small to compress" : 53,
            "raw compression call failed, additional data available" : 0,
            "raw compression call failed, no additional data available" : 0,
            "raw compression call succeeded" : 0
        },
        "cursor" : {
            "bulk-loaded cursor-insert calls" : 0,
            "create calls" : 4,
            "cursor-insert key and value bytes inserted" : 1251147,
            "cursor-remove key bytes removed" : 2359,
            "cursor-update value bytes updated" : 0,
            "insert calls" : 1382,
            "modify calls" : 0,
            "next calls" : 4345,
            "prev calls" : 1,
            "remove calls" : 1211,
            "reserve calls" : 0,
            "reset calls" : 7797,
            "restarted searches" : 0,
            "search calls" : 3641,
            "search near calls" : 14,
            "truncate calls" : 0,
            "update calls" : 0
        },
        "reconciliation" : {
            "dictionary matches" : 0,
            "fast-path pages deleted" : 0,
            "internal page key bytes discarded using suffix compression" : 102,
            "internal page multi-block writes" : 0,
            "internal-page overflow keys" : 0,
            "leaf page key bytes discarded using prefix compression" : 0,
            "leaf page multi-block writes" : 19,
            "leaf-page overflow keys" : 0,
            "maximum blocks required for a page" : 1,
            "overflow values written" : 0,
            "page checksum matches" : 3,
            "page reconciliation calls" : 114,
            "page reconciliation calls for eviction" : 41,
            "pages deleted" : 35
        },
        "session" : {
            "object compaction" : 0,
            "open cursor count" : 3
        },
        "transaction" : {
            "update conflicts" : 0
        }
    },
    "nindexes" : 2,
    "totalIndexSize" : 81920,
    "indexSizes" : {
        "_id_" : 36864,
        "filename_1_uploadDate_1" : 45056
    },
    "ok" : 1
}
</code>

But the size values in there do not look right:

155368 bytes = 155368/1024 ≈ 151.7 KB?

The real data is probably in chunks.

<code>> db.fs.files.totalSize()
221184
> db.fs.chunks.totalSize()
825204736
</code>

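That is indeed the case:

Total size of files = 221184 bytes = 221184/1024 = 216 KB

Total size of chunks = 825204736 bytes = 825204736/(1024*1024) ≈ 786.98 MB?

Did 6 GB of files really come to only a bit over 700 MB once stored?

Or did I get something wrong?

-> I got it wrong:

-> Only 171 files out of that 6 GB+ (the audio files with good sound quality) were saved here, which is just part of the whole.

-> Many others, such as PDFs and poor-quality audio, were not saved. So there are only 700-odd MB of audio files here.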
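The unit conversions above can be double-checked with a short Python sketch. The helper names to_kib and to_mib are illustrative assumptions; the byte counts are the ones reported by the shell.

```python
def to_kib(num_bytes):
    """Convert bytes to kibibytes (1 KiB = 1024 bytes)."""
    return num_bytes / 1024


def to_mib(num_bytes):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    # The parentheses matter: n / 1024 * 1024 would simply give n back.
    return num_bytes / (1024 * 1024)


print(round(to_kib(155368), 2))     # 151.73
print(round(to_kib(221184), 2))     # 216.0
print(round(to_mib(825204736), 2))  # 786.98
```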

Taking another look:

<code>> db.fs.chunks.stats()
{
    "ns" : "gridfs.fs.chunks",
    "size" : 487720549,
    "count" : 1940,
    "avgObjSize" : 251402,
    "storageSize" : 825081856,
    "capped" : false,
    "wiredTiger" : {
        "metadata" : {
            "formatVersion" : 1
        },
        "creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
        "type" : "file",
        "uri" : "statistics:table:collection-9-6711652102439670599",
        "LSM" : {
            "bloom filter false positives" : 0,
            "bloom filter hits" : 0,
            "bloom filter misses" : 0,
            "bloom filter pages evicted from cache" : 0,
            "bloom filter pages read into cache" : 0,
            "bloom filters in the LSM tree" : 0,
            "chunks in the LSM tree" : 0,
            "highest merge generation in the LSM tree" : 0,
            "queries that could have benefited from a Bloom filter that did not exist" : 0,
            "sleep for LSM checkpoint throttle" : 0,
            "sleep for LSM merge throttle" : 0,
            "total size of bloom filters" : 0
        },
        "block-manager" : {
            "allocations requiring file extension" : 7560,
            "blocks allocated" : 16018,
            "blocks freed" : 13974,
            "checkpoint size" : 428584960,
            "file allocation unit size" : 4096,
            "file bytes available for reuse" : 396480512,
            "file magic number" : 120897,
            "file major version number" : 1,
            "file size in bytes" : 825081856,
            "minor version number" : 0
        },
        "btree" : {
            "btree checkpoint generation" : 1469,
            "column-store fixed-size leaf pages" : 0,
            "column-store internal pages" : 0,
            "column-store variable-size RLE encoded values" : 0,
            "column-store variable-size deleted values" : 0,
            "column-store variable-size leaf pages" : 0,
            "fixed-record size" : 0,
            "maximum internal page key size" : 368,
            "maximum internal page size" : 4096,
            "maximum leaf page key size" : 2867,
            "maximum leaf page size" : 32768,
            "maximum leaf page value size" : 67108864,
            "maximum tree depth" : 3,
            "number of key/value pairs" : 0,
            "overflow pages" : 0,
            "pages rewritten by compaction" : 0,
            "row-store internal pages" : 0,
            "row-store leaf pages" : 0
        },
        "cache" : {
            "bytes currently in the cache" : 527641271,
            "bytes read into cache" : 593098632,
            "bytes written from cache" : 3999808086,
            "checkpoint blocked page eviction" : 0,
            "data source pages selected for eviction unable to be evicted" : 11,
            "eviction walk passes of a file" : 1061,
            "eviction walk target pages histogram - 0-9" : 299,
            "eviction walk target pages histogram - 10-31" : 48,
            "eviction walk target pages histogram - 128 and higher" : 0,
            "eviction walk target pages histogram - 32-63" : 66,
            "eviction walk target pages histogram - 64-128" : 648,
            "eviction walks abandoned" : 36,
            "eviction walks gave up because they restarted their walk twice" : 44,
            "eviction walks gave up because they saw too many pages and found no candidates" : 118,
            "eviction walks gave up because they saw too many pages and found too few candidates" : 13,
            "eviction walks reached end of tree" : 423,
            "eviction walks started from root of tree" : 215,
            "eviction walks started from saved location in tree" : 846,
            "hazard pointer blocked page eviction" : 5,
            "in-memory page passed criteria to be split" : 976,
            "in-memory page splits" : 482,
            "internal pages evicted" : 0,
            "internal pages split during eviction" : 0,
            "leaf pages split during eviction" : 412,
            "modified pages evicted" : 12625,
            "overflow pages read into cache" : 0,
            "page split during eviction deepened the tree" : 0,
            "page written requiring lookaside records" : 0,
            "pages read into cache" : 2418,
            "pages read into cache requiring lookaside entries" : 0,
            "pages requested from the cache" : 143872,
            "pages seen by eviction walk" : 251679,
            "pages written from cache" : 15948,
            "pages written requiring in-memory restoration" : 3,
            "tracked dirty bytes in the cache" : 0,
            "unmodified pages evicted" : 0
        },
        "cache_walk" : {
            "Average difference between current eviction generation when the page was last considered" : 0,
            "Average on-disk page image size seen" : 0,
            "Average time in cache for pages that have been visited by the eviction server" : 0,
            "Average time in cache for pages that have not been visited by the eviction server" : 0,
            "Clean pages currently in cache" : 0,
            "Current eviction generation" : 0,
            "Dirty pages currently in cache" : 0,
            "Entries in the root page" : 0,
            "Internal pages currently in cache" : 0,
            "Leaf pages currently in cache" : 0,
            "Maximum difference between current eviction generation when the page was last considered" : 0,
            "Maximum page size seen" : 0,
            "Minimum on-disk page image size seen" : 0,
            "Number of pages never visited by eviction server" : 0,
            "On-disk page image sizes smaller than a single allocation unit" : 0,
            "Pages created in memory and never written" : 0,
            "Pages currently queued for eviction" : 0,
            "Pages that could not be queued for eviction" : 0,
            "Refs skipped during cache traversal" : 0,
            "Size of the root page" : 0,
            "Total number of pages currently in cache" : 0
        },
        "compression" : {
            "compressed pages read" : 868,
            "compressed pages written" : 9137,
            "page written failed to compress" : 6694,
            "page written was too small to compress" : 120,
            "raw compression call failed, additional data available" : 0,
            "raw compression call failed, no additional data available" : 0,
            "raw compression call succeeded" : 0
        },
        "cursor" : {
            "bulk-loaded cursor-insert calls" : 0,
            "create calls" : 4,
            "cursor-insert key and value bytes inserted" : 3990244820,
            "cursor-remove key bytes removed" : 33198,
            "cursor-update value bytes updated" : 0,
            "insert calls" : 15864,
            "modify calls" : 0,
            "next calls" : 2124,
            "prev calls" : 1,
            "remove calls" : 13924,
            "reserve calls" : 0,
            "reset calls" : 61053,
            "restarted searches" : 0,
            "search calls" : 41900,
            "search near calls" : 82,
            "truncate calls" : 0,
            "update calls" : 0
        },
        "reconciliation" : {
            "dictionary matches" : 0,
            "fast-path pages deleted" : 0,
            "internal page key bytes discarded using suffix compression" : 15722,
            "internal page multi-block writes" : 12,
            "internal-page overflow keys" : 0,
            "leaf page key bytes discarded using prefix compression" : 0,
            "leaf page multi-block writes" : 509,
            "leaf-page overflow keys" : 0,
            "maximum blocks required for a page" : 1,
            "overflow values written" : 0,
            "page checksum matches" : 354,
            "page reconciliation calls" : 12777,
            "page reconciliation calls for eviction" : 11772,
            "pages deleted" : 12220
        },
        "session" : {
            "object compaction" : 0,
            "open cursor count" : 3
        },
        "transaction" : {
            "update conflicts" : 0
        }
    },
    "nindexes" : 2,
    "totalIndexSize" : 122880,
    "indexSizes" : {
        "_id_" : 61440,
        "files_id_1_n_1" : 61440
    },
    "ok" : 1
}
</code>

Then try the same thing on the gridfs object in Python:

<code>logging.info("fsCollection.stats()=%s", fsCollection.stats())
logging.info("fsCollection.totalSize()=%s", fsCollection.totalSize())
</code>

They really are not there:

<code>    logging.info("fsCollection.stats()=%s", fsCollection.stats())
AttributeError: 'GridFS' object has no attribute 'stats'
    logging.info("fsCollection.totalSize()=%s", fsCollection.totalSize())
AttributeError: 'GridFS' object has no attribute 'totalSize'
</code>
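Although the GridFS object has no stats() or totalSize(), the same figures should be reachable from Python through the Database object, because the shell's stats() is a wrapper around the collStats command. The sketch below is one way to do that, assuming PyMongo's Database.command() and the default bucket name fs. The function name gridfs_collection_stats is hypothetical, and newer server versions may prefer the $collStats aggregation stage over this command.

```python
def gridfs_collection_stats(db, bucket="fs"):
    """Return selected collStats fields for <bucket>.files and <bucket>.chunks.

    `db` is expected to behave like a PyMongo Database. The bucket name "fs"
    is the GridFS default and is an assumption here.
    """
    wanted = ("count", "size", "storageSize", "totalIndexSize")
    summary = {}
    for suffix in ("files", "chunks"):
        # Database.command(name, value) sends {"collStats": "<collection name>"}.
        stats = db.command("collStats", f"{bucket}.{suffix}")
        summary[suffix] = {key: stats[key] for key in wanted}
    return summary
```

It would be called on the same Database object that was passed to gridfs.GridFS(...), for example gridfs_collection_stats(db)["chunks"]["storageSize"].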

[Summary]

<code>> db.fs.chunks.stats()
{
    "ns" : "gridfs.fs.chunks",
    "size" : 487720549,
    "count" : 1940,
    "avgObjSize" : 251402,
    "storageSize" : 825081856,
...
</code>

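This shows the detailed information. Its size and storageSize both differ from the value returned by:

<code>> db.fs.chunks.totalSize()
825204736
</code>

They measure different things: size is the uncompressed size of the documents in the collection (487720549 bytes, about 465 MB), storageSize is the disk space allocated to the collection, and totalSize() is storageSize plus totalIndexSize (825081856 + 122880 = 825204736).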

The same holds for files:

<code>> db.fs.files.stats()
{
    "ns" : "gridfs.fs.files",
    "size" : 155368,
    "count" : 171,
    "avgObjSize" : 908,
    "storageSize" : 139264,
    "capped" : false,
</code>

and:

<code>> db.fs.files.totalSize()
221184
</code>
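As a quick sanity check on how these figures relate, the sketch below verifies that totalSize() equals storageSize plus totalIndexSize for both collections. All numbers are copied from the shell output in this post; the helper name matches_total_size is only illustrative.

```python
# Figures copied from the db.fs.files / db.fs.chunks shell output in this post.
stats = {
    "fs.files": {"storageSize": 139264, "totalIndexSize": 81920, "totalSize": 221184},
    "fs.chunks": {"storageSize": 825081856, "totalIndexSize": 122880, "totalSize": 825204736},
}


def matches_total_size(entry):
    """True if storageSize + totalIndexSize equals the reported totalSize()."""
    return entry["storageSize"] + entry["totalIndexSize"] == entry["totalSize"]


for name, entry in stats.items():
    print(name, matches_total_size(entry))
# fs.files True
# fs.chunks True
```

Note that storageSize can include free space inside the data file. The stats() output for fs.chunks reports 396480512 "file bytes available for reuse", which likely accounts for most of the gap between size and storageSize.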
