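Background:
[Solved] Saving local audio, subtitle and other data into a local MongoDB database
While working on that, I used PyMongo's GridFS support to save all 171 files.
There are 300 chunk blocks in total.
The original files take up a bit over 6 GB on the file system.
Now I want to find out the total size of the data after it has been stored in GridFS.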
gridfs total size
[mongodb-user] Get the total size of files stored in GridFS from the PHP driver – Grokbase
You have to add up each file's size yourself, which feels clumsy.
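Adding the files up by hand need not be tedious: each document in `fs.files` carries a `length` field holding the file size in bytes, so the total can be summed in a few lines. This is a minimal sketch, assuming the default `fs` bucket. The function names are made up for illustration, and `db` stands for a PyMongo `Database` object supplied by the caller.

```python
def total_length(file_docs):
    """Sum the `length` field (file size in bytes) over GridFS file documents."""
    return sum(doc["length"] for doc in file_docs)


def total_length_in_db(db, bucket="fs"):
    """Let the server do the sum with an aggregation over <bucket>.files.

    `db` is assumed to be a pymongo Database; nothing is imported here
    because the object is passed in by the caller.
    """
    pipeline = [{"$group": {"_id": None, "total": {"$sum": "$length"}}}]
    result = list(db[bucket + ".files"].aggregate(pipeline))
    return result[0]["total"] if result else 0


# Works on any iterable of documents, e.g. db["fs.files"].find({}, {"length": 1})
sample = [{"length": 1024}, {"length": 2048}]
print(total_length(sample))
```

This gives the logical size of the stored files, which is not the same thing as the space the collections occupy on disk.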
Get the total size of files stored in GridFS from the PHP driver – Google Groups
chunks.stats()
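Check whether the API documentation has it:
I could not find dedicated API docs for it,
only the documentation on the official site,
which does not show a stats method.
PyCharm's code completion does not list a matching method either.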
What’s the maximum size for GridFS on MongoDB? – Stack Overflow
db.collection.totalSize() — MongoDB Manual 3.6
There it is directly: db.collection.totalSize
Collection Methods — MongoDB Manual 3.6
And it does indeed list:
db.collection.stats()
Reports on the state of a collection. Provides a wrapper around the collStats.
Both belong to:
mongo Shell Methods — MongoDB Manual 3.6
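In other words:
these interfaces exist only in the mongo shell
-> and the Python gridfs driver used here apparently does not implement them.
Trying it in the shell; this does not seem to be what I want: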
<code>> db.fs.stats()
{ "ns" : "gridfs.fs", "ok" : 0, "errmsg" : "Collection [gridfs.fs] not found." }
</code>
This looks like what I want:
<code>> db.fs.files.stats() { "ns" : "gridfs.fs.files", "size" : 155368, "count" : 171, "avgObjSize" : 908, "storageSize" : 139264, "capped" : false, "wiredTiger" : { "metadata" : { "formatVersion" : 1 }, "creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u", "type" : "file", "uri" : "statistics:table:collection-11-6711652102439670599", "LSM" : { "bloom filter false positives" : 0, "bloom filter hits" : 0, "bloom filter misses" : 0, "bloom filter pages evicted from cache" : 0, "bloom filter pages read into cache" : 0, "bloom filters in the LSM tree" : 0, "chunks in the LSM tree" : 0, "highest merge generation in the LSM tree" : 0, "queries that could have benefited from a Bloom filter that did not exist" : 0, "sleep for LSM checkpoint throttle" : 0, "sleep for LSM merge throttle" : 0, "total size of bloom filters" : 0 }, "block-manager" : { "allocations requiring file extension" : 45, "blocks allocated" : 186, "blocks freed" : 79, "checkpoint size" : 57344, "file allocation unit size" : 
4096, "file bytes available for reuse" : 65536, "file magic number" : 120897, "file major version number" : 1, "file size in bytes" : 139264, "minor version number" : 0 }, "btree" : { "btree checkpoint generation" : 1463, "column-store fixed-size leaf pages" : 0, "column-store internal pages" : 0, "column-store variable-size RLE encoded values" : 0, "column-store variable-size deleted values" : 0, "column-store variable-size leaf pages" : 0, "fixed-record size" : 0, "maximum internal page key size" : 368, "maximum internal page size" : 4096, "maximum leaf page key size" : 2867, "maximum leaf page size" : 32768, "maximum leaf page value size" : 67108864, "maximum tree depth" : 3, "number of key/value pairs" : 0, "overflow pages" : 0, "pages rewritten by compaction" : 0, "row-store internal pages" : 0, "row-store leaf pages" : 0 }, "cache" : { "bytes currently in the cache" : 178660, "bytes read into cache" : 72411, "bytes written from cache" : 1635289, "checkpoint blocked page eviction" : 0, "data source pages selected for eviction unable to be evicted" : 0, "eviction walk passes of a file" : 78, "eviction walk target pages histogram - 0-9" : 57, "eviction walk target pages histogram - 10-31" : 21, "eviction walk target pages histogram - 128 and higher" : 0, "eviction walk target pages histogram - 32-63" : 0, "eviction walk target pages histogram - 64-128" : 0, "eviction walks abandoned" : 0, "eviction walks gave up because they restarted their walk twice" : 71, "eviction walks gave up because they saw too many pages and found no candidates" : 0, "eviction walks gave up because they saw too many pages and found too few candidates" : 0, "eviction walks reached end of tree" : 148, "eviction walks started from root of tree" : 76, "eviction walks started from saved location in tree" : 2, "hazard pointer blocked page eviction" : 0, "in-memory page passed criteria to be split" : 0, "in-memory page splits" : 0, "internal pages evicted" : 0, "internal pages split during 
eviction" : 0, "leaf pages split during eviction" : 13, "modified pages evicted" : 48, "overflow pages read into cache" : 0, "page split during eviction deepened the tree" : 0, "page written requiring lookaside records" : 0, "pages read into cache" : 3, "pages read into cache requiring lookaside entries" : 0, "pages requested from the cache" : 6854, "pages seen by eviction walk" : 270, "pages written from cache" : 118, "pages written requiring in-memory restoration" : 4, "tracked dirty bytes in the cache" : 0, "unmodified pages evicted" : 0 }, "cache_walk" : { "Average difference between current eviction generation when the page was last considered" : 0, "Average on-disk page image size seen" : 0, "Average time in cache for pages that have been visited by the eviction server" : 0, "Average time in cache for pages that have not been visited by the eviction server" : 0, "Clean pages currently in cache" : 0, "Current eviction generation" : 0, "Dirty pages currently in cache" : 0, "Entries in the root page" : 0, "Internal pages currently in cache" : 0, "Leaf pages currently in cache" : 0, "Maximum difference between current eviction generation when the page was last considered" : 0, "Maximum page size seen" : 0, "Minimum on-disk page image size seen" : 0, "Number of pages never visited by eviction server" : 0, "On-disk page image sizes smaller than a single allocation unit" : 0, "Pages created in memory and never written" : 0, "Pages currently queued for eviction" : 0, "Pages that could not be queued for eviction" : 0, "Refs skipped during cache traversal" : 0, "Size of the root page" : 0, "Total number of pages currently in cache" : 0 }, "compression" : { "compressed pages read" : 3, "compressed pages written" : 65, "page written failed to compress" : 0, "page written was too small to compress" : 53, "raw compression call failed, additional data available" : 0, "raw compression call failed, no additional data available" : 0, "raw compression call succeeded" : 0 }, 
"cursor" : { "bulk-loaded cursor-insert calls" : 0, "create calls" : 4, "cursor-insert key and value bytes inserted" : 1251147, "cursor-remove key bytes removed" : 2359, "cursor-update value bytes updated" : 0, "insert calls" : 1382, "modify calls" : 0, "next calls" : 4345, "prev calls" : 1, "remove calls" : 1211, "reserve calls" : 0, "reset calls" : 7797, "restarted searches" : 0, "search calls" : 3641, "search near calls" : 14, "truncate calls" : 0, "update calls" : 0 }, "reconciliation" : { "dictionary matches" : 0, "fast-path pages deleted" : 0, "internal page key bytes discarded using suffix compression" : 102, "internal page multi-block writes" : 0, "internal-page overflow keys" : 0, "leaf page key bytes discarded using prefix compression" : 0, "leaf page multi-block writes" : 19, "leaf-page overflow keys" : 0, "maximum blocks required for a page" : 1, "overflow values written" : 0, "page checksum matches" : 3, "page reconciliation calls" : 114, "page reconciliation calls for eviction" : 41, "pages deleted" : 35 }, "session" : { "object compaction" : 0, "open cursor count" : 3 }, "transaction" : { "update conflicts" : 0 } }, "nindexes" : 2, "totalIndexSize" : 81920, "indexSizes" : { "_id_" : 36864, "filename_1_uploadDate_1" : 45056 }, "ok" : 1 } </code>
But the size in there does not look right:
155368 bytes = 155368/1024 ≈ 151.7 KB?
The chunks collection is probably the right place to look.
<code>> db.fs.files.totalSize()
221184
> db.fs.chunks.totalSize()
825204736
</code>
Sure enough:
files total size = 221184 bytes = 221184/1024 = 216 KB
chunks total size = 825204736 bytes = 825204736/(1024*1024) ≈ 786.98 MB?
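Unit conversions like these are easy to get wrong because of operator precedence: `x/1024*1024` divides and then multiplies back, which returns `x` again. A small sketch using the two totals from the shell output; `to_mib` is an illustrative helper name.

```python
def to_mib(num_bytes):
    """Convert bytes to MiB; the parentheses keep 1024 * 1024 together."""
    return num_bytes / (1024 * 1024)


files_total = 221184       # db.fs.files.totalSize()
chunks_total = 825204736   # db.fs.chunks.totalSize()

print(files_total / 1024)               # size of fs.files in KB
print(round(to_mib(chunks_total), 2))   # size of fs.chunks in MB
print(chunks_total / 1024 * 1024)       # precedence pitfall: the byte count again
```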
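Could 6 GB of files really take up only 700-odd MB once stored?
Or did I get something wrong?
-> I got it wrong:
-> only 171 audio files (the ones with good sound quality) out of those 6-odd GB were saved here, which is just a part of the whole.
-> Many other files, such as PDFs and audio with poor sound quality, were not saved. So there are only 700-odd MB of audio files here.
Taking another look: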
<code>> db.fs.chunks.stats() { "ns" : "gridfs.fs.chunks", "size" : 487720549, "count" : 1940, "avgObjSize" : 251402, "storageSize" : 825081856, "capped" : false, "wiredTiger" : { "metadata" : { "formatVersion" : 1 }, "creationString" : "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=1),assert=(commit_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u", "type" : "file", "uri" : "statistics:table:collection-9-6711652102439670599", "LSM" : { "bloom filter false positives" : 0, "bloom filter hits" : 0, "bloom filter misses" : 0, "bloom filter pages evicted from cache" : 0, "bloom filter pages read into cache" : 0, "bloom filters in the LSM tree" : 0, "chunks in the LSM tree" : 0, "highest merge generation in the LSM tree" : 0, "queries that could have benefited from a Bloom filter that did not exist" : 0, "sleep for LSM checkpoint throttle" : 0, "sleep for LSM merge throttle" : 0, "total size of bloom filters" : 0 }, "block-manager" : { "allocations requiring file extension" : 7560, "blocks allocated" : 16018, "blocks freed" : 13974, "checkpoint size" : 428584960, "file 
allocation unit size" : 4096, "file bytes available for reuse" : 396480512, "file magic number" : 120897, "file major version number" : 1, "file size in bytes" : 825081856, "minor version number" : 0 }, "btree" : { "btree checkpoint generation" : 1469, "column-store fixed-size leaf pages" : 0, "column-store internal pages" : 0, "column-store variable-size RLE encoded values" : 0, "column-store variable-size deleted values" : 0, "column-store variable-size leaf pages" : 0, "fixed-record size" : 0, "maximum internal page key size" : 368, "maximum internal page size" : 4096, "maximum leaf page key size" : 2867, "maximum leaf page size" : 32768, "maximum leaf page value size" : 67108864, "maximum tree depth" : 3, "number of key/value pairs" : 0, "overflow pages" : 0, "pages rewritten by compaction" : 0, "row-store internal pages" : 0, "row-store leaf pages" : 0 }, "cache" : { "bytes currently in the cache" : 527641271, "bytes read into cache" : 593098632, "bytes written from cache" : 3999808086, "checkpoint blocked page eviction" : 0, "data source pages selected for eviction unable to be evicted" : 11, "eviction walk passes of a file" : 1061, "eviction walk target pages histogram - 0-9" : 299, "eviction walk target pages histogram - 10-31" : 48, "eviction walk target pages histogram - 128 and higher" : 0, "eviction walk target pages histogram - 32-63" : 66, "eviction walk target pages histogram - 64-128" : 648, "eviction walks abandoned" : 36, "eviction walks gave up because they restarted their walk twice" : 44, "eviction walks gave up because they saw too many pages and found no candidates" : 118, "eviction walks gave up because they saw too many pages and found too few candidates" : 13, "eviction walks reached end of tree" : 423, "eviction walks started from root of tree" : 215, "eviction walks started from saved location in tree" : 846, "hazard pointer blocked page eviction" : 5, "in-memory page passed criteria to be split" : 976, "in-memory page splits" : 482, 
"internal pages evicted" : 0, "internal pages split during eviction" : 0, "leaf pages split during eviction" : 412, "modified pages evicted" : 12625, "overflow pages read into cache" : 0, "page split during eviction deepened the tree" : 0, "page written requiring lookaside records" : 0, "pages read into cache" : 2418, "pages read into cache requiring lookaside entries" : 0, "pages requested from the cache" : 143872, "pages seen by eviction walk" : 251679, "pages written from cache" : 15948, "pages written requiring in-memory restoration" : 3, "tracked dirty bytes in the cache" : 0, "unmodified pages evicted" : 0 }, "cache_walk" : { "Average difference between current eviction generation when the page was last considered" : 0, "Average on-disk page image size seen" : 0, "Average time in cache for pages that have been visited by the eviction server" : 0, "Average time in cache for pages that have not been visited by the eviction server" : 0, "Clean pages currently in cache" : 0, "Current eviction generation" : 0, "Dirty pages currently in cache" : 0, "Entries in the root page" : 0, "Internal pages currently in cache" : 0, "Leaf pages currently in cache" : 0, "Maximum difference between current eviction generation when the page was last considered" : 0, "Maximum page size seen" : 0, "Minimum on-disk page image size seen" : 0, "Number of pages never visited by eviction server" : 0, "On-disk page image sizes smaller than a single allocation unit" : 0, "Pages created in memory and never written" : 0, "Pages currently queued for eviction" : 0, "Pages that could not be queued for eviction" : 0, "Refs skipped during cache traversal" : 0, "Size of the root page" : 0, "Total number of pages currently in cache" : 0 }, "compression" : { "compressed pages read" : 868, "compressed pages written" : 9137, "page written failed to compress" : 6694, "page written was too small to compress" : 120, "raw compression call failed, additional data available" : 0, "raw compression call 
failed, no additional data available" : 0, "raw compression call succeeded" : 0 }, "cursor" : { "bulk-loaded cursor-insert calls" : 0, "create calls" : 4, "cursor-insert key and value bytes inserted" : 3990244820, "cursor-remove key bytes removed" : 33198, "cursor-update value bytes updated" : 0, "insert calls" : 15864, "modify calls" : 0, "next calls" : 2124, "prev calls" : 1, "remove calls" : 13924, "reserve calls" : 0, "reset calls" : 61053, "restarted searches" : 0, "search calls" : 41900, "search near calls" : 82, "truncate calls" : 0, "update calls" : 0 }, "reconciliation" : { "dictionary matches" : 0, "fast-path pages deleted" : 0, "internal page key bytes discarded using suffix compression" : 15722, "internal page multi-block writes" : 12, "internal-page overflow keys" : 0, "leaf page key bytes discarded using prefix compression" : 0, "leaf page multi-block writes" : 509, "leaf-page overflow keys" : 0, "maximum blocks required for a page" : 1, "overflow values written" : 0, "page checksum matches" : 354, "page reconciliation calls" : 12777, "page reconciliation calls for eviction" : 11772, "pages deleted" : 12220 }, "session" : { "object compaction" : 0, "open cursor count" : 3 }, "transaction" : { "update conflicts" : 0 } }, "nindexes" : 2, "totalIndexSize" : 122880, "indexSizes" : { "_id_" : 61440, "files_id_1_n_1" : 61440 }, "ok" : 1 } </code>
Then try the same calls on the gridfs object in Python:
<code>logging.info("fsCollection.stats()=%s", fsCollection.stats())
logging.info("fsCollection.totalSize()=%s", fsCollection.totalSize())
</code>
They really do not exist:
<code>logging.info("fsCollection.stats()=%s", fsCollection.stats())
AttributeError: 'GridFS' object has no attribute 'stats'
logging.info("fsCollection.totalSize()=%s", fsCollection.totalSize())
AttributeError: 'GridFS' object has no attribute 'totalSize'
</code>
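The `GridFS` object exposes no such methods, but the same numbers may still be reachable from Python: the shell's `stats()` is a wrapper around the `collStats` command, and PyMongo's `Database.command()` can send commands to the server. This is a hedged sketch, assuming the default `fs` bucket and a server that accepts `collStats`. `gridfs_collstats` is an illustrative name, not part of PyMongo.

```python
def gridfs_collstats(db, bucket="fs"):
    """Run collStats on <bucket>.files and <bucket>.chunks.

    `db` is assumed to be a pymongo Database (or any object with a
    compatible command(name, value) method).
    """
    wanted = ("size", "count", "storageSize", "totalIndexSize")
    report = {}
    for suffix in ("files", "chunks"):
        stats = db.command("collStats", bucket + "." + suffix)
        # Keep only the headline numbers; the full reply is very long.
        report[suffix] = {key: stats.get(key) for key in wanted}
    return report


# Typical use (needs a running MongoDB, so it is left as a comment):
#   from pymongo import MongoClient
#   db = MongoClient()["gridfs"]
#   print(gridfs_collstats(db))
```

Passing the database in, rather than connecting inside the function, keeps the helper independent of connection settings.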
[Summary]
<code>> db.fs.chunks.stats()
{ "ns" : "gridfs.fs.chunks", "size" : 487720549, "count" : 1940, "avgObjSize" : 251402, "storageSize" : 825081856, ...
</code>
This shows the detailed information. Note that size and storageSize in it, and the value printed by:
<code>> db.fs.chunks.totalSize()
825204736
</code>
are all different: size is the uncompressed size of the documents, storageSize is the space allocated on disk, and totalSize() also counts the indexes.
The same holds for files:
<code>> db.fs.files.stats()
{ "ns" : "gridfs.fs.files", "size" : 155368, "count" : 171, "avgObjSize" : 908, "storageSize" : 139264, "capped" : false, ...
</code>
and:
<code>> db.fs.files.totalSize()
221184
</code>
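The figures differ, but they appear to be related: in both outputs `totalSize()` equals `storageSize` plus `totalIndexSize` (the latter sits near the end of the full `stats()` output). A quick check with the numbers copied from the shell session in this post; the dictionary layout is only for illustration.

```python
# Values copied from the stats() / totalSize() outputs in this post.
observed = {
    "fs.files": {
        "storageSize": 139264,
        "totalIndexSize": 81920,
        "totalSize": 221184,
    },
    "fs.chunks": {
        "storageSize": 825081856,
        "totalIndexSize": 122880,
        "totalSize": 825204736,
    },
}

for name, s in observed.items():
    computed = s["storageSize"] + s["totalIndexSize"]
    print(name, computed, computed == s["totalSize"])
```

So `totalSize()` describes disk usage including indexes, while the `size` field of `fs.chunks` (487720549 bytes, roughly 465 MB) is closer to the amount of data actually stored.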
Please credit the source when reposting: 在路上 » [Solved] Total size of all files in MongoDB GridFS