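While working on:
[Solved] Using Python to connect to a local MongoDB and save files with GridFS
I got to the point where I could save files with put from the GridFS API.
The next step is to work out how to save extra information along with the file when calling put.
Let's try passing some additional arguments: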
<code>audioFileId = fsCollection.put(
    audioFp,
    filename="Lots of Hearts_withContentType.mp3",
    content_type="application/mpeg",
    metadata={
        "keywords": {
            "series": "All Aboard Reading",
            "name": "Lots of Hearts",
            "keywords": ["hearts"],
            "leadingActor": "",
            "topic": "",
            "contentKeywords": []
        },
        "fitAgeStartYear": 3,
        "fitAgeEndYear": 6,
        "isFiction": False
    })
</code>
Result:
At least it runs without errors:
audioFileId=5abc9525a4bc715e187c6d6d
Looking at the stored document in the files collection:
<code>{
    "_id" : ObjectId("5abc9525a4bc715e187c6d6d"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [ "hearts" ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:26:29.264Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
</code>
Sure enough, it works: a dict, or any other BSON-serializable value, can be placed directly in metadata.
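To keep the put call tidy, the keyword arguments can be assembled separately. The following is only a minimal sketch: `build_put_kwargs` is a hypothetical helper (not part of PyMongo), and it guesses the content type with the standard `mimetypes` module instead of hard-coding it as above.

```python
import mimetypes


def build_put_kwargs(filename, metadata, default_type="application/octet-stream"):
    """Assemble keyword arguments for GridFS.put() (hypothetical helper)."""
    # guess_type() returns (type, encoding); type is None for unknown extensions.
    guessed_type, _encoding = mimetypes.guess_type(filename)
    return {
        "filename": filename,
        "content_type": guessed_type or default_type,
        # Any BSON-serializable value works here: dicts, lists, numbers, booleans.
        "metadata": metadata,
    }


audio_metadata = {
    "keywords": {"series": "All Aboard Reading", "name": "Lots of Hearts", "keywords": ["hearts"]},
    "fitAgeStartYear": 3,
    "fitAgeEndYear": 6,
    "isFiction": False,
}
put_kwargs = build_put_kwargs("Lots of Hearts.mp3", audio_metadata)
# With an open binary file object audioFp, the call would then be:
#   audioFileId = fsCollection.put(audioFp, **put_kwargs)
```

Keeping the metadata in a plain dict also makes it easy to build from other sources (a spreadsheet row, a JSON file) before the upload.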
Next, let's see how to read the metadata back out.
“get(file_id, session=None)
Get a file from GridFS by “_id”.
Returns an instance of GridOut, which provides a file-like interface for reading.
Parameters:
* file_id: “_id” of the file to get
* session (optional): a ClientSession
Changed in version 3.6: Added session parameter.”
So get returns a GridOut object:
http://api.mongodb.com/python/current/api/gridfs/grid_file.html#gridfs.grid_file.GridOut
class gridfs.grid_file.GridOut(root_collection, file_id=None, file_document=None, session=None)
and its documentation says:
metadata
Metadata attached to this file.
This attribute is read-only.
So it should be possible to read the metadata directly from that attribute.
The documentation also notes:
“Changed in version 3.0: Creating a GridOut does not immediately retrieve the file metadata from the server. Metadata is fetched when first needed.”
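My reading of this:
the metadata is lazily loaded.
If it is never used, it is never fetched; it is only fetched from the server the first time it is needed.
In theory, though, this is not something our own code has to care about.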
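For readers unfamiliar with the pattern, here is a toy illustration of lazy loading. It is not PyMongo's actual implementation; `LazyDocument`, `_load` and `fetch_count` are names made up for this sketch, and the "server" is just a lambda returning a dict.

```python
class LazyDocument:
    """Toy lazy loader: fetches the file document only on first attribute access."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that returns the file document (a dict)
        self._doc = None         # nothing loaded yet
        self.fetch_count = 0     # how many times the "server" was contacted

    def _load(self):
        # Fetch once, then reuse the cached document.
        if self._doc is None:
            self._doc = self._fetch()
            self.fetch_count += 1
        return self._doc

    @property
    def metadata(self):
        return self._load().get("metadata")


lazy_file = LazyDocument(lambda: {"filename": "demo.mp3", "metadata": {"isFiction": False}})
count_before = lazy_file.fetch_count      # no fetch happens on construction
first_read = lazy_file.metadata           # triggers the single fetch
second_read = lazy_file.metadata          # served from the cached document
```

The caller only ever touches `.metadata`; whether the data was fetched eagerly or on demand stays hidden, which is why calling code normally does not need to care.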
Let's try it.
While writing the code, PyCharm's code completion recognized GridOut.
Likewise, it now recognized put on the fs collection.
With this code:
<code>audioFileId = fsCollection.put(
    audioFp,
    filename="Lots of Hearts_withContentType.mp3",
    content_type="application/mpeg",
    metadata={
        "keywords": {
            "series": "All Aboard Reading",
            "name": "Lots of Hearts",
            "keywords": ["hearts"],
            "leadingActor": "",
            "topic": "",
            "contentKeywords": []
        },
        "fitAgeStartYear": 3,
        "fitAgeEndYear": 6,
        "isFiction": False
    })
logging.info("audioFileId=%s", audioFileId)
readOutAudioFile = fsCollection.get(audioFileId)
logging.info("readOutAudioFile=%s", readOutAudioFile)
audioFileMedata = readOutAudioFile.metadata
logging.info("audioFileMedata=%s", audioFileMedata)
</code>
While debugging, the metadata was visible on the returned object,
and the log confirms that it can be read:
<code>2018/03/29 03:33:52 LINE 83 INFO audioFileId=5abc96dfa4bc715f473f0297
2018/03/29 03:36:03 LINE 86 INFO readOutAudioFile=<gridfs.grid_file.GridOut object at 0x110ec7c90>
2018/03/29 03:36:38 LINE 88 INFO audioFileMedata={u'keywords': {u'name': u'Lots of Hearts', u'series': u'All Aboard Reading', u'topic': u'', u'keywords': [u'hearts'], u'leadingActor': u'', u'contentKeywords': []}, u'isFiction': False, u'fitAgeStartYear': 3, u'fitAgeEndYear': 6}
</code>
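Besides metadata, GridOut exposes other read-only attributes such as filename and length. The sketch below gathers a few of them into a plain dict. `summarize_grid_out` is a hypothetical helper, and a `SimpleNamespace` stands in for the real GridOut so that the example runs without a MongoDB server.

```python
from types import SimpleNamespace


def summarize_grid_out(grid_out):
    """Collect a few read-only GridOut attributes into a plain dict (hypothetical helper)."""
    return {
        "filename": grid_out.filename,
        "length": grid_out.length,
        # metadata may be None when the file was stored without a metadata argument
        "metadata": grid_out.metadata or {},
    }


# Stand-in object so the sketch runs without a server;
# in real code this would be fsCollection.get(audioFileId).
fake_grid_out = SimpleNamespace(
    filename="Lots of Hearts_withContentType.mp3",
    length=4795707,
    metadata={"fitAgeStartYear": 3, "fitAgeEndYear": 6, "isFiction": False},
)
summary = summarize_grid_out(fake_grid_out)
```

Falling back to an empty dict means later code can call `summary["metadata"].get(...)` without first checking for None.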
[Summary]
When saving a file with put from pymongo's gridfs API, any extra information you want to attach can simply be stored in metadata, for example:
<code># Open in binary mode ("rb"): GridFS stores raw bytes, and text mode can corrupt binary files
with open(curAudioFullFilename, "rb") as audioFp:
    audioFileId = fsCollection.put(
        audioFp,
        filename="Lots of Hearts_withContentType.mp3",
        content_type=fileMimeType,  # MIME type of the file, here "application/mpeg"
        metadata={
            "keywords": {
                "series": "All Aboard Reading",
                "name": "Lots of Hearts",
                "keywords": ["hearts"],
                "leadingActor": "",
                "topic": "",
                "contentKeywords": []
            },
            "fitAgeStartYear": 3,
            "fitAgeEndYear": 6,
            "isFiction": False
        })
</code>
The stored document then looks something like this:
<code>{
    "_id" : ObjectId("5abc997ea4bc71611bd37613"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [ "hearts" ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:45:02.535Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
</code>
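Since the metadata ends up as a nested sub-document, it can presumably also be used to look files up later with dotted field paths. The sketch below only builds such a filter. `metadata_filter` is a hypothetical helper, and the commented-out `fsCollection.find(...)` call assumes `fsCollection` is a `gridfs.GridFS` instance as in the code above.

```python
def metadata_filter(fields):
    """Turn {"keywords.series": "X"} into {"metadata.keywords.series": "X"} (hypothetical helper)."""
    # Prefix every dotted path so it addresses a field inside the "metadata" sub-document.
    return {"metadata." + path: value for path, value in fields.items()}


query = metadata_filter({"keywords.series": "All Aboard Reading", "isFiction": False})
# With a gridfs.GridFS instance such as fsCollection, matching files could be listed with:
#   for grid_out in fsCollection.find(query):
#       print(grid_out.filename, grid_out.metadata)
```

This was not tried in the session above, so treat it as a direction to explore rather than a verified result.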