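Background:
[Solved] Storing local audio, subtitles, and other data in a local MongoDB database
While working on that, I mostly figured out how to use the command-line mongofiles tool for put, get, delete, delete_id, and so on.
The next step is to save the data from Python code by calling the API through a driver:
MongoDB Drivers and Client Libraries — MongoDB Manual 3.6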
python mongodb gridfs
GridFS Example — PyMongo 3.6.1 documentation
pymongo needs to be installed first.
gridfs – Tools for working with GridFS — PyMongo 3.6.1 documentation
These are the relevant APIs.
Python 2.6.2 + mongodb 2.0.7 + GridFS: storing and retrieving images – CSDN blog
Python MongoDB Drivers — MongoDB Ecosystem
pymongo 3.6.1 : Python Package Index
mongodb/mongo-python-driver: PyMongo – the Python driver for MongoDB
PyMongo 3.6.1 Documentation — PyMongo 3.6.1 documentation
Installing / Upgrading — PyMongo 3.6.1 documentation
According to:
Python MongoDB Drivers — MongoDB Ecosystem
for the environment here:
<code>➜ 英语资源 mongod --version
db version v3.6.3
git version: 9586e557d54ef70f9ca4b43c26892cd55257e1a5
OpenSSL version: OpenSSL 1.0.2o 27 Mar 2018
allocator: system
modules: none
build environment:
    distarch: x86_64
    target_arch: x86_64
➜ 英语资源 python -V
Python 2.7.13
</code>
the 3.5 Python driver (even though its entry lists only MongoDB 3.4) should presumably also work with the latest MongoDB 3.6.3. Per:
Driver Compatibility — MongoDB Ecosystem
the 3.6 Python driver definitely supports MongoDB 3.6.
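As for the async driver, Motor, I'm not very familiar with it yet; I'll look into it when I need it.
From:
Motor: Asynchronous Python driver for MongoDB — Motor 1.2.1 documentation
it appears to be aimed mainly at asynchronous environments and supports Tornado and asyncio.
There are also some other tools:
ORM-like layers, worth considering if you need data validation, associations, or other high-level data modeling functionality: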
PyMODM
Humongolus
Ming
MongoEngine
MotorEngine
uMongo
Djongo
Django MongoDB Engine
mango
Django MongoEngine
mongodb_beaker
Log4Mongo
MongoLog
c5t
rod.recipe.mongodb
repoze-what-plugins-mongodb
mongobox
Flask-MongoAlchemy
Flask-MongoKit
Flask-PyMongo
(Flask-PyMongo appears to be the most popular)
Motor
TxMongo
MongoMock
Some tutorials and references:
Tutorial — PyMongo 3.6.1 documentation
Getting Started with MongoDB (Python Edition) — Getting Started With MongoDB 3.6.0
presentations/pycon_2012 at master · behackett/presentations
For now, I'll continue following:
Installing / Upgrading — PyMongo 3.6.1 documentation
and try it out.
First, install it:
<code>➜ 英语资源 pip install pymongo
Collecting pymongo
  Downloading pymongo-3.6.1-cp27-cp27m-macosx_10_13_intel.whl (310kB)
    100% |████████████████████████████████| 317kB 67kB/s
Installing collected packages: pymongo
Successfully installed pymongo-3.6.1
➜ 英语资源 pip install pymongo
Requirement already satisfied: pymongo in /usr/local/lib/python2.7/site-packages
➜ 英语资源 pip install --upgrade pymongo
Requirement already up-to-date: pymongo in /usr/local/lib/python2.7/site-packages
</code>
Then, following:
Tutorial — PyMongo 3.6.1 documentation
test it in the Python interactive shell:
<code>➜ 英语资源 python
Python 2.7.13 (default, May 6 2017, 15:08:03)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymongo
>>> from pymongo import MongoClient
>>> client = MongoClient()
>>>
</code>
The corresponding mongod output is:
<code>2018-03-29T11:14:13.462+0800 I NETWORK [listener] connection accepted from 127.0.0.1:54229 #51 (2 connections now open)
2018-03-29T11:14:13.466+0800 I NETWORK [conn51] end connection 127.0.0.1:54229 (1 connection now open)
2018-03-29T11:42:26.689+0800 I NETWORK [listener] connection accepted from 127.0.0.1:65262 #52 (2 connections now open)
2018-03-29T11:42:26.690+0800 I NETWORK [conn52] received client metadata from 127.0.0.1:65262 conn: { driver: { name: "PyMongo", version: "3.6.1" }, os: { type: "Darwin", name: "Darwin", architecture: "x86_64", version: "10.13.3" }, platform: "CPython 2.7.13.final.0" }
</code>
I tried a few things on my own:
<code>>>> client.gridfs
Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), u'gridfs')
>>> client.gridfs.fs.find()
<pymongo.cursor.Cursor object at 0x10c041a50>
</code>
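After checking the API docs,
I was finally able to find some data. Note that PyMongo uses find_one rather than the shell's findOne, and that GridFS keeps file metadata in the fs.files collection, which is why find_one() on fs and on files returns nothing: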
<code>>>> client.gridfs.fs.findOne()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pymongo/collection.py", line 3104, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findOne' method on a 'Collection' object it is failing because no such method exists.
>>> client.gridfs.fs.find_one()
>>> client.gridfs.files.find_one()
>>> client.gridfs.fs.files.find_one()
{u'contentType': u'audio/mpeg', u'chunkSize': 261120, u'filename': u'Otto the Cat-withMIME.MP3', u'length': 8338105, u'uploadDate': datetime.datetime(2018, 3, 29, 1, 38, 44, 853000), u'_id': ObjectId('5abc43a4a4bc712159a35cd9'), u'md5': u'b7660d833085e9e1a21813e4d74b0cc3'}
>>> import pprint
>>> pprint.pprint(client.gridfs.fs.files.find_one())
{u'_id': ObjectId('5abc43a4a4bc712159a35cd9'),
 u'chunkSize': 261120,
 u'contentType': u'audio/mpeg',
 u'filename': u'Otto the Cat-withMIME.MP3',
 u'length': 8338105,
 u'md5': u'b7660d833085e9e1a21813e4d74b0cc3',
 u'uploadDate': datetime.datetime(2018, 3, 29, 1, 38, 44, 853000)}
</code>
This confirms that the connection to MongoDB works. Next, I can write code and debug saving, reading, and deleting.
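The interactive exploration above could be condensed into a small helper. This is only a minimal sketch, not code from the original session: summarize_gridfs_files is a hypothetical name, and files_collection is assumed to be the fs.files collection of the gridfs database (for example MongoClient().gridfs.fs.files). It takes the collection as a parameter so that it does not itself depend on a running server.

```python
def summarize_gridfs_files(files_collection):
    """Return one summary line per document in a GridFS files collection.

    files_collection is assumed to be a pymongo Collection such as
    MongoClient().gridfs.fs.files (GridFS keeps file metadata in fs.files).
    """
    lines = []
    # find() with no arguments iterates over every document in the collection.
    for doc in files_collection.find():
        # filename is optional in GridFS, so read it with .get();
        # _id and length are always present on a stored file.
        lines.append("%s  %s  %d bytes" % (doc["_id"], doc.get("filename"), doc["length"]))
    return lines
```

With a local mongod running, printing each returned line should list the same files that the find_one() calls above revealed one at a time.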
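Switch to PyCharm for writing and debugging the code:
[Solved] Writing and debugging Python MongoDB code in PyCharm
Next, debug put and the other operations:
[Solved] How to save a file with put in PyMongo's GridFS
Then figure out how to add extra parameters, including the MIME type and other fields.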
Note, from:
gridfs – Tools for working with GridFS — PyMongo 3.6.1 documentation
The gridfs package is an implementation of GridFS on top of pymongo, exposing a file-like interface.
In other words, the gridfs library implements the GridFS specification on top of pymongo and offers file-style operations.
It is used through:
class gridfs.GridFS(database, collection='fs')
so the following imports are needed:
import gridfs
from gridfs import GridFS
and then, later on:
fsCollection = GridFS(gridfsDb)
to create the corresponding collection object.
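As a rough sketch of how these pieces might fit together: open_gridfs and roundtrip are hypothetical helper names, and the database name gridfs is taken from the interactive session earlier. The imports sit inside open_gridfs only so that roundtrip can be tried with any object exposing put() and get(); in a real script they would normally go at the top of the file.

```python
def open_gridfs(db_name="gridfs"):
    """Connect to the default local mongod and wrap a database in GridFS."""
    # Imported lazily so that roundtrip() below can be exercised
    # even where pymongo is not installed.
    from pymongo import MongoClient
    from gridfs import GridFS

    client = MongoClient()  # defaults to localhost:27017
    return GridFS(client[db_name])  # uses the default "fs" collection prefix


def roundtrip(fs, data, filename):
    """Store bytes under filename, then read them back by the returned _id.

    fs is assumed to be a gridfs.GridFS instance.
    """
    file_id = fs.put(data, filename=filename)  # put() returns the new file's _id
    stored = fs.get(file_id)  # get() returns a file-like object
    return file_id, stored.read()
```

With mongod running locally, calling roundtrip(open_gridfs(), b"hello", "hello.txt") should store a small file and return its id together with the same bytes.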
[Solved] Adding extra information when saving a file with put through the MongoDB GridFS API
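For reference, here is a minimal sketch of what passing extra information to put can look like. It is not the code from the linked post: put_with_metadata is a hypothetical name, fs is assumed to be a gridfs.GridFS instance, and the metadata dict is whatever structure you choose to store.

```python
import os


def put_with_metadata(fs, path, content_type, metadata):
    """Save the file at path into GridFS with a MIME type and a metadata dict.

    fs is assumed to be a gridfs.GridFS instance.
    """
    with open(path, "rb") as fp:
        # put() accepts bytes or a file-like object; extra keyword arguments
        # are stored as fields on the file document in fs.files.
        return fs.put(
            fp,
            filename=os.path.basename(path),
            content_type=content_type,  # stored as "contentType"
            metadata=metadata,
        )
```

A call such as put_with_metadata(fs, "Lots of Hearts.mp3", "audio/mpeg", {"isFiction": False, "fitAgeStartYear": 3, "fitAgeEndYear": 6}) would be expected to produce a document with contentType and metadata fields like the ones shown in the query output in this post.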
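I also checked what happens when the same file is saved again, as in my earlier local debugging:
does GridFS overwrite the existing file internally, or create a new one?
The result: it does not overwrite; a new file is saved: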
<code>{
    "_id" : ObjectId("5abc9525a4bc715e187c6d6d"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [ "hearts" ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:26:29.264Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
{
    "_id" : ObjectId("5abc96dfa4bc715f473f0297"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [ "hearts" ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:33:51.573Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
</code>
Then I tried deleting one of them:
[Solved] GridFS exists in PyMongo never detects that a file already exists
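One way to check for and clean up duplicates like the two documents above might look as follows. This is a sketch under assumptions, not the solution from the linked post: file_exists and delete_older_versions are hypothetical names, and fs is assumed to be a gridfs.GridFS instance.

```python
def file_exists(fs, filename):
    """Return True if at least one file with this filename is stored."""
    # exists() treats a bare (non-dict) argument as an _id value,
    # so a lookup by name has to be passed as a query document.
    return fs.exists({"filename": filename})


def delete_older_versions(fs, filename):
    """Keep only the newest file stored under filename; delete the rest.

    Returns the list of deleted _id values.
    """
    # GridFS.find() yields file objects exposing _id and upload_date.
    versions = sorted(fs.find({"filename": filename}), key=lambda f: f.upload_date)
    deleted = []
    for old in versions[:-1]:  # everything except the most recent upload
        fs.delete(old._id)
        deleted.append(old._id)
    return deleted
```

Sorting in Python keeps the sketch simple; for a large number of versions, sorting on the server side would likely be preferable.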
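In addition:
Mongodb GridFS using Python – Abhay PS
also mentions that saving a file through the API does not overwrite; multiple copies with the same name are stored,
differing only in id and uploadDate.
This is by design, so that multiple versions of a file with the same name can be kept.
To get the latest version, use: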
<code>f = fs.get_last_version(filename="mystory.txt") </code>
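To make the versioning behaviour a little more concrete, here is a small sketch that reads both the newest and the oldest copy of a file. read_latest_and_oldest is a hypothetical name and fs is assumed to be a gridfs.GridFS instance; get_version with version=0 is used for the first upload, alongside get_last_version for the newest one.

```python
def read_latest_and_oldest(fs, filename):
    """Return (latest_bytes, oldest_bytes) for a filename stored more than once.

    fs is assumed to be a gridfs.GridFS instance.
    """
    # get_last_version() returns the most recently uploaded file with this name.
    latest = fs.get_last_version(filename=filename)
    # get_version() counts from 0 for the first upload; -1 would be the newest.
    oldest = fs.get_version(filename=filename, version=0)
    return latest.read(), oldest.read()
```

If no file with that name exists, both calls are expected to raise gridfs.errors.NoFile, so a caller may want to catch that.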