[Solved] Connecting to a local MongoDB from Python and saving files with GridFS

MongoDB crifan

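While working on:

[Solved] Storing local audio, subtitle, and other data in a local MongoDB database

I had mostly figured out the command-line mongofiles operations: put, get, delete, delete_id, and so on.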

The next step is to save the data from Python code by calling the API through a driver, as described in:

MongoDB Drivers and Client Libraries — MongoDB Manual 3.6

Searching for: python mongodb gridfs

GridFS Example — PyMongo 3.6.1 documentation

pymongo needs to be installed first.

gridfs – Tools for working with GridFS — PyMongo 3.6.1 documentation

These are the APIs to use.

Python 2.6.2 + mongodb 2.0.7 + GridFS: storing and retrieving images – CSDN blog

Python MongoDB Drivers — MongoDB Ecosystem

pymongo 3.6.1 : Python Package Index

mongodb/mongo-python-driver: PyMongo – the Python driver for MongoDB

PyMongo 3.6.1 Documentation — PyMongo 3.6.1 documentation

Installing / Upgrading — PyMongo 3.6.1 documentation

From:

Python MongoDB Drivers — MongoDB Ecosystem

I worked out the following for the environment here:

<code>➜  英语资源 mongod --version
db version v3.6.3
git version: 9586e557d54ef70f9ca4b43c26892cd55257e1a5
OpenSSL version: OpenSSL 1.0.2o  27 Mar 2018
allocator: system
modules: none
build environment:
    distarch: x86_64
    target_arch: x86_64
➜  英语资源 python -V
Python 2.7.13
</code>

The latest 3.5 Python driver (although it is only listed for MongoDB 3.4) should also support the latest MongoDB 3.6.3.

Driver Compatibility — MongoDB Ecosystem

The 3.6 Python driver definitely supports MongoDB 3.6.

As for Motor, the async driver, I don't know much about it yet and will look into it when I actually need it.

Judging from:

Motor: Asynchronous Python driver for MongoDB — Motor 1.2.1 documentation

it seems to be aimed mainly at asynchronous environments and supports Tornado or asyncio.

There are also some other tools:

  • ORM-like layers: worth considering if you need data validation, associations, or other high-level data modeling functionality

  • Framework Tools

    • Djongo

    • Django MongoDB Engine

    • mango

    • Django MongoEngine

    • mongodb_beaker

    • Log4Mongo

    • MongoLog

    • c5t

    • rod.recipe.mongodb

    • repoze-what-plugins-mongodb

    • mongobox

    • Flask-MongoAlchemy

    • Flask-MongoKit

    • Flask-PyMongo

      • appears to be the most popular

  • Alternative Drivers

    • Motor

    • TxMongo

    • MongoMock

Some tutorials and references:

Tutorial — PyMongo 3.6.1 documentation

Getting Started with MongoDB (Python Edition) — Getting Started With MongoDB 3.6.0

presentations/pycon_2012 at master · behackett/presentations

For now I'll keep following:

Installing / Upgrading — PyMongo 3.6.1 documentation

and give it a try.

First, install pymongo:

<code>➜  英语资源 pip install pymongo
Collecting pymongo
  Downloading pymongo-3.6.1-cp27-cp27m-macosx_10_13_intel.whl (310kB)
    100% |████████████████████████████████| 317kB 67kB/s
Installing collected packages: pymongo
Successfully installed pymongo-3.6.1
➜  英语资源 pip install pymongo
Requirement already satisfied: pymongo in /usr/local/lib/python2.7/site-packages
➜  英语资源 pip install --upgrade pymongo
Requirement already up-to-date: pymongo in /usr/local/lib/python2.7/site-packages
</code>

Then, following:

Tutorial — PyMongo 3.6.1 documentation

test it in the Python shell:

<code>➜  英语资源 python
Python 2.7.13 (default, May  6 2017, 15:08:03)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymongo
>>> from pymongo import MongoClient
>>> client = MongoClient()
>>>
</code>

The output on the mongod side is:

<code>2018-03-29T11:14:13.462+0800 I NETWORK  [listener] connection accepted from 127.0.0.1:54229 #51 (2 connections now open)
2018-03-29T11:14:13.466+0800 I NETWORK  [conn51] end connection 127.0.0.1:54229 (1 connection now open)
2018-03-29T11:42:26.689+0800 I NETWORK  [listener] connection accepted from 127.0.0.1:65262 #52 (2 connections now open)
2018-03-29T11:42:26.690+0800 I NETWORK  [conn52] received client metadata from 127.0.0.1:65262 conn: { driver: { name: "PyMongo", version: "3.6.1" }, os: { type: "Darwin", name: "Darwin", architecture: "x86_64", version: "10.13.3" }, platform: "CPython 2.7.13.final.0" }
</code>

I tried a few things on my own:

<code>>>> client.gridfs
Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), u'gridfs')
>>> client.gridfs.fs.find()
<pymongo.cursor.Cursor object at 0x10c041a50>
</code>

After checking the API, I could finally find some data:

<code>>>> client.gridfs.fs.findOne()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pymongo/collection.py", line 3104, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'findOne' method on a 'Collection' object it is failing because no such method exists.
>>> client.gridfs.fs.find_one()
>>> client.gridfs.files.find_one()
>>> client.gridfs.fs.files.find_one()
{u'contentType': u'audio/mpeg', u'chunkSize': 261120, u'filename': u'Otto the Cat-withMIME.MP3', u'length': 8338105, u'uploadDate': datetime.datetime(2018, 3, 29, 1, 38, 44, 853000), u'_id': ObjectId('5abc43a4a4bc712159a35cd9'), u'md5': u'b7660d833085e9e1a21813e4d74b0cc3'}
>>> import pprint
>>> pprint.pprint(client.gridfs.fs.files.find_one())
{u'_id': ObjectId('5abc43a4a4bc712159a35cd9'),
 u'chunkSize': 261120,
 u'contentType': u'audio/mpeg',
 u'filename': u'Otto the Cat-withMIME.MP3',
 u'length': 8338105,
 u'md5': u'b7660d833085e9e1a21813e4d74b0cc3',
 u'uploadDate': datetime.datetime(2018, 3, 29, 1, 38, 44, 853000)}
</code>
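
The session above also shows why the first two `find_one()` calls printed nothing: GridFS keeps the file documents in `fs.files` and the binary content in `fs.chunks`, so the plain `fs` collection is empty. As a rough sketch (the helper name `chunk_count` is made up for illustration and is not part of PyMongo), the number of chunk documents a file should occupy follows from its `length` and `chunkSize`:

```python
def chunk_count(length, chunk_size):
    """Number of chunk documents GridFS needs for `length` bytes.

    Hypothetical helper for illustration; it is not part of PyMongo.
    """
    if length < 0 or chunk_size <= 0:
        raise ValueError("length must be >= 0 and chunk_size must be > 0")
    # Ceiling division: the last chunk may be only partly filled.
    return (length + chunk_size - 1) // chunk_size


# Values taken from the fs.files document shown above.
print(chunk_count(8338105, 261120))  # 32
```

If that reasoning holds, counting the documents in `fs.chunks` whose `files_id` equals the file's `_id` should give the same number.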

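This confirms that the connection to MongoDB works, so the next step is to write code and debug saving, reading, and deleting.

I switched to PyCharm for writing and debugging the code:

[Solved] Writing and debugging Python MongoDB code in PyCharm

Next, debugging put and the other operations:

[Solved] How to save a file with put in PyMongo's GridFS

Then I needed a way to add extra parameters, including the MIME type and other fields.

Note, from: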

gridfs – Tools for working with GridFS — PyMongo 3.6.1 documentation

The gridfs package is an implementation of GridFS on top of pymongo, exposing a file-like interface.

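In other words, the gridfs library implements the GridFS specification on top of pymongo and provides a file-like interface.

Given the signature:

class gridfs.GridFS(database, collection='fs')

the following import is needed (either form works):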

import gridfs

from gridfs import GridFS

and then later use:

fsCollection = GridFS(gridfsDb)

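to create the GridFS object for the corresponding collection (fs by default).

[Solved] Adding extra information when saving a file with put via the MongoDB GridFS API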
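
Putting the pieces above together, a minimal sketch of saving a file with a MIME type and extra metadata might look like this. The function names `save_file` and `demo` are made up for illustration, the file path is assumed to exist, and `demo()` is deliberately not called because it needs pymongo and a running local mongod. It relies on `GridFS.put()` accepting a file-like object plus keyword arguments that end up as fields of the file document.

```python
import os


def save_file(fs, file_path, content_type, metadata=None):
    """Store the file at `file_path` through a GridFS-like object `fs`.

    `fs` is assumed to be a gridfs.GridFS instance; keyword arguments
    passed to put() become fields of the file document in fs.files.
    Returns whatever _id put() reports for the new file.
    """
    with open(file_path, "rb") as fp:
        return fs.put(
            fp,
            filename=os.path.basename(file_path),
            contentType=content_type,
            metadata=metadata or {},
        )


def demo():
    """Not called automatically: needs pymongo and a running local mongod."""
    from pymongo import MongoClient
    from gridfs import GridFS

    gridfs_db = MongoClient("localhost", 27017).gridfs  # database name used in this post
    fs_collection = GridFS(gridfs_db)
    file_id = save_file(
        fs_collection,
        "Lots of Hearts_withContentType.mp3",  # assumed to be in the current directory
        "audio/mpeg",
        {"isFiction": False, "fitAgeStartYear": 3, "fitAgeEndYear": 6},
    )
    print(file_id)
```

Passing the GridFS object in as a parameter keeps `save_file` easy to try out with a stand-in object before pointing it at a real database.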

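I also wanted to check one thing: the file saved this time is the same one I saved during the previous local debugging run.

Does GridFS override the existing file internally, or does it create a new one?

The result: it does not override; a new file is saved each time: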

<code>{
    "_id" : ObjectId("5abc9525a4bc715e187c6d6d"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [
                "hearts"
            ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:26:29.264Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
{
    "_id" : ObjectId("5abc96dfa4bc715f473f0297"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [
                "hearts"
            ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:33:51.573Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
</code>
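
Since the two documents above differ only in `_id` and `uploadDate`, repeated copies can be spotted from the `fs.files` documents alone. Here is a small sketch of that idea; the function name is hypothetical, and treating "same filename and same md5" as "same file" is my own assumption, not a rule GridFS enforces.

```python
from collections import defaultdict


def find_older_duplicates(file_docs):
    """Return the _id values of all but the newest copy of each file.

    `file_docs` is any iterable of fs.files documents (dicts). Two documents
    count as copies when filename and md5 both match; that rule is an
    assumption of this sketch, not something GridFS enforces.
    """
    groups = defaultdict(list)
    for doc in file_docs:
        groups[(doc["filename"], doc.get("md5"))].append(doc)

    older_ids = []
    for docs in groups.values():
        # Newest first, so everything after index 0 is an older copy.
        docs.sort(key=lambda d: d["uploadDate"], reverse=True)
        older_ids.extend(d["_id"] for d in docs[1:])
    return older_ids
```

Each returned id could then be passed to `delete()` on the GridFS object, for example with the documents from `client.gridfs.fs.files.find()` as input.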

Then I tried deleting one of them:

[Solved] GridFS exists in PyMongo never detects that the file already exists

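In addition:

Mongodb GridFS using Python – Abhay PS

also mentions that saving a file through the API does not override anything: multiple copies with the same name are stored.

Only the id and uploadDate differ.

The reason for this is to allow multiple versions of a file with the same name to be kept.

To get the latest version, use: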

<code>f = fs.get_last_version(filename="mystory.txt")
</code>
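
For reading back a specific version rather than only the latest, a sketch along these lines should work. The wrapper name `read_version` is made up for illustration; it relies on `GridFS.get_version()`, where version 0 is the oldest upload and -1 the newest, so the default matches `get_last_version()`.

```python
def read_version(fs, filename, version=-1):
    """Return the bytes of one stored version of `filename`.

    `fs` is assumed to be a gridfs.GridFS instance. With get_version(),
    version 0 is the oldest upload and -1 (the default) is the newest.
    """
    grid_out = fs.get_version(filename=filename, version=version)
    return grid_out.read()
```

For example, `read_version(fs, "mystory.txt", version=0)` would return the content of the first upload under that name, assuming at least one exists.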
