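While working on:
[Unresolved] Saving results to a MongoDB database in PySpider
Before saving data to MongoDB from PySpider, I first needed to debug locally and write code that can actually save the data.
First, start mongod locally on the Mac: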
mongod
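To confirm that mongod is really up before debugging, one option is to test whether anything is listening on its port. This is a minimal sketch, assuming the default MongoDB port 27017 (the same default used in the code further down); `is_port_open` is a helper name made up for this illustration:

```python
import socket


def is_port_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # connection refused, timed out, or host could not be resolved
        return False


if __name__ == "__main__":
    print("mongod reachable:", is_port_open())
```

A successful connection only shows that something is listening on that port, not that it is MongoDB, so it is a quick sanity check rather than proof.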
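Then open a GUI tool to make it easier to inspect the data.
Then debug the code and create a new database with a collection in it.
Then try to take this piece of JSON: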
{
    'authors': ['Sharon Creech'],
    'coverImgUrl': 'https://www.scholastic.com/content5/media/products/66/9780439569866_mres.jpg',
    'description': "I guess it does\nlook like a poem\nwhen you see it\ntyped up\nlike that.\n\nJack hates poetry. Only girls write it and every time he tries to, his brain feels empty. But his teacher, Ms. Stretchberry, won't stop giving her class poetry assignments, and Jack can't avoid them. But then something amazing happens. The more he writes, the more he learns he does have something to say.\n\nWith a fresh and deceptively simple style, acclaimed author Sharon Creech tells a story with enormous heart. Written as a series of free-verse poems from Jack's point of view, Love That Dog shows how one boy finds his own voice with the help of a teacher, a writer, a pencil, some yellow paper, and of course, a dog.",
    'draLevel': '50',
    'genre': 'Fiction',
    'gradeLevelEquivalent': '',
    'grades': ['6-8'],
    'guidedReading': 'T',
    'illustrators': [],
    'isbn13': '9780439569866',
    'lexileMeasure': '1010L',
    'originUrl': 'https://www.scholastic.com/content/scholastic/books2/love-that-dog-by-sharon-creech',
    'pages': 112,
    'recommendations': [
        {
            'title': "Girls' Life Ultimate Guide to Surviving Middle School",
            'url': 'https://www.scholastic.com/content/scholastic/books2/girls-rsquo-life-ultimate-guide-to-surviving-middle-school-by-b'
        },
        {
            'title': "Girls' Life Ultimate Guide To Surviving Middle School",
            'url': 'https://www.scholastic.com/content/scholastic/books2/girls-rsquo-life-ultimate-guide-to-surviving-middle-school-by-b'
        },
        {
            'title': 'The Date to Save',
            'url': 'https://www.scholastic.com/content/scholastic/books2/date-to-save-the-by-stephanie-kate-strohm'
        }
    ],
    'seriesName': '',
    'seriesNumber': 0,
    'tags': ['Poetry Writing', 'School Life'],
    'title': 'Love That Dog',
    'url': 'https://www.scholastic.com/teachers/books/love-that-dog-by-sharon-creech/'
}
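and save it into the new database: ScholasticStorybook (the debug output recorded in the code further down shows this as database Scholastic, collection Storybook)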
Searching for "pyspider save mongodb" turned up:
“yes, you can also override on_result in script instead of override ResultWorker to store the results to mongodb
If you want the on_result change apply to every project, it’s better to override ResultWorker.”
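So the data can be saved by:
overriding on_result, with no need for ResultWorker.
If the change should apply to every project, then ResultWorker is the better choice.
-> Here it does not need to apply to every project, so overriding on_result directly is the best fit.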
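The shape of such an override can be sketched without PySpider or MongoDB installed. In the sketch below, `MongoSavingHandler` and `FakeCollection` are hypothetical names for illustration only: in a real PySpider script the `on_result` method would sit on the handler class that subclasses PySpider's BaseHandler, and `collection` would be a real pymongo collection rather than the in-memory stand-in used here.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo collection (demo only)."""

    def __init__(self):
        self.docs = []

    def insert_one(self, doc):
        self.docs.append(doc)


class MongoSavingHandler:
    """Sketch only: a real script would define on_result on its PySpider handler."""

    collection = None  # assumed to be set to a collection object before use

    def on_result(self, result):
        # skip empty results, mirroring the `if result:` check in the code below
        if not result:
            return False
        # copy so the stored document is independent of the caller's dict
        self.collection.insert_one(dict(result))
        return True


if __name__ == "__main__":
    handler = MongoSavingHandler()
    handler.collection = FakeCollection()
    handler.on_result({"title": "Love That Dog", "isbn13": "9780439569866"})
    handler.on_result(None)  # ignored
    print(len(handler.collection.docs))
```

Keeping the collection as an attribute makes it easy to swap the fake for a real one once the local MongoDB code is working.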
First make sure pymongo is installed for the local python3 on the Mac:
➜ crawler_projects git:(master) ✗ pip3 install pymongo
Requirement already satisfied: pymongo in /usr/local/lib/python3.6/site-packages (3.6.1)
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
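Since pip3 can belong to a different interpreter than the one that runs the script, it can also help to check from inside Python. This is a small sketch using only the standard library; `is_installed` is a helper name invented for this example:

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("pymongo installed:", is_installed("pymongo"))
```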
Then use this code:
from urllib.parse import quote_plus

from pymongo import MongoClient

# Database and collection names, taken from the debug output recorded in the comments below
MONGODB_DB_NAME = "Scholastic"
MONGODB_COLLECTION_NAME = "Storybook"


def generateMongoUri(host=None,
                     port=None,
                     isUseAuth=False,
                     username=None,
                     password=None,
                     authSource=None,
                     authMechanism=None):
    """generate mongodb uri"""
    if not host:
        # host = "127.0.0.1"
        host = "localhost"

    if not port:
        port = 27017

    mongodbUri = "mongodb://%s:%s" % (host, port)
    # 'mongodb://localhost:27017'
    # 'mongodb://xxx:27017'

    if isUseAuth:
        # username and password must be percent-escaped before going into the URI
        mongodbUri = "mongodb://%s:%s@%s:%s" % (
            quote_plus(username),
            quote_plus(password),
            host,
            port,
        )
        print(mongodbUri)

    if authSource:
        mongodbUri = mongodbUri + ("/%s" % authSource)
        print("mongodbUri=%s" % mongodbUri)

    if authMechanism:
        if not authSource:
            # a "/" is needed between the host list and the options
            mongodbUri = mongodbUri + "/"
        mongodbUri = mongodbUri + ("?authMechanism=%s" % authMechanism)
        print("mongodbUri=%s" % mongodbUri)

    print("return mongodbUri=%s" % mongodbUri)
    # mongodb://username:quoted_password@host:port/authSource?authMechanism=authMechanism
    # mongodb://localhost:27017
    return mongodbUri


def createMongoClient():
    mongoUri = generateMongoUri()
    print("mongoUri=%s" % mongoUri)
    client = MongoClient(mongoUri)
    print("client=%s" % client)
    # client=MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True)
    return client


class ResultMongo(object):
    def __init__(self):
        print("ResultMongo __init__")
        self.client = createMongoClient()
        print("self.client=%s" % self.client)
        self.db = self.client[MONGODB_DB_NAME]
        print("self.db=%s" % self.db)
        # self.db=Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'Scholastic')
        self.collection = self.db[MONGODB_COLLECTION_NAME]
        print("self.collection=%s" % self.collection)
        # self.collection=Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'Scholastic'), 'Storybook')

    def __del__(self):
        print("ResultMongo __del__")
        self.client.close()

    def on_result(self, result):
        """save result to mongodb"""
        print("ResultMongo on_result: result=%s" % result)
        if result:
            # insert_one replaces the deprecated Collection.insert (removed in PyMongo 4);
            # inserted_id is the _id of the new document
            insertOk = self.collection.insert_one(result).inserted_id
            print("insertOk=%s" % insertOk)
            # insertOk=5bc45fad7f4d3847b78e8c69


def debugMongoResult():
    mongo = ResultMongo()
    print("mongo=%s" % mongo)
    # mongo=<__main__.ResultMongo object at 0x10b24dcf8>
    dataDict = {
        'authors': ['Sharon Creech'],
        'coverImgUrl': 'https://www.scholastic.com/content5/media/products/66/9780439569866_mres.jpg',
        'description': "I guess it does\nlook like a poem\nwhen you see it\ntyped up\nlike that.\n\nJack hates poetry. Only girls write it and every time he tries to, his brain feels empty. But his teacher, Ms. Stretchberry, won't stop giving her class poetry assignments, and Jack can't avoid them. But then something amazing happens. The more he writes, the more he learns he does have something to say.\n\nWith a fresh and deceptively simple style, acclaimed author Sharon Creech tells a story with enormous heart. Written as a series of free-verse poems from Jack's point of view, Love That Dog shows how one boy finds his own voice with the help of a teacher, a writer, a pencil, some yellow paper, and of course, a dog.",
        'draLevel': '50',
        'genre': 'Fiction',
        'gradeLevelEquivalent': '',
        'grades': ['6-8'],
        'guidedReading': 'T',
        'illustrators': [],
        'isbn13': '9780439569866',
        'lexileMeasure': '1010L',
        'originUrl': 'https://www.scholastic.com/content/scholastic/books2/love-that-dog-by-sharon-creech',
        'pages': 112,
        'recommendations': [
            {
                'title': "Girls' Life Ultimate Guide to Surviving Middle School",
                'url': 'https://www.scholastic.com/content/scholastic/books2/girls-rsquo-life-ultimate-guide-to-surviving-middle-school-by-b'
            },
            {
                'title': "Girls' Life Ultimate Guide To Surviving Middle School",
                'url': 'https://www.scholastic.com/content/scholastic/books2/girls-rsquo-life-ultimate-guide-to-surviving-middle-school-by-b'
            },
            {
                'title': 'The Date to Save',
                'url': 'https://www.scholastic.com/content/scholastic/books2/date-to-save-the-by-stephanie-kate-strohm'
            }
        ],
        'seriesName': '',
        'seriesNumber': 0,
        'tags': ['Poetry Writing', 'School Life'],
        'title': 'Love That Dog',
        'url': 'https://www.scholastic.com/teachers/books/love-that-dog-by-sharon-creech/'
    }
    mongo.on_result(dataDict)


if __name__ == "__main__":
    debugMongoResult()
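Debug output: the values printed during the run are recorded in the comments of the code above.
The corresponding local MongoDB then contains the data.
At this point the goal is basically achieved:
inserting JSON data into the local MongoDB.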
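The local test above never takes the isUseAuth branch of generateMongoUri. As a rough illustration of why that branch wraps the credentials in quote_plus, here is a standalone sketch; `build_auth_uri` and the sample credentials are made up for this example and are not part of the original code:

```python
from urllib.parse import quote_plus


def build_auth_uri(username, password, host="localhost", port=27017):
    """Build a mongodb:// URI with percent-escaped credentials."""
    return "mongodb://%s:%s@%s:%s" % (
        quote_plus(username),
        quote_plus(password),
        host,
        port,
    )


if __name__ == "__main__":
    # characters such as "@", ":" and "/" would otherwise break URI parsing
    print(build_auth_uri("crawler", "p@ss:word/1"))
    # mongodb://crawler:p%40ss%3Aword%2F1@localhost:27017
```

Without the escaping, the "@" inside the password would be read as the separator between the credentials and the host.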
Please credit when reposting: 在路上 » [Solved] Saving JSON data to a local MongoDB on Mac