
【Solved】Updating the title field of picture-book records in the MongoDB used by the picture-book search

Category: MongoDB. Author: crifan.
Problem:
I looked into why a title was garbled:
the data was already garbled when it was originally crawled.
Searching in MongoDB Compass with:
{"title": {$regex: "Chicka Chicka .*"}}
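found the affected record.
Possible fixes:
1. Modify the crawler code and crawl again
Pro: fixes the problem in bulk. Con: takes longer, since it means changing the code, running the script, and updating the backend data again.
2. Fix the few garbled records already found, directly in the database
Pro: relatively quick. Con: each record has to be found and fixed one at a time; no one-shot bulk fix.
I had occasionally spotted one of these before, so overall they are rare.
Conclusion: use option 2 for now to fix the few known cases.
The 兰斯 (Lansi) data has not been merged yet either; if the problem shows up again after the merge, I will consider a different solution.
So now to update the data in MongoDB.
At first I wanted to rewrite the whole document, but then realized:
only the title field of a single document needs updating.
So the plan is:
query for the document first, then update its title by _id.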
mongodb update field
$set — MongoDB Manual
db.collection.update() — MongoDB Manual
Field Update Operators — MongoDB Manual
How to update a single field in a MongoDB collection for all documents matching a specific criteria
Update field in exact element array in MongoDB – Stack Overflow
Try it on the local database first, then repeat it on the online one:
> db.main.find({"title": {"$regex": "Chicka Chicka .*"}}).pretty()
That found the record I wanted:
{
    "_id" : ObjectId("5bd7beecbfaa44fe2c73e73f"),
    "url" : "https://www.scholastic.com/teachers/books/chicka-chicka-1-2-3-by-bill-martin-jr/",
    "title" : "Chicka Chicka 1â¢2â¢3",
    "description" : "This spectacular follow-up to the bestselling Chicka Chicka Boom Boom is the essential book for any child learning to count.\n\n1 told 2 and 2 told 3 \"I'll race you to the top of the apple tree.\"\n\nOne hundred and one numbers climb the apple tree in this bright, rollicking, joyous book for young children. As the numerals pile up and bumblebees threaten, what's the number that saves the day? (Hint: It rhymes with \"hero.\")\n\nRead and count and play and laugh to learn the surprising answer.",
    "coverImgUrl" : "https://www.scholastic.com/content5/media/products/72/9780439731072_mres.jpg",
Then confirm it can also be found by _id:
> db.main.find({"_id": ObjectId("5bd7beecbfaa44fe2c73e73f")}).pretty()
{
    "_id" : ObjectId("5bd7beecbfaa44fe2c73e73f"),
    "url" : "https://www.scholastic.com/teachers/books/chicka-chicka-1-2-3-by-bill-martin-jr/",
    "title" : "Chicka Chicka 1â¢2â¢3",
Now to update it.
Frustratingly, after this update only title was left (besides _id):
> db.main.update({"_id": ObjectId("5bd7beecbfaa44fe2c73e73f")}, {"title": "Chicka Chicka 1,2,3"})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.find({"_id": ObjectId("5bd7beecbfaa44fe2c73e73f")}).pretty()
{
    "_id" : ObjectId("5bd7beecbfaa44fe2c73e73f"),
    "title" : "Chicka Chicka 1,2,3"
}
>
Try $set instead:
mongodb set vs update
$set — MongoDB Manual
The $set operator replaces the value of a field with the specified value.
That looks like exactly what I need: update one field and keep the others.
MongoDB: Update/Upsert vs Insert – Stack Overflow
Update Operators — MongoDB Manual
db.collection.update() — MongoDB Manual
“Modifies an existing document or documents in a collection. The method can modify specific fields of an existing document or documents or replace an existing document entirely, depending on the update parameter.
By default, the update() method updates a single document.”
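So update() can either modify specific fields or replace the whole document.
Which one happens depends on the update parameter: a plain {field: value} document replaces the whole document (only _id is kept), while update operators such as $set modify just the named fields. The "by default" sentence in the manual is about updating a single matching document, not about replacement.
The manual's example: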
db.books.update(
   { _id: 1 },
   {
     $inc: { stock: 5 },
     $set: {
       item: "ABC123",
       "info.publisher": "2222",
       tags: [ "software" ],
       "ratings.1": { by: "xyz", rating: 3 }
     }
   }
)
So the right approach is:
db.books.update
with $set inside it to update the specific field.
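The difference between the two update shapes can be modelled in plain JavaScript without touching a database. This is only a rough sketch: applyUpdate is a hypothetical helper, not a MongoDB API, it models nothing but $set, and the sample document is shortened.

```javascript
// Plain-JavaScript model of the two update shapes; no MongoDB connection involved.
// applyUpdate is a hypothetical helper, not part of the mongo shell or any driver.
function applyUpdate(doc, update) {
  const keys = Object.keys(update);
  const usesOperators = keys.length > 0 && keys.every((k) => k.startsWith("$"));
  if (!usesOperators) {
    // Replacement: every field except _id is discarded.
    // (Mixing operators with plain fields is not modelled here.)
    return { _id: doc._id, ...update };
  }
  // Only $set with top-level field names is modelled.
  return { ...doc, ...(update.$set || {}) };
}

const original = {
  _id: "5bd7beecbfaa44fe2c73e73f",
  url: "https://www.scholastic.com/teachers/books/chicka-chicka-1-2-3-by-bill-martin-jr/",
  title: "garbled title",
};

// First attempt from above: a plain document wipes out url.
const replaced = applyUpdate(original, { title: "Chicka Chicka 1,2,3" });
// Second attempt: $set changes title and leaves url alone.
const patched = applyUpdate(original, { $set: { title: "Chicka Chicka 1,2,3" } });

console.log(Object.keys(replaced)); // [ '_id', 'title' ]
console.log(Object.keys(patched)); // [ '_id', 'url', 'title' ]
```

The first call mirrors the update that left only title, the second mirrors the $set form.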
【Summary】
In the end this form works:
> db.main.update({"_id": ObjectId("5bd7beebbfaa44fe2c73e722")}, {$set: {"title": "Chicka Chicka new title"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

> db.main.find({"_id": ObjectId("5bd7beebbfaa44fe2c73e722")}).pretty()
{
    "_id" : ObjectId("5bd7beebbfaa44fe2c73e722"),
    "url" : "https://www.scholastic.com/teachers/books/chicka-chicka-sticka-sticka-by-bill-martin-jr/",
    "title" : "Chicka Chicka new title",
...
The update succeeded and changed only title; the other fields are untouched.
So now do the same update in the online MongoDB:
[root@xxx-general-01 ~]# mongo storybook --host localhost --port xxx -u storybook -p xxx --authenticationDatabase storybook
MongoDB shell version: 3.2.19
connecting to: localhost:32018/storybook
> show collections
collection
main
scholastic
> db.main.find({"title": {"$regex": "Chicka Chicka .*"}}).pretty()
...
{
        "_id" : ObjectId("5bd7beecbfaa44fe2c73e73f"),
        "url" : "https://www.scholastic.com/teachers/books/chicka-chicka-1-2-3-by-bill-martin-jr/",
        "title" : "Chicka Chicka 1â¢2â¢3"
...

> db.main.update({"_id": ObjectId("5bd7beecbfaa44fe2c73e73f")}, {$set: {"title": "Chicka Chicka 1,2,3"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.find({"_id": ObjectId("5bd7beecbfaa44fe2c73e73f")}).pretty()
{
        "_id" : ObjectId("5bd7beecbfaa44fe2c73e73f"),
        "url" : "https://www.scholastic.com/teachers/books/chicka-chicka-1-2-3-by-bill-martin-jr/",
        "title" : "Chicka Chicka 1,2,3",
...
【Postscript】
Next, check whether any other titles are garbled:
db.main.find({"title": {"$regex": ".*â¢.*"}})
That returned nothing.
But:
> db.main.find({"title": {"$regex": ".*â.*"}})
{ "_id" : ObjectId("5bd7bd65bfaa44fe2c738260"), "url" : "https://www.scholastic.com/teachers/books/when-a-line-bends--a-shape-begins-by-rhonda-gowler-greene/", "title" : "When a Line Bends⦠A Shape Begins", "description" ...
does find matches.
> db.main.find({"title": {"$regex": ".*â.*"}}).length()
6
6 in total.
> db.main.find({"title": {"$regex": ".*¢.*"}})
>
Nothing.
For these:
> db.main.find({"title": {"$regex": ".*â.*"}})
{ "_id" : ObjectId("5bd7bd65bfaa44fe2c738260"), "url" : "https://www.scholastic.com/teachers/books/when-a-line-bends--a-shape-begins-by-rhonda-gowler-greene/", "title" : "When a Line Bends⦠A Shape Begins", "description" : ...
{ "_id" : ObjectId("5bd7bd9cbfaa44fe2c7393e7"), "url" : "https://www.scholastic.com/teachers/books/reading-response-trifolds-for-40-popular-nonfiction-books-grade/", "title" : "Reading Response Trifolds for 40 Popular Nonfiction Books: Grades 2â3", ...
{ "_id" : ObjectId("5bd7be00bfaa44fe2c73b1f7"), "url" : "https://www.scholastic.com/teachers/books/it-s-all-about-us-especially-me--by-karen-phillips/", "title" : "It's All About Us (â¦Especially Me!)", ...
{ "_id" : ObjectId("5bd7be17bfaa44fe2c73b6c0"), "url" : "https://www.scholastic.com/teachers/books/i-heart-band-by-michelle-schusterman/", "title" : "I ⥠Band!", ...
{ "_id" : ObjectId("5bd7bed5bfaa44fe2c73e0b3"), "url" : "https://www.scholastic.com/teachers/books/hi-lo-passages-to-build-comprehension-grades-56-by-michael-prie/", "title" : "Hi-Lo Passages to Build Comprehension: Grades 5â6", ...
{ "_id" : ObjectId("5bd7bf69bfaa44fe2c740c17"), "url" : "https://www.scholastic.com/teachers/books/50-skill-building-pyramid-puzzles-math-grades-4-6-by-immacula/", "title" : "50 Skill-Building Pyramid Puzzles: Math: Grades 4â6", ...
update the title of each of the 6:
> db.main.update({"_id": ObjectId("5bd7bd65bfaa44fe2c738260")}, {$set: {"title": "When a Line Bends… A Shape Begins"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.update({"_id": ObjectId("5bd7bd9cbfaa44fe2c7393e7")}, {$set: {"title": "Reading Response Trifolds for 40 Popular Nonfiction Books: Grades 2–3"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.update({"_id": ObjectId("5bd7be00bfaa44fe2c73b1f7")}, {$set: {"title": "It's All About Us (…Especially Me!)"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.update({"_id": ObjectId("5bd7be17bfaa44fe2c73b6c0")}, {$set: {"title": "I ♥ Band!"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.update({"_id": ObjectId("5bd7bed5bfaa44fe2c73e0b3")}, {$set: {"title": "Hi-Lo Passages to Build Comprehension: Grades 5–6"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.main.update({"_id": ObjectId("5bd7bf69bfaa44fe2c740c17")}, {$set: {"title": "50 Skill-Building Pyramid Puzzles: Math: Grades 4–6"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
Done.
For the titles above, each correct character maps to its garbled form as follows:
… -> â¦
– -> â
♥ -> â¥
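These three mappings match what happens when UTF-8 bytes are decoded as Latin-1 (ISO-8859-1). That is an assumption about the crawler, not something confirmed above. If it holds, each garbled title also contains an invisible control character (U+0080, U+0093 or U+0099) that does not show in the shell output, and the damage can be reversed in code instead of retyping each title. A minimal Node.js sketch, with repairLatin1Mojibake as a hypothetical helper name:

```javascript
// Hypothetical helper: undo "UTF-8 bytes decoded as Latin-1" mojibake.
// Assumes the stored string still contains the invisible C1 control characters.
function repairLatin1Mojibake(text) {
  // Latin-1 maps each character to exactly one byte, so a string with
  // anything above U+00FF cannot have come from this kind of mis-decoding.
  for (const ch of text) {
    if (ch.codePointAt(0) > 0xff) return text;
  }
  // Turn the characters back into the original bytes, then decode as UTF-8.
  const repaired = Buffer.from(text, "latin1").toString("utf8");
  // U+FFFD means the bytes were not valid UTF-8; keep the original then.
  return repaired.includes("\ufffd") ? text : repaired;
}

// The \u0080, \u0093 and \u0099 escapes are the assumed invisible characters.
console.log(repairLatin1Mojibake("When a Line Bends\u00e2\u0080\u00a6 A Shape Begins"));
console.log(repairLatin1Mojibake("Grades 2\u00e2\u0080\u00933"));
console.log(repairLatin1Mojibake("I \u00e2\u0099\u00a5 Band!"));
```

Strings that are already correct, or that do not decode cleanly as UTF-8, are returned unchanged, so the helper should be safe to run over titles that were never garbled.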
Search for the other garbled fragments too:
> db.main.find({"title": {"$regex": ".*¦.*"}})
> db.main.find({"title": {"$regex": ".*¥.*"}})
>
None left for now.
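Searching one fragment at a time can miss cases. As a rough alternative, a single pattern can flag any title that looks like UTF-8 read as Latin-1. The sketch below runs over documents already loaded into memory; findSuspectTitles, MOJIBAKE_PATTERN and the sample data are hypothetical, and the pattern is a heuristic that could in principle flag a legitimate title.

```javascript
// Heuristic: a Latin-1 character in the UTF-8 lead-byte range (0xC2-0xF4)
// followed by one or more characters in the continuation range (0x80-0xBF).
const MOJIBAKE_PATTERN = /[\u00c2-\u00f4][\u0080-\u00bf]+/;

// Hypothetical helper: keep only documents whose title matches the pattern.
function findSuspectTitles(docs) {
  return docs.filter(
    (doc) => typeof doc.title === "string" && MOJIBAKE_PATTERN.test(doc.title)
  );
}

// Made-up sample documents; only _id 2 carries a mis-decoded en dash.
const sample = [
  { _id: 1, title: "Chicka Chicka 1,2,3" },
  { _id: 2, title: "Grades 5\u00e2\u0080\u00936" },
  { _id: 3, title: "Caf\u00e9 Stories" },
  { _id: 4, title: "I \u2665 Band!" },
];

console.log(findSuspectTitles(sample).map((d) => d._id)); // [ 2 ]
```

A correctly stored accented letter such as the é in the third sample is not flagged, because no continuation-range character follows it.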
Separately, when I have time:
【Record】Back up the online dev MongoDB and restore it locally

When reposting, please credit: 在路上 » 【Solved】Updating the title field of picture-book records in the MongoDB used by the picture-book search
