[Solved] Migrating a PySpider project to another computer and resuming the crawl

pyspider · crifan
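I have a PySpider project that has been running for a while and has already crawled some data.
The crawled data is also saved in MongoDB.
The goal now is to migrate the whole PySpider environment to another Mac and continue crawling from where it stopped, instead of starting over.
The plan I can think of:
First set up MongoDB on the target Mac.
Export the MongoDB data on the source Mac, then import it into MongoDB on the target Mac.
Then rebuild the pipenv virtual environment on the target Mac and install the libraries.
Then move the whole PySpider data directory over from the source environment.
After that, run PySpider in the target environment; hopefully it resumes where it left off.
-> As long as the data saved in the db files under PySpider's data directory only uses relative paths, this should work in theory.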
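The relative-path assumption can be checked before migrating. Below is a minimal sketch, assuming the data directory holds SQLite files named `*.db` (the file pattern and the search string are my assumptions, not something confirmed above). It counts, per table and column, how many rows contain a given string such as the source machine's home directory.

```python
import sqlite3
from pathlib import Path


def find_text_matches(db_path, needle):
    """Return (table, column, row_count) for each column whose values contain needle."""
    hits = []
    conn = sqlite3.connect(str(db_path))
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            # PRAGMA table_info returns one row per column; index 1 is the column name.
            columns = [row[1] for row in conn.execute(
                'PRAGMA table_info("{}")'.format(table))]
            for column in columns:
                query = ('SELECT COUNT(*) FROM "{}" '
                         'WHERE instr(CAST("{}" AS TEXT), ?) > 0').format(table, column)
                count = conn.execute(query, (needle,)).fetchone()[0]
                if count:
                    hits.append((table, column, count))
    finally:
        conn.close()
    return hits


def scan_data_dir(data_dir, needle):
    """Scan every *.db file in data_dir (assumed layout) for needle."""
    return {path.name: find_text_matches(path, needle)
            for path in sorted(Path(data_dir).glob("*.db"))}
```

For example, `scan_data_dir("data", "/Users/mac")` returning only empty lists would suggest that no absolute path from the source machine is stored in those files. A non-empty result is only a hint to look closer, since a crawled URL could contain the same string by coincidence.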
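First, get MongoDB running on the target Mac:
[Solved] MongoDB is installed on Mac but running mongod fails: exception in initAndListen: NonExistentPath: Data directory /data/db not found
Then:
Source computer: export the MongoDB data
Reference:
[Solved] Exporting local data from MongoDB and then importing it into an online database
Run the export: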
➜  mongodb_migration git:(master) mongodump -d storybook -o .
2018-11-26T11:58:21.944+0800    writing storybook.scholastic to
2018-11-26T11:58:21.946+0800    writing storybook.lexile to
2018-11-26T11:58:21.946+0800    writing storybook.main to
2018-11-26T11:58:23.101+0800    done dumping storybook.lexile (29911 documents)
2018-11-26T11:58:23.353+0800    done dumping storybook.scholastic (51785 documents)
2018-11-26T11:58:24.451+0800    done dumping storybook.main (51785 documents)
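The export step can also be scripted. This is a minimal sketch, assuming `mongodump` is on the PATH; the helper names and the archive name are hypothetical. It builds the same command as above and then zips the resulting `storybook` folder so it can be copied to the other machine as one file.

```python
import shutil
import subprocess


def build_mongodump_cmd(database, out_dir="."):
    """Equivalent of: mongodump -d <database> -o <out_dir>"""
    return ["mongodump", "--db", database, "--out", str(out_dir)]


def dump_database(database, out_dir=".", dry_run=True):
    """Run mongodump, or only return the command when dry_run is True."""
    cmd = build_mongodump_cmd(database, out_dir)
    if not dry_run:
        # check=True raises CalledProcessError if mongodump exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


def archive_dump(dump_root, database, archive_base):
    """Zip <dump_root>/<database>/ and return the path of the created .zip file."""
    return shutil.make_archive(str(archive_base), "zip",
                               root_dir=str(dump_root), base_dir=database)
```

Usage would look like `dump_database("storybook", ".", dry_run=False)` followed by `archive_dump(".", "storybook", "mongodb_storybook")`. The dry-run default is a deliberate choice so that importing the helpers never touches the database by accident.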
Target computer: import the MongoDB data
After copying the dump over, import it:
macdeMacBook-Pro:mongodb_migration mac$ pwd
/Users/mac/working/dev_root/xxx/projects/crawler_projects/crawler_fablexile_book/debug/mongodb_migration
macdeMacBook-Pro:mongodb_migration mac$ ls -lha
total 102928
drwxr-xr-x  5 mac  staff   160B 11 26 13:32 .
drwxr-xr-x  3 mac  staff    96B 11 26 13:31 ..
-rw-r--r--@ 1 mac  staff   6.0K 11 26 13:32 .DS_Store
-rw-r--r--@ 1 mac  staff    50M 11 25 19:59 mongodb_storybook_20181126.zip
drwxr-xr-x@ 8 mac  staff   256B 11 25 19:58 storybook
macdeMacBook-Pro:mongodb_migration mac$ mongorestore -d storybook ./storybook
2018-11-26T13:33:25.550-0800    the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2018-11-26T13:33:25.550-0800    building a list of collections to restore from storybook dir
2018-11-26T13:33:25.552-0800    reading metadata for storybook.main from storybook/main.metadata.json
2018-11-26T13:33:25.553-0800    reading metadata for storybook.scholastic from storybook/scholastic.metadata.json
2018-11-26T13:33:25.553-0800    reading metadata for storybook.lexile from storybook/lexile.metadata.json
2018-11-26T13:33:25.687-0800    restoring storybook.main from storybook/main.bson
2018-11-26T13:33:25.826-0800    restoring storybook.scholastic from storybook/scholastic.bson
2018-11-26T13:33:25.950-0800    restoring storybook.lexile from storybook/lexile.bson
2018-11-26T13:33:27.512-0800    no indexes to restore
2018-11-26T13:33:27.512-0800    finished restoring storybook.lexile (29911 documents)
2018-11-26T13:33:28.407-0800    no indexes to restore
2018-11-26T13:33:28.407-0800    finished restoring storybook.scholastic (51785 documents)
2018-11-26T13:33:28.547-0800    [###################.....]  storybook.main  87.5MB/106MB  (82.6%)
2018-11-26T13:33:29.134-0800    [########################]  storybook.main  106MB/106MB  (100.0%)
2018-11-26T13:33:29.134-0800    no indexes to restore
2018-11-26T13:33:29.134-0800    finished restoring storybook.main (51785 documents)
2018-11-26T13:33:29.134-0800    done
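Two follow-ups to the restore log above. The deprecation warning suggests `--nsInclude` instead of `-d`, and the document counts in the dump and restore logs can be compared automatically. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the regular expression only targets log lines shaped like the ones shown in this post.

```python
import re

# Matches e.g. "done dumping storybook.lexile (29911 documents)"
# and "finished restoring storybook.lexile (29911 documents)".
COUNT_RE = re.compile(
    r"(?:done dumping|finished restoring) (\S+) \((\d+) documents?\)")


def build_mongorestore_cmd(database, dump_root="."):
    """dump_root is the directory that CONTAINS the <database>/ dump folder."""
    return ["mongorestore", "--nsInclude", "{}.*".format(database), str(dump_root)]


def parse_counts(log_text):
    """Map namespace -> document count as reported in a dump or restore log."""
    return {ns: int(count) for ns, count in COUNT_RE.findall(log_text)}


def count_mismatches(dump_log, restore_log):
    """Return {namespace: (dumped, restored)} for namespaces whose counts differ."""
    dumped = parse_counts(dump_log)
    restored = parse_counts(restore_log)
    return {ns: (dumped.get(ns), restored.get(ns))
            for ns in sorted(set(dumped) | set(restored))
            if dumped.get(ns) != restored.get(ns)}
```

An empty result from `count_mismatches` means both logs report the same number of documents for every collection, which is the case for the three `storybook` collections here.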
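Next, check with a GUI tool whether the data was really imported.
Go to:
MongoDB Download Center | MongoDB
and download and install MongoDB Compass.
Then open the local MongoDB in Compass and confirm that the data is correct.
Then the files from the source machine were copied into the data directory on the target Mac.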
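Copying the data directory can be double-checked with checksums. This is a minimal sketch with hypothetical helper names; it assumes PySpider has been stopped on the source machine first, so that the files are not modified during the copy.

```python
import hashlib
import shutil
from pathlib import Path


def file_digests(root):
    """Map each file's path relative to root -> SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            sha = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large db files are not loaded at once.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    sha.update(chunk)
            digests[path.relative_to(root).as_posix()] = sha.hexdigest()
    return digests


def copy_data_dir(src, dst):
    """Copy src to dst (dst must not exist yet); return True if all digests match."""
    shutil.copytree(str(src), str(dst))
    return file_digests(src) == file_digests(dst)
```

When the copy goes over a network share or a USB stick rather than a local path, `file_digests` can be run on each side separately and the two dictionaries compared afterwards.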
Then rebuild the pipenv environment:
macdeMacBook-Pro:projects_git mac$ cd /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book
macdeMacBook-Pro:crawler_fablexile_book mac$ pwd
/Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book
macdeMacBook-Pro:crawler_fablexile_book mac$ ls -l
total 56
-rw-r--r--  1 mac  staff  18272 11 26 13:46 FabLexileBook.py
-rw-r--r--  1 mac  staff    276 11 26 13:46 Pipfile
-rw-r--r--  1 mac  staff   3111 11 26 13:46 README.md
drwxr-xr-x  3 mac  staff     96 11 26 13:46 tools
macdeMacBook-Pro:crawler_fablexile_book mac$ cd /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book
macdeMacBook-Pro:crawler_fablexile_book mac$ pipenv install --skip-lock
Creating a virtualenv for this project…
Pipfile: /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book/Pipfile
Using /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 (3.6.7) to create virtualenv…
⠹Running virtualenv with interpreter /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6
Using base prefix '/Library/Frameworks/Python.framework/Versions/3.6'
New python executable in /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/python3.6
Also creating executable in /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/python
Installing setuptools, pip, wheel...done.

Virtualenv location: /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK
Installing dependencies from Pipfile…
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 5/5 — 00:01:09
To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.
macdeMacBook-Pro:crawler_fablexile_book mac$ pipenv shell
Launching subshell in virtual environment…
bash-3.2$  . /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/activate
(crawler_fablexile_book) bash-3.2$ which python
/Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/python
(crawler_fablexile_book) bash-3.2$ python --version
Python 3.6.7
Now try running pyspider to see whether it can resume:
(crawler_fablexile_book) bash-3.2$ pyspider
[W 181126 13:52:40 run:413] phantomjs not found, continue running without it.
[I 181126 13:52:42 result_worker:49] result_worker starting...
[I 181126 13:52:42 processor:211] processor starting...
^C[I 181126 13:52:42 result_worker:66] result_worker exiting...
[I 181126 13:52:42 processor:229] processor exiting...
It reports that phantomjs was not found, so download and install it.
Following my earlier post:
[Solved] Installing phantomjs on Mac
Install it:
brew tap homebrew/cask
brew cask install phantomjs
Downloading phantomjs took a very long time. The download only went through after I switched the ss proxy to an sg (Singapore) node:
xxx-Mac-2013-Late:~ mac$ brew cask install phantomjs
==> Caveats
phantomjs has been officially discontinued upstream.
It may stop working correctly (or at all) in recent versions of macOS.


==> Satisfying dependencies
==> Downloading 
https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-macosx.zip
==> Downloading from 
https://bbuseruploads.s3.amazonaws.com/fd96ed93-2b32-46a7-9d2b-ecbc0988516a/downloads/8543ae7d-9ac7-43d3-9052-537d63f16d66/phantomjs-2.1.1-
#                                                                          1.7%^C
xxx-Mac-2013-Late:~ mac$
xxx-Mac-2013-Late:~ mac$ brew cask install phantomjs
Updating Homebrew...
==> Caveats
phantomjs has been officially discontinued upstream.
It may stop working correctly (or at all) in recent versions of macOS.


==> Satisfying dependencies
==> Downloading 
https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-macosx.zip
==> Downloading from 
https://bbuseruploads.s3.amazonaws.com/fd96ed93-2b32-46a7-9d2b-ecbc0988516a/downloads/8543ae7d-9ac7-43d3-9052-537d63f16d66/phantomjs-2.1.1-
######################################################################## 100.0%
==> Verifying SHA-256 checksum for Cask 'phantomjs'.
==> Installing Cask phantomjs
==> Creating Caskroom at /usr/local/Caskroom
==> We'll set permissions properly so we won't need sudo in the future.
Password:
Sorry, try again.
Password:
==> Linking Binary 'phantomjs' to '/usr/local/bin/phantomjs'.
🍺  phantomjs was successfully installed!
xxx-Mac-2013-Late:~ mac$ which phantomjs
/usr/local/bin/phantomjs
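A missing tool like this is easier to catch before starting pyspider. Below is a small preflight sketch; the function name is hypothetical and the tool list in the usage note is just my guess at what this migration needs. It reports which executables cannot be found on the PATH.

```python
import shutil


def missing_tools(names):
    """Return the names from `names` that shutil.which cannot find on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `missing_tools(["mongod", "mongorestore", "pipenv", "phantomjs", "pyspider"])` run inside the pipenv shell should return an empty list once everything is installed.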
Then run:
pyspider
Then set the project status to RUNNING,
and the crawl continues.
So far the download speed looks good.

When reposting, please credit: 在路上 » [Solved] Migrating a PySpider project to another computer and resuming the crawl
