I have a PySpider project that has been running for a while and has already crawled some data.

The crawled data has also been saved into MongoDB.

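What I want now:
migrate the whole PySpider environment to another Mac,
and have it continue crawling from where it left off, like a resumable download.
The plan I can think of so far:
first set up MongoDB on the target Mac,
then export the MongoDB data from the source Mac and import it into MongoDB on the target Mac.
Next, rebuild the pipenv virtual environment on the target Mac and install the libraries.
Then move the whole PySpider data directory over from the source environment.
After that, run PySpider again in the target environment; hopefully it resumes where it stopped.
-> As long as the data saved in the db files under PySpider's data directory only uses relative paths, this should work in theory.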
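The relative-path assumption above can be checked before migrating. Below is a minimal sketch, assuming the `*.db` files in the data directory are SQLite files (PySpider's default storage; adjust if yours is configured differently). The function names and the `/Users/` prefix are my own illustrative choices, not anything PySpider provides.

```python
import sqlite3
from pathlib import Path


def find_absolute_paths(db_file, prefix="/Users/"):
    """Return (table, column, value) triples whose text value contains `prefix`."""
    hits = []
    conn = sqlite3.connect(str(db_file))
    try:
        # List every user table in this SQLite file.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        for table in tables:
            cursor = conn.execute('SELECT * FROM "{}"'.format(table))
            columns = [desc[0] for desc in cursor.description]
            for row in cursor:
                for column, value in zip(columns, row):
                    # BLOB columns may hold serialized text; decode leniently.
                    if isinstance(value, bytes):
                        value = value.decode("utf-8", errors="replace")
                    if isinstance(value, str) and prefix in value:
                        hits.append((table, column, value))
    finally:
        conn.close()
    return hits


def scan_data_dir(data_dir, prefix="/Users/"):
    """Scan every *.db file directly under data_dir; map file name -> hits."""
    report = {}
    for db_file in sorted(Path(data_dir).glob("*.db")):
        report[db_file.name] = find_absolute_paths(db_file, prefix)
    return report
```

Calling `scan_data_dir("data")` from the project directory returns an empty list for each db file when nothing machine-specific is stored. Any hit is only a hint worth looking at, since a crawled page could itself contain such a string.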
First, deal with:
[Solved] MongoDB is installed on Mac but running mongod fails: exception in initAndListen: NonExistentPath: Data directory /data/db not found
Then:
Source computer: export the MongoDB data
Run:
    ➜  mongodb_migration git:(master) mongodump -d storybook -o .
    2018-11-26T11:58:21.944+0800 writing storybook.scholastic to
    2018-11-26T11:58:21.946+0800 writing storybook.lexile to
    2018-11-26T11:58:21.946+0800 writing storybook.main to
    2018-11-26T11:58:23.101+0800 done dumping storybook.lexile (29911 documents)
    2018-11-26T11:58:23.353+0800 done dumping storybook.scholastic (51785 documents)
    2018-11-26T11:58:24.451+0800 done dumping storybook.main (51785 documents)

Target computer: import the MongoDB data
After copying the data over:

Import it:
    macdeMacBook-Pro:mongodb_migration mac$ pwd
    /Users/mac/working/dev_root/xxx/projects/crawler_projects/crawler_fablexile_book/debug/mongodb_migration
    macdeMacBook-Pro:mongodb_migration mac$ ls -lha
    total 102928
    drwxr-xr-x  5 mac staff 160B 11 26 13:32 .
    drwxr-xr-x  3 mac staff  96B 11 26 13:31 ..
    -rw-r--r--@ 1 mac staff 6.0K 11 26 13:32 .DS_Store
    -rw-r--r--@ 1 mac staff  50M 11 25 19:59 mongodb_storybook_20181126.zip
    drwxr-xr-x@ 8 mac staff 256B 11 25 19:58 storybook
    macdeMacBook-Pro:mongodb_migration mac$ mongorestore -d storybook ./storybook
    2018-11-26T13:33:25.550-0800 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
    2018-11-26T13:33:25.550-0800 building a list of collections to restore from storybook dir
    2018-11-26T13:33:25.552-0800 reading metadata for storybook.main from storybook/main.metadata.json
    2018-11-26T13:33:25.553-0800 reading metadata for storybook.scholastic from storybook/scholastic.metadata.json
    2018-11-26T13:33:25.553-0800 reading metadata for storybook.lexile from storybook/lexile.metadata.json
    2018-11-26T13:33:25.687-0800 restoring storybook.main from storybook/main.bson
    2018-11-26T13:33:25.826-0800 restoring storybook.scholastic from storybook/scholastic.bson
    2018-11-26T13:33:25.950-0800 restoring storybook.lexile from storybook/lexile.bson
    2018-11-26T13:33:27.512-0800 no indexes to restore
    2018-11-26T13:33:27.512-0800 finished restoring storybook.lexile (29911 documents)
    2018-11-26T13:33:28.407-0800 no indexes to restore
    2018-11-26T13:33:28.407-0800 finished restoring storybook.scholastic (51785 documents)
    2018-11-26T13:33:28.547-0800 [###################.....] storybook.main 87.5MB/106MB (82.6%)
    2018-11-26T13:33:29.134-0800 [########################] storybook.main 106MB/106MB (100.0%)
    2018-11-26T13:33:29.134-0800 no indexes to restore
    2018-11-26T13:33:29.134-0800 finished restoring storybook.main (51785 documents)
    2018-11-26T13:33:29.134-0800 done
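A quick sanity check is to compare the per-collection document counts that mongodump and mongorestore printed. Here is a small sketch that parses the two logs shown in this post; the helper names are hypothetical, and the regular expression only assumes the "done dumping ... (N documents)" and "finished restoring ... (N documents)" wording seen above.

```python
import re

# Matches the summary lines printed by mongodump and mongorestore in this post.
COUNT_PATTERN = re.compile(
    r"(?:done dumping|finished restoring)\s+(\S+)\s+\((\d+) documents?\)")

DUMP_LOG = """\
2018-11-26T11:58:23.101+0800 done dumping storybook.lexile (29911 documents)
2018-11-26T11:58:23.353+0800 done dumping storybook.scholastic (51785 documents)
2018-11-26T11:58:24.451+0800 done dumping storybook.main (51785 documents)
"""

RESTORE_LOG = """\
2018-11-26T13:33:27.512-0800 finished restoring storybook.lexile (29911 documents)
2018-11-26T13:33:28.407-0800 finished restoring storybook.scholastic (51785 documents)
2018-11-26T13:33:29.134-0800 finished restoring storybook.main (51785 documents)
"""


def collection_counts(log_text):
    """Map 'db.collection' -> document count found in a dump/restore log."""
    return {name: int(count) for name, count in COUNT_PATTERN.findall(log_text)}


def compare_counts(dump_log, restore_log):
    """Return {collection: (dumped, restored)} for every count that differs."""
    dumped = collection_counts(dump_log)
    restored = collection_counts(restore_log)
    return {
        name: (dumped.get(name), restored.get(name))
        for name in set(dumped) | set(restored)
        if dumped.get(name) != restored.get(name)
    }
```

For the logs in this post, `compare_counts(DUMP_LOG, RESTORE_LOG)` returns an empty dict: all three collections have the same counts on both sides.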
Then check with a tool whether the data was actually imported:
open the local MongoDB and confirm the data is correct.

Then I copied the PySpider data directory over to the target machine.
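To be reasonably sure the data directory arrived intact, one option is to hash every file on both sides and compare. The sketch below is only an illustration with made-up helper names; it assumes both trees are reachable from one machine (for example over a mounted share), otherwise run `tree_digest` on each Mac and compare the two results by hand. It is also safer to stop pyspider before copying, so the db files are not being written mid-copy.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            sha = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large db files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    sha.update(chunk)
            digests[path.relative_to(root).as_posix()] = sha.hexdigest()
    return digests


def diff_trees(source_root, target_root):
    """Return sorted relative paths that are missing or differ between trees."""
    source = tree_digest(source_root)
    target = tree_digest(target_root)
    return sorted(
        name for name in set(source) | set(target)
        if source.get(name) != target.get(name)
    )
```

An empty list from `diff_trees(source_data_dir, target_data_dir)` means every file exists on both sides with identical content.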

Then rebuild the pipenv environment:
    macdeMacBook-Pro:projects_git mac$ cd /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book
    macdeMacBook-Pro:crawler_fablexile_book mac$ pwd
    /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book
    macdeMacBook-Pro:crawler_fablexile_book mac$ ls -l
    total 56
    -rw-r--r-- 1 mac staff 18272 11 26 13:46 FabLexileBook.py
    -rw-r--r-- 1 mac staff   276 11 26 13:46 Pipfile
    -rw-r--r-- 1 mac staff  3111 11 26 13:46 README.md
    drwxr-xr-x 3 mac staff    96 11 26 13:46 tools
    macdeMacBook-Pro:crawler_fablexile_book mac$ cd /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book
    macdeMacBook-Pro:crawler_fablexile_book mac$ pipenv install --skip-lock
    Creating a virtualenv for this project…
    Pipfile: /Users/mac/working/dev_root/xxx/projects_git/crawler_projects/crawler_fablexile_book/Pipfile
    Using /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 (3.6.7) to create virtualenv…
    ⠹Running virtualenv with interpreter /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6
    Using base prefix '/Library/Frameworks/Python.framework/Versions/3.6'
    New python executable in /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/python3.6
    Also creating executable in /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/python
    Installing setuptools, pip, wheel...done.
    Virtualenv location: /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK
    Installing dependencies from Pipfile…
    /5 — 00:01:09
    To activate this project's virtualenv, run pipenv shell.
    Alternatively, run a command inside the virtualenv with pipenv run.
    macdeMacBook-Pro:crawler_fablexile_book mac$ pipenv shell
    Launching subshell in virtual environment…
    bash-3.2$ . /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/activate
    (crawler_fablexile_book) bash-3.2$ which python
    /Users/mac/.local/share/virtualenvs/crawler_fablexile_book-4ZfM-yMK/bin/python
    (crawler_fablexile_book) bash-3.2$ python --version
    Python 3.6.7
Now it is time to try it:
run pyspider and see whether it can resume:
    (crawler_fablexile_book) bash-3.2$ pyspider
    [W 181126 13:52:40 run:413] phantomjs not found, continue running without it.
    [I 181126 13:52:42 result_worker:49] result_worker starting...
    [I 181126 13:52:42 processor:211] processor starting...
    ^C[I 181126 13:52:42 result_worker:66] result_worker exiting...
    [I 181126 13:52:42 processor:229] processor exiting...
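It could not find phantomjs, so I stopped it and went to download and install phantomjs.
Following my own earlier post:
[Solved] Installing phantomjs on Mac
Install it: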
    brew tap homebrew/cask
    brew cask install phantomjs
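Downloading phantomjs took a very long time, though. Only after I switched ss (Shadowsocks) to an sg (Singapore) node did the download go through: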

    xxx-Mac-2013-Late:~ mac$ brew cask install phantomjs
    ==> Caveats
    phantomjs has been officially discontinued upstream.
    It may stop working correctly (or at all) in recent versions of macOS.

    ==> Satisfying dependencies
    ==> Downloading https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-macosx.zip
    ==> Downloading from https://bbuseruploads.s3.amazonaws.com/fd96ed93-2b32-46a7-9d2b-ecbc0988516a/downloads/8543ae7d-9ac7-43d3-9052-537d63f16d66/phantomjs-2.1.1-
    #                                                                          1.7%^C
    xxx-Mac-2013-Late:~ mac$
    xxx-Mac-2013-Late:~ mac$ brew cask install phantomjs
    Updating Homebrew...
    ==> Caveats
    phantomjs has been officially discontinued upstream.
    It may stop working correctly (or at all) in recent versions of macOS.

    ==> Satisfying dependencies
    ==> Downloading https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-macosx.zip
    ==> Downloading from https://bbuseruploads.s3.amazonaws.com/fd96ed93-2b32-46a7-9d2b-ecbc0988516a/downloads/8543ae7d-9ac7-43d3-9052-537d63f16d66/phantomjs-2.1.1-
    ######################################################################## 100.0%
    ==> Verifying SHA-256 checksum for Cask 'phantomjs'.
    ==> Installing Cask phantomjs
    ==> Creating Caskroom at /usr/local/Caskroom
    ==> We'll set permissions properly so we won't need sudo in the future.
    Password:
    Sorry, try again.
    Password:
    ==> Linking Binary 'phantomjs' to '/usr/local/bin/phantomjs'.
    xxx-Mac-2013-Late:~ mac$ which phantomjs
    /usr/local/bin/phantomjs
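The same kind of check as `which phantomjs` can be done from Python for all the command-line tools this migration relies on, before starting pyspider on the new machine. This is a minimal sketch; `missing_tools` and its default tool list are my own illustrative choices, based only on the commands used in this post.

```python
import shutil


def missing_tools(tools=("mongod", "pipenv", "phantomjs")):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_tools()` means all three executables are on the PATH; any name it returns still needs to be installed or added to the PATH.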
Then run:
    pyspider
and set the project's status to RUNNING:

After that, the crawl continues from where it stopped.
So far the download speed looks pretty good.

Please credit the source when reposting: 在路上 » [Solved] Migrating a PySpider project to another computer and resuming the crawl