某台服务器中,一直正在正常运行着MongoDB的服务:
mongod
但是突然某次发现一个系统无法使用了,去看log发现Flask中报错是mongodb的问题:
<code>[2018-11-30 20:53:12,763 ERROR 21362 MainProcess 140539632912712 DummyThread-952 app.py:1761 log_exception] Exception on /storybook [GET] Traceback (most recent call last): File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/flask/app.py", line 1811, in full_dispatch_request rv = self.preprocess_request() File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/flask/app.py", line 2087, in preprocess_request rv = func() File "/root/xxx/app.py", line 103, in collectApiAccessLog insertedLog = logCollection.insert_one(requestInfoDict) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/collection.py", line 693, in insert_one session=session), File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/collection.py", line 607, in _insert bypass_doc_val, session) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/collection.py", line 595, in _insert_one acknowledged, _insert_command, session) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1242, in _retryable_write with self._tmp_session(session) as s: File "/root/miniconda3/lib/python3.6/contextlib.py", line 81, in __enter__ return next(self.gen) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1571, in _tmp_session s = self._ensure_session(session) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1558, in _ensure_session return self.__start_session(True, causal_consistency=False) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1511, in __start_session server_session = self._get_server_session() File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1544, in _get_server_session return self._topology.get_server_session() File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/topology.py", line 427, in get_server_session None) File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/topology.py", line 199, in _select_servers_loop self._error_message(selector)) pymongo.errors.ServerSelectionTimeoutError: localhost:32018: [Errno 111] Connection refused </code>
所以去找找什么原因
先去看看状态:
发现是running exited
然后尝试重启,结果报错
然后再去看状态就是failed的了:
<code>[root@xx-general-01 logs]# service mongod status ● mongod.service - SYSV: Mongo is a scalable, document-oriented database. Loaded: loaded (/etc/rc.d/init.d/mongod; bad; vendor preset: disabled) Active: failed (Result: exit-code) since Fri 2018-11-30 20:25:22 CST; 52min ago Docs: man:systemd-sysv-generator(8) Process: 4391 ExecStop=/etc/rc.d/init.d/mongod stop (code=exited, status=0/SUCCESS) Process: 4408 ExecStart=/etc/rc.d/init.d/mongod start (code=exited, status=1/FAILURE) Nov 30 20:25:22 xx-general-01 systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database.... Nov 30 20:25:22 xx-general-01 runuser[4419]: pam_unix(runuser:session): session opened for user mongod by (uid=0) Nov 30 20:25:22 xx-general-01 mongod[4408]: Starting mongod: [FAILED] Nov 30 20:25:22 xx-general-01 systemd[1]: mongod.service: control process exited, code=exited status=1 Nov 30 20:25:22 xx-general-01 systemd[1]: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Nov 30 20:25:22 xx-general-01 systemd[1]: Unit mongod.service entered failed state. Nov 30 20:25:22 xx-general-01 systemd[1]: mongod.service failed. </code>
pymongo.errors.ServerSelectionTimeoutError localhost Errno 111 Connection refused
还是要去看看原因:
<code>[root@xx-general-01 logs]# service mongod restart Restarting mongod (via systemctl): Job for mongod.service failed because the control process exited with error code. See "systemctl status mongod.service" and "journalctl -xe" for details. [FAILED] [root@xx-general-01 logs]# journalctl -xe -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Documentation: http://www.freedesktop.org/wiki/Software/systemd/multiseat -- -- A new session with the ID 34727 has been created for the user root. -- -- The leading process of the session is 4510. Nov 30 21:14:59 x-general-01 systemd[1]: Started Session 34727 of user root. -- Subject: Unit session-34727.scope has finished start-up -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit session-34727.scope has finished starting up. -- -- The start-up result is done. Nov 30 21:14:59 xx-general-01 systemd[1]: Starting Session 34727 of user root. -- Subject: Unit session-34727.scope has begun start-up -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit session-34727.scope has begun starting up. Nov 30 21:14:59 xx-general-01 sshd[4510]: pam_unix(sshd:session): session opened for user root by (uid=0) Nov 30 21:18:45 xx-general-01 polkitd[505]: Registered Authentication Agent for unix-process:4568:1729661674 (system bus name :1.69510 [/usr/bin/pkttyagent --notify-fd 5 --fallbac Nov 30 21:18:45 xx-general-01 systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database.... -- Subject: Unit mongod.service has begun start-up -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit mongod.service has begun starting up. Nov 30 21:18:45 xx-general-01 runuser[4586]: pam_unix(runuser:session): session opened for user mongod by (uid=0) Nov 30 21:18:45 xx-general-01 runuser[4586]: pam_unix(runuser:session): session closed for user mongod Nov 30 21:18:45 xx-general-01 mongod[4574]: Starting mongod: [FAILED] Nov 30 21:18:45 xx-general-01 systemd[1]: mongod.service: control process exited, code=exited status=1 Nov 30 21:18:45 xx-general-01 systemd[1]: Failed to start SYSV: Mongo is a scalable, document-oriented database.. -- Subject: Unit mongod.service has failed -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Unit mongod.service has failed. -- -- The result is failed. Nov 30 21:18:45 xx-general-01 systemd[1]: Unit mongod.service entered failed state. Nov 30 21:18:45 x-general-01 systemd[1]: mongod.service failed. Nov 30 21:18:45 x-general-01 polkitd[505]: Unregistered Authentication Agent for unix-process:4568:1729661674 (system bus name :1.69510, object path /org/freedesktop/PolicyKit1/A [root@xx-general-01 logs]# </code>
搜:
Failed to start SYSV: Mongo is a scalable, document-oriented database
Can not start MongoDB 3.2.1 on CentOS 7 – Stack Overflow
MongoDB fails starting as a service on CentOS 7 – Stack Overflow
好像是文件夹权限不对了?
Unable to start MongoDB 3.0.2 service on CentOS 7 – Stack Overflow
Can’t start MongoDB on CentOS7 – Stack Overflow
先去看看:
<code>[root@xxx-general-01 logs]# ll /var/lib/mongod ls: cannot access /var/lib/mongod: No such file or directory [root@xxx-general-01 logs]# ll /var/lib/mongo/ total 15010456 -rw-r--r-- 1 mongod mongod 14888894464 Aug 28 11:40 collection-0-2140692489680982673.wt -rw-r--r-- 1 mongod mongod 20480 Aug 28 11:40 collection-0--2650623679816886832.wt -rw-r--r-- 1 mongod mongod 40960 Aug 28 11:41 collection-0--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 397312 Nov 30 20:13 collection-10-733562134066934475.wt -rw-r--r-- 1 mongod mongod 4096 Nov 6 14:47 collection-12-733562134066934475.wt -rw-r--r-- 1 mongod mongod 876544 Aug 28 11:40 collection-2-2140692489680982673.wt -rw-r--r-- 1 mongod mongod 16384 Aug 28 11:40 collection-2--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 291508224 Aug 28 11:40 collection-2--5069085438818245657.wt -rw-r--r-- 1 mongod mongod 16384 Aug 28 11:40 collection-4--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 147787776 Nov 23 14:55 collection-4-733562134066934475.wt -rw-r--r-- 1 mongod mongod 36864 Nov 1 18:53 collection-6--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 36298752 Oct 30 13:59 collection-6-733562134066934475.wt drwxr-xr-x 2 mongod mongod 4096 Nov 30 18:21 diagnostic.data -rw-r--r-- 1 mongod mongod 258048 Sep 19 11:09 index-0--2784828041412477720.wt -rw-r--r-- 1 mongod mongod 102400 Nov 30 20:13 index-11-733562134066934475.wt -rw-r--r-- 1 mongod mongod 684032 Sep 19 11:09 index-1-2140692489680982673.wt -rw-r--r-- 1 mongod mongod 20480 Apr 8 2018 index-1--2650623679816886832.wt -rw-r--r-- 1 mongod mongod 36864 Aug 28 11:41 index-1--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 4096 Nov 6 14:47 index-13-733562134066934475.wt -rw-r--r-- 1 mongod mongod 462848 Nov 20 23:50 index-14-733562134066934475.wt -rw-r--r-- 1 mongod mongod 294912 Nov 6 15:36 index-15-733562134066934475.wt -rw-r--r-- 1 mongod mongod 94208 Aug 28 11:43 index-3-2140692489680982673.wt -rw-r--r-- 1 mongod mongod 16384 Apr 2 2018 index-3--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 512000 Oct 31 11:27 index-3--5069085438818245657.wt -rw-r--r-- 1 mongod mongod 663552 Aug 28 11:43 index-4-2140692489680982673.wt -rw-r--r-- 1 mongod mongod 16384 Aug 28 11:40 index-5--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 479232 Oct 30 13:59 index-5-733562134066934475.wt -rw-r--r-- 1 mongod mongod 36864 Nov 1 18:53 index-7--3691121321241124749.wt -rw-r--r-- 1 mongod mongod 815104 Oct 30 13:59 index-7-733562134066934475.wt -rw-r--r-- 1 mongod mongod 36864 Nov 1 18:53 index-8--3691121321241124749.wt drwxr-xr-x 2 mongod mongod 4096 Nov 23 14:54 journal -rw-r--r-- 1 mongod mongod 36864 Nov 6 14:54 _mdb_catalog.wt -rw-r--r-- 1 mongod mongod 6 Aug 28 11:40 mongod.lock -rw-r--r-- 1 mongod mongod 36864 Nov 30 17:34 sizeStorer.wt -rw-r--r-- 1 mongod mongod 95 Mar 20 2018 storage.bson -rw-r--r-- 1 mongod mongod 49 Mar 20 2018 WiredTiger -rw-r--r-- 1 mongod mongod 4096 Aug 28 11:40 WiredTigerLAS.wt -rw-r--r-- 1 mongod mongod 21 Mar 20 2018 WiredTiger.lock -rw-r--r-- 1 mongod mongod 1011 Nov 30 17:35 WiredTiger.turtle -rw-r--r-- 1 mongod mongod 102400 Nov 30 20:13 WiredTiger.wt </code>
好像记得:不是用service去启动mongod的?
去找找,此处是如何运行mongod的
参考之前:
【无需解决】mongod的状态中active但却existed是什么意思
去看看log:
<code>cd /var/log/mongodb/ tail mongod.log </code>
和:
【已解决】mongo启动失败:connection /var/lib/mongo/WiredTiger.turtle handle-open open Permission denied
中的:
<code>/etc/mongod.conf </code>
后来发现之前是:
即可正常(在当前登录用户root的情况下,换用mongd的用户去)启动mongod:
<code>sudo -u mongod mongod -f /etc/mongod.conf </code>
去启动mongod的
先去看看log:
/var/log/mongodb/mongod.log
然后下载下来打开后,无意间发现:
<code>2018-11-30T20:13:45.560+0800 E STORAGE [thread1] WiredTiger (28) [1543580025:560859][15915:0x7f5d0df0b700], file:WiredTiger.wt, WT_SESSION.checkpoint: /var/lib/mongo/WiredTiger.wt: handle-write: pwrite: failed to write 28672 bytes at offset 102400: No space left on device 2018-11-30T20:13:45.561+0800 E STORAGE [thread1] WiredTiger (28) [1543580025:561033][15915:0x7f5d0df0b700], checkpoint-server: checkpoint server error: No space left on device 2018-11-30T20:13:45.561+0800 E STORAGE [thread1] WiredTiger (-31804) [1543580025:561047][15915:0x7f5d0df0b700], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic </code>
然后再去看磁盘空间,真的被用完了:
<code>[root@xx-general-01 mongodb]# df -h Filesystem Size Used Avail Use% Mounted on /dev/vda1 50G 47G 0 100% / devtmpfs 7.8G 0 7.8G 0% /dev tmpfs 7.8G 0 7.8G 0% /dev/shm tmpfs 7.8G 532K 7.8G 1% /run tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup tmpfs 1.6G 0 1.6G 0% /run/user/0 </code>
然后去删了个1.5G的文件。
结果再去看的
df -h
Avail还是0,Use还是100%
后来再去删了500多MB的文件,再去看最新df,然后终于有空出来的空间了:
<code>[root@xx-general-01 origin_data]# df -lh Filesystem Size Used Avail Use% Mounted on /dev/vda1 50G 47G 431M 100% / devtmpfs 7.8G 0 7.8G 0% /dev tmpfs 7.8G 0 7.8G 0% /dev/shm tmpfs 7.8G 532K 7.8G 1% /run tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup tmpfs 1.6G 0 1.6G 0% /run/user/0 </code>
然后再去重启mongod,就可以了:
<code>[root@xx-general-01 origin_data]# service mongod restart Restarting mongod (via systemctl): [ OK ] [root@xx-general-01 origin_data]# service mongod status ● mongod.service - SYSV: Mongo is a scalable, document-oriented database. Loaded: loaded (/etc/rc.d/init.d/mongod; bad; vendor preset: disabled) Active: active (running) since Fri 2018-11-30 21:49:51 CST; 4s ago Docs: man:systemd-sysv-generator(8) Process: 4391 ExecStop=/etc/rc.d/init.d/mongod stop (code=exited, status=0/SUCCESS) Process: 4693 ExecStart=/etc/rc.d/init.d/mongod start (code=exited, status=0/SUCCESS) CGroup: /system.slice/mongod.service └─4708 /usr/bin/mongod -f /etc/mongod.conf Nov 30 21:49:50 xx-general-01 systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database.... Nov 30 21:49:50 xx-general-01 runuser[4704]: pam_unix(runuser:session): session opened for user mongod by (uid=0) Nov 30 21:49:51 x-general-01 runuser[4704]: pam_unix(runuser:session): session closed for user mongod Nov 30 21:49:51 x-general-01 mongod[4693]: Starting mongod: [ OK ] Nov 30 21:49:51 x-general-01 systemd[1]: Started SYSV: Mongo is a scalable, document-oriented database.. [root@xx-general-01 origin_data]# </code>
然后之前一个项目xx产生的log:
logs/development.log
太多:
所以去修改代码,去掉很多
logger.debug和logger.info的代码。
最终减少了不少log。
不过,此处还是由于之前同事设计有问题,导致django的日志中有mysql的操作的log,其中某个操作的查询操作太弱智:
会产生大量的util的log
-》导致随随便便用户使用几次,就会产生大量的log:
所以,也是无法再继续优化了。
只能想办法以后重构这种垃圾代码了。
【总结】
此处某web应用无法正常使用,表面现象看起来是,通过log查到的:
<code>pymongo.errors.ServerSelectionTimeoutError: localhost:32018: [Errno 111] Connection refused </code>
而再去通过:
<code>service mongod restart </code>
却无法重启MongoDB,会失败。
其表面现象看到的是:
<code>Failed to start SYSV: Mongo is a scalable, document-oriented database.. </code>
实际上找到MongoDB的log(此处的:
<code>/var/log/mongodb/mongod.log </code>
),从而发现:
<code>E STORAGE [thread1] WiredTiger ... failed to write 28672 bytes at offset 102400: No space left on device </code>
很明显是磁盘空间不够了。
-》其中的:
E STORAGE
也表示是存储方面的错误。
-》然后通过
<code>df -hl </code>
看到是当前分区可用空间是0了。
-》最后是去删除掉一些无用数据,腾出2G空间
-》但是注意:
刚删除完毕后,去用df看到的结果没变化,可用空间还是0,要稍等一会,才能更新
-》之后再去:
<code>service mongod restart </code>
就可以了。
转载请注明:在路上 » 【已解决】MongoDB服务报错:pymongo.errors.ServerSelectionTimeoutError localhost Errno 111 Connection refused