【Problem】
Using this Python code:
import json
import logging
import re

# photoInfoJson holds the JavaScript object literal extracted from the page HTML
# wrap each bare key in single quotes
photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.debug("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
# JSON requires double quotes
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.debug("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
# remove comma before end of list
afterRemoveLastCommaInList = re.sub(r",\s*?]", "]", photoInfoJsonDoubleQuote)
logging.debug("afterRemoveLastCommaInList=%s", afterRemoveLastCommaInList)
photoInfoDict = json.loads(afterRemoveLastCommaInList)
to process, from the HTML of the page
http://www.yupoo.com/photos/314159subin/87556247/
the following object literal:
{id:'34393-87556247',owner:'34393',ownername:'314159subin',title:'雪舞飞扬',description:'1月3日新年伊始,一场大雪让杭城\x26quot;雪舞飞扬\x26quot;.',bucket:'314159subin',key:'CxsKKLUc',license:0,stats_notes: 0,albums: ['34393-1693900',],tags:[{name:'城事', author: '34393'},{name:'西湖', author: '34393'},{name:'纪实', author: '34393'},{name:'街头', author: '34393'},{name:'城市', author: '34393'},{name:'杭州', author: '34393'},{name:'动物', author: '34393'},{name:'下雪喽', author: '34393'},{name:'雪景', author: '34393'}],owner:{id: 34393,username: '314159subin',nickname: '三点一四一屋酒'}}
The result is an error:
    photoInfoDict = json.loads(afterRemoveLastCommaInList);
    return _default_decoder.decode(s)
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 145 (char 145)
【Solution Process】
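1. Clearly, the \xXX escape is not supported. JSON strings only allow the escapes \", \\, \/, \b, \f, \n, \r, \t and \uXXXX, so a JavaScript-style \x26 is rejected.
So the string has to be preprocessed before calling json.loads.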
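The failure can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 (where json.JSONDecodeError is a subclass of ValueError); the input string and the helper name try_parse are made up for illustration and are not part of the original code.

```python
import json

# Hypothetical minimal input: otherwise valid JSON that still contains a
# JavaScript-style \x26 escape inside a string value.
bad_json = r'{"description": "snow \x26quot;dance\x26quot;"}'


def try_parse(text):
    """Return the parsed object, or the error message if parsing fails."""
    try:
        return json.loads(text)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return str(err)


# Prints an "Invalid \escape: ..." message instead of a dict.
print(try_parse(bad_json))
```

The exact line and column reported depend on where the first \x sits in the input.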
2. Reference:
How to convert this string into json format using json.loads
Then I worked out my own code to handle it.
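3. I later found that the only case here is
\x26XXX;
which should become:
&XXX;
because 0x26 is the character code of &, so this is simply an HTML entity.
So the following code handles it correctly: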
#{"id":"34393-87556247","owner":"34393","ownername":"314159subin","title":"雪舞飞扬","description":"1月3日新年伊始,一场大雪让杭城\x26quot;雪舞飞扬\x26quot;.","bucket":"314159subin","key":"CxsKKLUc","license":0,"stats_notes": 0,"albums": ["34393-1693900"],"tags":[{"name":"城事", "author": "34393"},{"name":"西湖", "author": "34393"},{"name":"纪实", "author": "34393"},{"name":"街头", "author": "34393"},{"name":"城市", "author": "34393"},{"name":"杭州", "author": "34393"},{"name":"动物", "author": "34393"},{"name":"下雪喽", "author": "34393"},{"name":"雪景", "author": "34393"}],"owner":{"id": 34393,"username": "314159subin","nickname": "三点一四一屋酒"}}
replacedQuoteJson = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", afterRemoveLastCommaInList)
logging.info("replacedQuoteJson=%s", replacedQuoteJson)
photoInfoDict = json.loads(replacedQuoteJson)
logging.info("photoInfoDict=%s", photoInfoDict)
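Putting the four steps together, the whole conversion can be sketched as one function. This is an illustrative sketch assuming Python 3; the function name js_literal_to_dict and the shortened sample string are invented here, while the regular expressions are the ones used in this post.

```python
import json
import re


def js_literal_to_dict(js_text):
    """Convert a simple JavaScript object literal to a dict (fragile sketch)."""
    # 1. wrap bare keys in single quotes
    quoted_keys = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", js_text)
    # 2. JSON only accepts double-quoted strings
    double_quoted = quoted_keys.replace("'", '"')
    # 3. drop the trailing comma before the end of a list
    no_trailing_comma = re.sub(r",\s*?]", "]", double_quoted)
    # 4. turn \x26name; into the HTML entity &name;
    entities_fixed = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", no_trailing_comma)
    return json.loads(entities_fixed)


# Shortened, hypothetical variant of the page data (raw string keeps \x26 literal).
sample = r"{id:'34393-87556247',description:'杭城\x26quot;雪舞飞扬\x26quot;.',license:0,albums: ['34393-1693900',],owner:{id: 34393,username: '314159subin'}}"

info = js_literal_to_dict(sample)
print(info["description"])
print(info["albums"], info["owner"]["id"])
```

The approach is purely textual, so it only suits input shaped like this sample: a value containing a colon after a word, or an apostrophe, would be rewritten incorrectly by steps 1 and 2.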
【Summary】
When json.loads fails with
ValueError: Invalid \escape
locate the
\xXX parts of the input
and convert them into whatever form you need.
For example,
in this case:
replacedQuoteJson = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", inputJson)
turns:
一场大雪让杭城\x26quot;雪舞飞扬\x26quot;
into:
一场大雪让杭城&quot;雪舞飞扬&quot;
after which the JSON parses correctly.
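A more general variant, offered only as a sketch under the assumption of Python 3: rewrite every \xNN as the JSON-legal \u00NN so the parser decodes it itself, then decode HTML entities with html.unescape from the standard library. The function name loads_with_hex_escapes and the sample input are hypothetical.

```python
import html
import json
import re


def loads_with_hex_escapes(text):
    """Parse JSON-like text after rewriting \\xNN escapes as \\u00NN."""
    # Caveat: an escaped backslash followed by "x41" would also be rewritten.
    fixed = re.sub(r"\\x([0-9a-fA-F]{2})", r"\\u00\1", text)
    return json.loads(fixed)


# Hypothetical input: already valid JSON apart from the \x26 escapes.
raw = r'{"description": "杭城\x26quot;雪舞飞扬\x26quot;."}'

parsed = loads_with_hex_escapes(raw)
# After parsing, the value still holds the entity text &quot;
readable = html.unescape(parsed["description"])
print(readable)  # 杭城"雪舞飞扬".
```

Decoding the entities after json.loads rather than before matters: a literal double quote inserted into the text beforehand would terminate the JSON string early.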
Please credit when reposting: 在路上 » [Solved] json.loads error in Python: ValueError: Invalid \escape: line 1 column 145 (char 145)