【Problem】
Using this Python code:
import json
import logging
import re

# wrap the bare keys in quotes, e.g. id: becomes 'id':
photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
logging.debug("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
# JSON only accepts double-quoted strings
photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
logging.debug("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)
# remove comma before end of list
afterRemoveLastCommaInList = re.sub(r",\s*?]", "]", photoInfoJsonDoubleQuote)
logging.debug("afterRemoveLastCommaInList=%s", afterRemoveLastCommaInList)
photoInfoDict = json.loads(afterRemoveLastCommaInList)
to process the page:
http://www.yupoo.com/photos/314159subin/87556247/
specifically this snippet from its HTML:
{id:'34393-87556247',owner:'34393',ownername:'314159subin',title:'雪舞飞扬',description:'1月3日新年伊始,一场大雪让杭城\x26quot;雪舞飞扬\x26quot;.',bucket:'314159subin',key:'CxsKKLUc',license:0,stats_notes: 0,albums: ['34393-1693900',],tags:[{name:'城事', author: '34393'},{name:'西湖', author: '34393'},{name:'纪实', author: '34393'},{name:'街头', author: '34393'},{name:'城市', author: '34393'},{name:'杭州', author: '34393'},{name:'动物', author: '34393'},{name:'下雪喽', author: '34393'},{name:'雪景', author: '34393'}],owner:{id: 34393,username: '314159subin',nickname: '三点一四一屋酒'}}
fails with this error:
    photoInfoDict = json.loads(afterRemoveLastCommaInList);
    return _default_decoder.decode(s)
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 145 (char 145)
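【Solution Process】
1. Clearly, the cause is that JSON does not support the \xXX escape.
So the string has to be preprocessed before it is passed to json.loads.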
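To make the cause concrete, here is a minimal sketch showing that json.loads rejects a \xXX escape inside a string while accepting the equivalent \uXXXX form. The helper name is_valid_json and the shortened sample strings are made up for illustration and are not part of the original script.

```python
import json


def is_valid_json(text):
    """Return True if json.loads accepts text, False if it raises ValueError."""
    try:
        json.loads(text)
        return True
    except ValueError:  # in Python 3, json.JSONDecodeError subclasses ValueError
        return False


# Raw strings keep the backslash literally, just as it appears in the page source.
with_hex_escape = r'{"description": "\x26quot;snow\x26quot;"}'
with_unicode_escape = r'{"description": "\u0026quot;snow\u0026quot;"}'

print(is_valid_json(with_hex_escape))      # False
print(is_valid_json(with_unicode_escape))  # True
```

JSON strings only allow the escapes \" \\ \/ \b \f \n \r \t and \uXXXX, so the JavaScript-style \x26 is what triggers "Invalid \escape".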
2. Reference:
How to convert this string into json format using json.loads
After reading it, I worked out my own code to handle the problem.
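3. Later I found that, in this case, the only thing needed is for
\x26XXX;
to become:
&XXX;
which is simply an HTML entity (\x26 is the hex escape for the & character).
So the following code handles it correctly: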
# http://www.yupoo.com/photos/314159subin/87556247/
# {"id":"34393-87556247","owner":"34393","ownername":"314159subin","title":"雪舞飞扬","description":"1月3日新年伊始,一场大雪让杭城\x26quot;雪舞飞扬\x26quot;.","bucket":"314159subin","key":"CxsKKLUc","license":0,"stats_notes": 0,"albums": ["34393-1693900"],"tags":[{"name":"城事", "author": "34393"},{"name":"西湖", "author": "34393"},{"name":"纪实", "author": "34393"},{"name":"街头", "author": "34393"},{"name":"城市", "author": "34393"},{"name":"杭州", "author": "34393"},{"name":"动物", "author": "34393"},{"name":"下雪喽", "author": "34393"},{"name":"雪景", "author": "34393"}],"owner":{"id": 34393,"username": "314159subin","nickname": "三点一四一屋酒"}}
# turn \x26quot; into the HTML entity &quot;
replacedQuoteJson = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", afterRemoveLastCommaInList)
logging.info("replacedQuoteJson=%s", replacedQuoteJson)
photoInfoDict = json.loads(replacedQuoteJson)
logging.info("photoInfoDict=%s", photoInfoDict)
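The whole pipeline can be tried end to end on a shortened sample. This is only a sketch: the function name js_object_to_dict and the trimmed sample string are hypothetical, and the key-quoting regex is the same simple one used above, so it would also mangle any value that itself contains a word followed by a colon (a URL, for instance).

```python
import json
import re


def js_object_to_dict(text):
    """Convert a simple JavaScript object literal into a Python dict."""
    # Quote the bare keys: id: becomes 'id':
    quoted_keys = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", text)
    # JSON only accepts double-quoted strings.
    double_quoted = quoted_keys.replace("'", '"')
    # Drop the trailing comma before the end of a list.
    no_trailing_comma = re.sub(r",\s*?]", "]", double_quoted)
    # Turn \x26name; into the HTML entity &name;
    entities_fixed = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", no_trailing_comma)
    return json.loads(entities_fixed)


# A trimmed, made-up sample in the same shape as the page data.
sample = (r"{id:'34393-87556247',title:'a\x26quot;b\x26quot;',license:0,"
          r"albums: ['34393-1693900',],owner:{id: 34393,username: '314159subin'}}")

photo_info = js_object_to_dict(sample)
print(photo_info["title"])        # a&quot;b&quot;
print(photo_info["albums"])       # ['34393-1693900']
print(photo_info["owner"]["id"])  # 34393
```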
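【Summary】
When json.loads fails with
ValueError: Invalid \escape
locate the
\xXX sequences in the input
and convert them into whatever form you actually need.
For example,
in this case:
replacedQuoteJson = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", inputJson)
turns:
一场大雪让杭城\x26quot;雪舞飞扬\x26quot;
into:
一场大雪让杭城&quot;雪舞飞扬&quot;
(&quot; is the HTML entity that a browser displays as a double quote), after which json parses the string correctly.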
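If the input may contain \xXX escapes other than \x26, a more general option is to rewrite every \xXX as \u00XX, which JSON does accept, and decode HTML entities only after parsing. The sketch below assumes Python 3 (for html.unescape); the name loads_with_hex_escapes is hypothetical, and the simple regex assumes the input never contains an escaped backslash directly in front of "x" plus two hex digits.

```python
import html
import json
import re

# A backslash, the letter x, then exactly two hex digits.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def loads_with_hex_escapes(text):
    """Parse JSON text that contains JavaScript-style \\xXX escapes."""
    # \x26 becomes \u0026, which is a valid JSON escape for the same character.
    fixed = _HEX_ESCAPE.sub(lambda m: "\\u00" + m.group(1), text)
    return json.loads(fixed)


sample = r'{"description": "\x26quot;snow\x26quot;."}'
info = loads_with_hex_escapes(sample)

print(info["description"])                 # &quot;snow&quot;.
print(html.unescape(info["description"]))  # "snow".
```

Decoding the entities after json.loads rather than before avoids putting a bare double quote inside a JSON string, which would break parsing again.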