
【Solved】json.loads error in Python: ValueError: Invalid \escape: line 1 column 145 (char 145)

JSON crifan 15582 views 0 comments

【Problem】

Using this Python code:

        # Requires: import json, logging, re
        # Quote the bare keys, e.g. id: becomes 'id':
        photoInfoJsonAddQuote = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", photoInfoJson)
        logging.debug("photoInfoJsonAddQuote=%s", photoInfoJsonAddQuote)
        # JSON only accepts double-quoted strings
        photoInfoJsonDoubleQuote = photoInfoJsonAddQuote.replace("'", "\"")
        logging.debug("photoInfoJsonDoubleQuote=%s", photoInfoJsonDoubleQuote)

        # Remove the comma before the end of a list
        afterRemoveLastCommaInList = re.sub(r",\s*?]", "]", photoInfoJsonDoubleQuote)
        logging.debug("afterRemoveLastCommaInList=%s", afterRemoveLastCommaInList)

        photoInfoDict = json.loads(afterRemoveLastCommaInList)

to process the page:

http://www.yupoo.com/photos/314159subin/87556247/

whose HTML contains:

{id:'34393-87556247',owner:'34393',ownername:'314159subin',title:'雪舞飞扬',description:'1月3日新年伊始,一场大雪让杭城\x26quot;雪舞飞扬\x26quot;.',bucket:'314159subin',key:'CxsKKLUc',license:0,stats_notes: 0,albums: ['34393-1693900',],tags:[{name:'城事', author: '34393'},{name:'西湖', author: '34393'},{name:'纪实', author: '34393'},{name:'街头', author: '34393'},{name:'城市', author: '34393'},{name:'杭州', author: '34393'},{name:'动物', author: '34393'},{name:'下雪喽', author: '34393'},{name:'雪景', author: '34393'}],owner:{id: 34393,username: '314159subin',nickname: '三点一四一屋酒'}}

The result is an error:

    photoInfoDict = json.loads(afterRemoveLastCommaInList);
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\tmp\dev_install_root\Python27_x64\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 145 (char 145)

 

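【Solution process】

1. Clearly, the \xXX escape is not supported: JSON string escapes are limited to \", \\, \/, \b, \f, \n, \r, \t and \uXXXX, so \x26 is rejected.

So the string has to be preprocessed before calling json.loads.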
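The failure can be reproduced in isolation. The following is a minimal sketch, not the original script: the sample strings and the helper name try_parse are made up for illustration. It shows that json.loads rejects \x26 but accepts the equivalent \u0026.

```python
import json

# Raw string literals keep the backslash, so the JSON text really contains \x26.
bad_json = r'{"description": "\x26quot;snow\x26quot;"}'
# Same content, written with the \uXXXX escape that JSON does allow.
good_json = r'{"description": "\u0026quot;snow\u0026quot;"}'


def try_parse(text):
    """Hypothetical helper: return (value, None) on success, (None, message) on failure."""
    try:
        return json.loads(text), None
    except ValueError as err:  # on Python 3, json.JSONDecodeError is a ValueError subclass
        return None, str(err)


bad_value, bad_error = try_parse(bad_json)
good_value, good_error = try_parse(good_json)

print(bad_error)    # an "Invalid \escape" message; the reported position depends on the input
print(good_value)
```

The only difference between the two inputs is the escape syntax, which is why rewriting the escape before parsing is enough.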

2. Reference:

How to convert this string into json format using json.loads

Then I worked out my own code to handle it.

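3. It then turned out that, in this case, it is only

\x26XXX;

that needs to become:

&XXX;

(\x26 is the hex escape for &), which is simply an HTML entity.

So the following code handles it correctly: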

        #http://www.yupoo.com/photos/314159subin/87556247/
        #{"id":"34393-87556247","owner":"34393","ownername":"314159subin","title":"雪舞飞扬","description":"1月3日新年伊始,一场大雪让杭城\x26quot;雪舞飞扬\x26quot;.","bucket":"314159subin","key":"CxsKKLUc","license":0,"stats_notes": 0,"albums": ["34393-1693900"],"tags":[{"name":"城事", "author": "34393"},{"name":"西湖", "author": "34393"},{"name":"纪实", "author": "34393"},{"name":"街头", "author": "34393"},{"name":"城市", "author": "34393"},{"name":"杭州", "author": "34393"},{"name":"动物", "author": "34393"},{"name":"下雪喽", "author": "34393"},{"name":"雪景", "author": "34393"}],"owner":{"id": 34393,"username": "314159subin","nickname": "三点一四一屋酒"}}
        
        # Turn \x26name; back into the HTML entity &name;
        replacedQuoteJson = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", afterRemoveLastCommaInList)
        logging.info("replacedQuoteJson=%s", replacedQuoteJson)

        photoInfoDict = json.loads(replacedQuoteJson)
        logging.info("photoInfoDict=%s", photoInfoDict)
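Putting the steps together, the whole cleanup can be sketched as one function. This is only an illustration under assumptions: the function name parse_loose_js_object and the shortened ASCII sample are hypothetical, and the key-quoting regex would misfire on string values that themselves contain a colon (for example a URL).

```python
import json
import re


def parse_loose_js_object(text):
    """Hypothetical helper: convert a JavaScript-style object literal to a dict."""
    # 1. Quote the bare keys: id: becomes 'id':
    quoted_keys = re.sub(r"(,?)(\w+?)\s*?:", r"\1'\2':", text)
    # 2. JSON only accepts double-quoted strings
    double_quoted = quoted_keys.replace("'", '"')
    # 3. Remove the comma before the end of a list
    no_trailing_comma = re.sub(r",\s*?]", "]", double_quoted)
    # 4. Turn \x26name; back into the HTML entity &name;
    entities_restored = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", no_trailing_comma)
    return json.loads(entities_restored)


# Shortened stand-in for the page data, not the real payload
sample = (r"{id:'34393-87556247',title:'snow',"
          r"description:'a \x26quot;b\x26quot;.',license:0,"
          r"albums: ['34393-1693900',],"
          r"owner:{id: 34393,username: '314159subin'}}")

photo_info = parse_loose_js_object(sample)
print(photo_info["description"])
```

The order matters here: the \x26 replacement has to happen before json.loads, but it could equally be done before the quoting steps, since none of them touch backslashes.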

 

【Summary】

When json.loads fails with

ValueError: Invalid \escape

look for

\xXX sequences

in the input string and convert them into the content you actually need before parsing.
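If the input may contain hex escapes other than \x26, a more general option is to rewrite every \xHH as \u00HH, which JSON does accept. This is a sketch of an alternative, not the approach used in this post; the helper name and sample are hypothetical, and it assumes the input never contains an escaped backslash directly followed by xHH (such as \\x26), which this simple regex would misread.

```python
import json
import re

# Matches a backslash, the letter x, and exactly two hex digits
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def hex_escapes_to_unicode_escapes(text):
    """Hypothetical helper: rewrite each \\xHH escape as the JSON-legal \\u00HH."""
    return _HEX_ESCAPE.sub(lambda m: "\\u00" + m.group(1), text)


# Made-up sample with two different hex escapes (\x26 is &, \x2f is /)
raw = r'{"description": "\x26quot;snow\x26quot;", "sep": "a\x2fb"}'
converted = hex_escapes_to_unicode_escapes(raw)
data = json.loads(converted)
print(data)
```

With this variant the JSON parser itself decodes the characters, so no per-character special case is needed.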

 

For example, in this case:

replacedQuoteJson = re.sub(r"\\x26([a-zA-Z]{2,6});", r"&\1;", inputJson)

turns:

一场大雪让杭城\x26quot;雪舞飞扬\x26quot;

into:

一场大雪让杭城&quot;雪舞飞扬&quot;

after which json.loads can parse the string correctly.
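Note that the regex above only restores the entity text, so the parsed string still contains &quot; rather than a quote character. If the decoded text is wanted, one possible follow-up step is to unescape the entities after parsing. The sketch below assumes Python 3.4 or later (for html.unescape) and uses a made-up sample.

```python
import html
import json

# Decode entities only after json.loads, so the literal quote
# characters cannot break the JSON string syntax.
parsed = json.loads('{"description": "&quot;snow&quot;"}')
decoded = html.unescape(parsed["description"])
print(decoded)
```

Unescaping before json.loads would reintroduce bare double quotes inside the string and make the JSON invalid again.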
