2.2.2. Replacing numeric entities with characters: repUniNumEntToChar
#------------------------------------------------------------------------------
# Replace each numeric character reference &#N; (N is one or more decimal
# digits) with the corresponding Unicode character.
# e.g. replace "&#39;" with "'" in "Creepin&#39; up on you"
# Note: unichr() exists only in Python 2.
import re

def repUniNumEntToChar(text):
    unicodeP = re.compile(r'&#[0-9]+;')
    def transToUniChr(match): # translate the matched string to a Unicode character
        numStr = match.group(0)[2:-1] # strip the leading '&#' and trailing ';'
        num = int(numStr)
        unicodeChar = unichr(num)
        return unicodeChar
    return unicodeP.sub(transToUniChr, text)
Example 2.6. Using repUniNumEntToChar
infoDict['title'] = repUniNumEntToChar(infoDict['title'])
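The function above relies on unichr, which is only available in Python 2. For readers on Python 3, the following is a minimal sketch of the same idea, not part of the original library. The names NUM_ENT_PATTERN and rep_uni_num_ent_to_char_py3 are hypothetical and chosen only for illustration. Leaving out-of-range numbers untouched is also an assumption, since the original does not say how such input should be handled.

```python
import re

# Matches decimal numeric character references such as &#39; or &#8217;
# (hypothetical name, not from the original library)
NUM_ENT_PATTERN = re.compile(r'&#([0-9]+);')

def rep_uni_num_ent_to_char_py3(text):
    """Hypothetical Python 3 counterpart of repUniNumEntToChar."""
    def to_char(match):
        try:
            # group(1) holds only the digits between '&#' and ';'
            return chr(int(match.group(1)))
        except (ValueError, OverflowError):
            # Assumption: a number that is not a valid code point is left as-is
            return match.group(0)
    return NUM_ENT_PATTERN.sub(to_char, text)

print(rep_uni_num_ent_to_char_py3("Creepin&#39; up on you"))  # prints: Creepin' up on you
```

If named entities (such as &amp;) or hexadecimal references also need decoding, the Python 3 standard library function html.unescape covers those cases as well and may be a simpler choice.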