Q:
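How do I use Python regular expressions to search for Chinese idioms by pattern? (CSDN forum)
"Problem: use Python regular expressions to search for Chinese idioms (成语) that follow a given pattern.
Search source: a Chinese idiom dictionary
Search targets: idioms of several special patterns, for example:
(1) xxyy pattern: 高高兴兴, 快快乐乐
(2) numeral pattern: 三心二意, 一泻千里
(3) animal pattern: 鸡鸣狗盗, 狐假虎威
(4) ……
First prepare the idiom file, then search it with regular expressions. Print the results to the screen and also save them to a result file."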
A:
Here is the complete code:
# Function:
#   How to use Python regular expressions to search Chinese idioms by pattern (CSDN forum)
#   https://bbs.csdn.net/topics/396860414
# Author: Crifan
# Update: 20200619

import re

seperator = "-"

# Sample idioms, one per line
idiomStr = """高高兴兴
快快乐乐
快乐至上
欢欢喜喜
欢天喜地
一心一意
三心二意
一泻千里
三番五次
一鼓作气
以一敌万
鸡鸣狗盗
狐假虎威
兔死狐悲
狗急跳墙
"""

def printSeperatorLine(curTitle):
    """Print a separator line with the given title in the middle."""
    print("%s %s %s" % (seperator*30, curTitle, seperator*30))

def printEachMatchGroup(someIter):
    """Print the full text of every match, numbered from 1."""
    for curIdx, eachMatch in enumerate(someIter):
        curNum = curIdx + 1
        eachMatchWholeStr = eachMatch.group(0)
        print("[%d] %s" % (curNum, eachMatchWholeStr))

# xxyy pattern: one character repeated, then another character repeated
printSeperatorLine("xxyy模式成语")
# Raw string, so that \S, \1 and \2 reach the regex engine unchanged
xxyyP = r"(\S)\1(\S)\2"
foundAllXxyyIter = re.finditer(xxyyP, idiomStr)
printEachMatchGroup(foundAllXxyyIter)

# Numeral pattern: a numeral in the first and third positions
printSeperatorLine("数字模式成语")
# refer:
# 个,十,百,千,万……兆 后面是什么?-作业-慧海网
# https://www.ajpsp.com/zuoye/4174539
zhcnDigitList = [
    "一", "二", "三", "四", "五", "六", "七", "八", "九", "十",
    "百", "千", "万", "亿",
    "兆", "京", "垓", "秭", "穰", "沟", "涧", "正", "载",
]
zhcnDigitListGroup = "|".join(zhcnDigitList)
zhcnDigitP = r"(%s)\S(%s)\S" % (zhcnDigitListGroup, zhcnDigitListGroup)
zhcnDigitIter = re.finditer(zhcnDigitP, idiomStr)
printEachMatchGroup(zhcnDigitIter)

# Animal pattern: an animal name in the first and third positions
printSeperatorLine("动物模式成语")
animalList = [
    "鸡", "鸭",
    "猫", "狗", "猪",
    "兔", "狐", "狼",
    "虎", "豹", "狮",
    # TODO: add more common animals
]
animalGroup = "|".join(animalList)
animalP = r"(%s)\S(%s)\S" % (animalGroup, animalGroup)
animalIter = re.finditer(animalP, idiomStr)
printEachMatchGroup(animalIter)

# Appendix:
# Official documentation for the Python re module:
#   re: Regular expression operations (Python 3.8.3 documentation)
#   https://docs.python.org/3/library/re.html#re.finditer
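A note on the backreferences in the xxyy pattern: in an ordinary (non-raw) Python string literal, "\1" is an octal escape for the control character U+0001, so the regex engine never sees a backreference. Either doubling the backslash ("\\1") or using a raw string (r"\1") avoids this. A minimal sketch of the difference, where the sample text is made up for illustration only:

```python
import re

# Illustrative sample text, not the full idiom list
text = "高高兴兴 快乐至上"

# "\1" in a normal string literal is the character U+0001, so this
# pattern looks for "any character followed by U+0001" and finds nothing.
broken = re.findall("(.)\1", text)

# Doubling the backslash, or using a raw string, passes \1 through to the
# regex engine, where it means "the same text that group 1 matched".
escaped = [m.group(0) for m in re.finditer("(.)\\1", text)]
raw = [m.group(0) for m in re.finditer(r"(.)\1", text)]

print(broken)
print(escaped)
print(raw)
```

The raw-string form is usually the easier one to read, since the pattern appears exactly as the regex engine receives it.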
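Notes:
- About putting all the idioms in a file
- Just move the contents of idiomStr into a file yourself, then read the contents back in from that file
Output: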
------------------------------ xxyy模式成语 ------------------------------
[1] 高高兴兴
[2] 快快乐乐
[3] 欢欢喜喜
------------------------------ 数字模式成语 ------------------------------
[1] 一心一意
[2] 三心二意
[3] 一泻千里
[4] 三番五次
------------------------------ 动物模式成语 ------------------------------
[1] 鸡鸣狗盗
[2] 狐假虎威
[3] 兔死狐悲
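The numeral pattern only matches idioms whose numerals sit in the first and third positions, which is why 以一敌万 (numerals in the second and fourth positions) is missing from the output. If such idioms should also be found, one possible variant checks each idiom for at least two numeral characters in any position. This is a sketch under that assumption about the intent, not part of the original answer; `idiomList`, `zhcnDigitClass` and `twoDigitP` are names introduced here, and the numeral set is shortened for illustration:

```python
import re

# Illustrative sample; in practice this would be one idiom per line of the file
idiomList = ["一心一意", "一泻千里", "一鼓作气", "以一敌万", "狐假虎威"]

# Shortened numeral set for illustration only
zhcnDigitClass = "[一二三四五六七八九十百千万]"

# Exactly four non-space characters, of which at least two are numerals.
# The lookahead checks for two numerals anywhere; \S{4} consumes the idiom.
twoDigitP = re.compile(r"(?=\S*%s\S*%s)\S{4}" % (zhcnDigitClass, zhcnDigitClass))

foundList = [idiom for idiom in idiomList if twoDigitP.fullmatch(idiom)]
print(foundList)
```

Here 一鼓作气 is skipped because it contains only one numeral, while 以一敌万 is now included.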
With this, regular expressions can be used to match idioms that follow the given patterns.
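The question also asks for the idioms to be read from a file and the results saved to a result file, which the answer leaves to the reader. A minimal sketch of one way to do that is given here. The helper `searchIdiomFile`, the file names `idioms.txt` and `result.txt`, and the UTF-8 encoding are all assumptions made for illustration; the temporary directory only exists so that the example is self-contained:

```python
import re
import tempfile
from pathlib import Path

# Hypothetical helper, not part of the original script
def searchIdiomFile(idiomPath, resultPath, patternDict):
    """Search the idiom file with each named pattern; print and save the matches."""
    idiomStr = Path(idiomPath).read_text(encoding="utf-8")
    resultLines = []
    for curTitle, curPattern in patternDict.items():
        resultLines.append("%s %s %s" % ("-" * 30, curTitle, "-" * 30))
        for curIdx, eachMatch in enumerate(re.finditer(curPattern, idiomStr)):
            resultLines.append("[%d] %s" % (curIdx + 1, eachMatch.group(0)))
    resultStr = "\n".join(resultLines) + "\n"
    # Show the results on screen and write the same text to the result file
    print(resultStr, end="")
    Path(resultPath).write_text(resultStr, encoding="utf-8")
    return resultLines

# Demo in a throwaway directory; the file names are placeholders
with tempfile.TemporaryDirectory() as tmpDir:
    idiomPath = Path(tmpDir) / "idioms.txt"
    resultPath = Path(tmpDir) / "result.txt"
    idiomPath.write_text("高高兴兴\n快乐至上\n欢欢喜喜\n", encoding="utf-8")
    foundLines = searchIdiomFile(idiomPath, resultPath, {"xxyy": r"(\S)\1(\S)\2"})
    savedStr = resultPath.read_text(encoding="utf-8")
```

With a real idiom dictionary, the two paths would point at the prepared idiom file and the desired result file, and `patternDict` would hold one entry per pattern of interest.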
Please credit the source when reposting: 在路上 » [Solved] Using Python's re regular expressions to match idioms of specific patterns