I saw someone else's code that used a set to hold a collection of strings:
<code>gVocabularyList = set(line.strip() for line in open("vocabulary.txt")) </code>
rather than a list.
So I went to look up the difference between the two:
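The one-liner quoted above never closes the file it opens. Here is a minimal sketch of the same idea using `with`, assuming Python 3. The helper name `load_vocabulary` and the sample words are made up for illustration, and a temporary file stands in for `vocabulary.txt`.

```python
import os
import tempfile


def load_vocabulary(path):
    """Read one word per line into a set, skipping blank lines."""
    with open(path, encoding="utf-8") as vocab_file:
        # Duplicate lines collapse into a single set element.
        return {line.strip() for line in vocab_file if line.strip()}


# Hypothetical sample data standing in for vocabulary.txt
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("apple\nbanana\napple\n\ncherry\n")
    sample_path = tmp.name

vocabulary = load_vocabulary(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(sorted(vocabulary))
```

The set comprehension does the same job as `set(...)` around a generator. It is only a matter of taste.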
python set vs list
performance – Python Sets vs Lists – Stack Overflow
8.7. sets — Unordered collections of unique elements — Python 2.7.15 documentation
Python Programming/Tuples and Sets – Wikiversity
Then I needed to figure out:
how to initialize a set
how to add elements to a set
how to get the number of elements in a set
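A short sketch covering those three questions with built-in set operations only. The variable names and sample words are just for illustration.

```python
# An empty set must be created with set(); {} creates an empty dict.
words = set()

# A set can also be built from any iterable; duplicates collapse.
words_from_list = set(["apple", "banana", "apple"])

# A set literal works for non-empty sets.
words_literal = {"apple", "banana"}

# add() inserts one element; adding an existing element changes nothing.
words.add("apple")
words.add("apple")
words.add("cherry")

# update() adds every element of an iterable.
words.update(["banana", "cherry"])

# len() returns the number of distinct elements.
word_count = len(words)
print(word_count)
```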
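Characteristics of a set:
unordered
elements are unique
Suitable for:
membership testing (checking whether an element exists)
removing duplicates
common mathematical operations:
intersection
union
difference
symmetric difference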
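The four mathematical operations listed above, plus duplicate removal, can be sketched like this. The sample data is made up. Each operator has an equivalent method, e.g. `a.intersection(b)`.

```python
a = {"apple", "banana", "cherry"}
b = {"banana", "cherry", "date"}

common = a & b         # intersection: elements in both sets
either = a | b         # union: elements in at least one set
only_a = a - b         # difference: elements in a but not in b
exactly_one = a ^ b    # symmetric difference: elements in exactly one set

# Removing duplicates from a list (the original order is not preserved)
unique_words = set(["apple", "apple", "banana"])

print(sorted(common), sorted(either), sorted(only_a), sorted(exactly_one))
```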
Like other collections, a set supports the following operations:
x in set
len(set)
for x in set
It does not support indexing or slicing.
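A small sketch of these operations, and of what happens when you try to index a set. The sample vocabulary is invented for illustration.

```python
vocabulary = {"apple", "banana", "cherry"}

has_apple = "apple" in vocabulary      # membership test
has_durian = "durian" in vocabulary
size = len(vocabulary)                 # number of elements

# Iteration visits every element once, in no guaranteed order.
visited = [word for word in vocabulary]

# Indexing a set raises TypeError because a set has no positions.
try:
    vocabulary[0]
    indexing_supported = True
except TypeError:
    indexing_supported = False

# To index or slice, convert to a list first, e.g. a sorted copy.
first_two = sorted(vocabulary)[:2]
print(has_apple, has_durian, size, indexing_supported, first_two)
```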
A set is therefore a good fit for the vocabulary use case here.
python set initialization
每天学点Python之set · Python大法好 · 看云
python 集合set的创建,更改,遍历,元算合并,交集,补集 – CSDN博客
[Summary]
Then I wrote the code:
<code>gVocabularySet = set()</code>
<code># Excerpt from inside a function: reject words that are not in the vocabulary
if stripedLowerWord not in gVocabularySet:
    saveFilterOut(stripedLowerWord, sentence)
    return False</code>
<code>import logging

def initgVocabularySet(connection):
    """init vocabulary set"""
    global gVocabularySet
    # NEW: get vocabulary from mysql 'thesaurus' table
    getVocabularySql = "SELECT * FROM `%s`" % (VocabularyTableName)
    logging.info("getVocabularySql=%s", getVocabularySql)
    getVocabularyOk, resultDict = connection.executeSql(getVocabularySql)
    logging.info("getVocabularyOk=%s, resultDict=%s", getVocabularyOk, resultDict)
    if getVocabularyOk:
        vocabularyRecordList = resultDict["data"]
        for eachRecord in vocabularyRecordList:
            wordName = eachRecord["name"]
            wordName = wordName.lower()
            # gVocabularyList.append(wordName)
            gVocabularySet.add(wordName)
        logging.info("gVocabularySet=%s", gVocabularySet)
        vocabularySetLen = len(gVocabularySet)
        logging.info("vocabularySetLen=%s", vocabularySetLen)
    else:
        logging.error("Get vocabulary failed of sql: %s", getVocabularySql)</code>
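In summary:
set
suited to testing whether an element is in the collection and to mathematical set operations
does not support indexing or slicing
list
an ordinary array
supports indexing and slicing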
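To make the trade-off concrete, here is a rough sketch that times a membership test on a list and on a set holding the same words. The data, sizes and variable names are assumptions for illustration, and the timings will differ from machine to machine. In general, `in` on a list scans elements one by one, while a set uses hashing, so the set lookup is usually much faster for large collections.

```python
import timeit

# Hypothetical vocabulary of 10000 distinct words
words = ["word%d" % i for i in range(10000)]
word_list = list(words)
word_set = set(words)

target = "word9999"  # last element: the slowest case for the list scan

list_seconds = timeit.timeit(lambda: target in word_list, number=1000)
set_seconds = timeit.timeit(lambda: target in word_set, number=1000)
print("list: %.6f s, set: %.6f s" % (list_seconds, set_seconds))

# A list keeps insertion order and supports indexing and slicing.
first_word = word_list[0]
last_two = word_list[-2:]
print(first_word, last_two)
```

If both fast lookups and a stable order are needed, keeping the list and building a set from it just for the membership checks is one simple option.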
Please credit the source when reposting: 在路上 » [Solved] Replacing a list with a set in Python