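I am trying to build a word cloud image by passing a .txt file to the program, iterating over that file, stripping out any punctuation and uninteresting filler words, and then passing the processed dictionary to the external wordcloud module. I have set file_contents to a list of words by splitting it. I then iterate over the list to replace any punctuation with an empty string, create a dictionary, iterate over the list again, and store the words in that dictionary. Once the results are in the dictionary, they are checked against the list of filler words; any match is replaced with an empty string, and the dict is returned. I have tried everything I can think of but still cannot find where my problem is.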
def calculate_frequencies(file_contents):
    # Here is a list of punctuations and uninteresting words you can use to process your text
    punctuations = '''!()-[]{};:'"\,<>./?@#$%^&*_~'''
    uninteresting_words = ["the", "a", "to", "if", "is", "it", "of", "and", "or", "an", "as", "i", "me", "my", \
    "we", "our", "ours", "you", "your", "yours", "he", "she", "him", "his", "her", "hers", "its", "they", "them", \
    "their", "what", "which", "who", "whom", "this", "that", "am", "are", "was", "were", "be", "been", "being", \
    "have", "has", "had", "do", "does", "did", "but", "at", "by", "with", "from", "here", "when", "where", "how", \
    "all", "any", "both", "each", "few", "more", "some", "such", "no", "nor", "too", "very", "can", "will", "just"]

    # LEARNER CODE START HERE
    words = [file_contents.split()]
    frequencies = {}
    for word in words:
        if punctuations in words:
            words.replace(punctuations, "")
            frequencies = {word +1}
            if uninteresting_words in frequencies:
                frequencies.replace(uninteresting_words, "")
                return frequencies
    return words

    #wordcloud
    cloud = wordcloud.WordCloud()
    cloud.generate_from_frequencies(frequencies)
    return cloud.to_array()

Any hints and pointers would be helpful, thanks.
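A few of the checks in the snippet above can be tried in isolation to see what they actually evaluate to. The following is only a small sketch on a made-up string (`sample` is an assumption for illustration and does not come from the original post):

```python
# Hypothetical sample text, used only for illustration.
sample = "Hello, world! Hello again."
punctuations = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''

# Wrapping split() in brackets produces a list that contains one list,
# so a for-loop over it sees a single item (the inner list), not the words.
nested = [sample.split()]
flat = sample.split()

# `punctuations in flat` asks whether the whole punctuation string is one
# of the list's elements; it does not test each character separately.
whole_string_found = punctuations in flat

# Testing character by character does detect punctuation inside a word.
has_punct = any(ch in punctuations for ch in flat[0])

# Lists have no replace() method; only the individual strings do.
list_has_replace = hasattr(flat, "replace")

print(len(nested), len(flat))
print(whole_string_found, has_punct, list_has_replace)
```

Here `nested` has one element while `flat` has four, the membership test on the whole punctuation string is false, and calling `replace` on the list itself would raise an `AttributeError`.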
Posted on 2021-03-31 08:31:35
I put together an example. It is not a perfect solution, but I think it does what you are asking for. I have added comments as explanation.
def calculate_frequencies(file_contents):
    # Here is a list of punctuations and uninteresting words you can use to process your text
    punctuations = '''!()-[]{};:'"\,<>./?@#$%^&*_~'''
    uninteresting_words = ["the", "a", "to", "if", "is", "it", "of", "and", "or", "an", "as", "i", "me", "my", \
    "we", "our", "ours", "you", "your", "yours", "he", "she", "him", "his", "her", "hers", "its", "they", "them", \
    "their", "what", "which", "who", "whom", "this", "that", "am", "are", "was", "were", "be", "been", "being", \
    "have", "has", "had", "do", "does", "did", "but", "at", "by", "with", "from", "here", "when", "where", "how", \
    "all", "any", "both", "each", "few", "more", "some", "such", "no", "nor", "too", "very", "can", "will", "just"]

    # LEARNER CODE START HERE
    # words = file_contents.split()
    # split the file contents and convert all the words into lower case
    words = [x.lower() for x in file_contents.split()]
    # init a 'frequencies' dictionary
    frequencies = {}
    # loop through each word in the 'words' list
    for i, word in enumerate(words):
        # we only care when the word isn't in the uninteresting_words list
        if word not in uninteresting_words:
            # loop through each punctuation
            for punc in punctuations:
                # replacing the punctuation with ''
                words[i] = words[i].replace(punc, '')
            # add the word to 'frequencies'/increase the value stored
            if words[i] in frequencies:
                frequencies[words[i]] += 1
            else:
                frequencies[words[i]] = 1
    return frequencies

print(calculate_frequencies("I am repeating repeating repeating."))

https://stackoverflow.com/questions/66879134
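One limitation of the example above is that the stop-word check runs before punctuation is removed, so a token such as "the," would still be counted, and a token made only of punctuation would be stored as an empty string. A possible variant, sketched here under a few assumptions, strips punctuation first. The function name `count_words`, the shortened `UNINTERESTING` set, and the use of `string.punctuation` (which covers slightly more characters than the list in the post) are choices made for this illustration and are not from the original answer.

```python
import string
from collections import Counter

# Shortened stop-word set for illustration; the full list from the post
# could be passed in instead.
UNINTERESTING = {"the", "a", "to", "is", "it", "of", "and", "i", "am"}

def count_words(text, stop_words=UNINTERESTING):
    """Return a dict mapping each interesting word to its count."""
    # Translation table that deletes every ASCII punctuation character.
    strip_table = str.maketrans("", "", string.punctuation)
    counts = Counter()
    for raw in text.lower().split():
        word = raw.translate(strip_table)
        # Skip tokens that were only punctuation, and skip stop words.
        if word and word not in stop_words:
            counts[word] += 1
    return dict(counts)

print(count_words("The cat, the hat; and -- the cat!"))
# prints {'cat': 2, 'hat': 1}
```

The returned dictionary has the same shape as `frequencies` in the question, so it could be handed to `generate_from_frequencies` in the same way.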