
Finding matches for the records of one file in another file

Stack Overflow user
Asked on 2018-12-26 08:20:54
Answers: 1 · Views: 251 · Followers: 0 · Votes: 0

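I have one file containing words and another "dictionary" file containing definitions. I want to find the definition of each word in the dictionary and write it to a file.

I looked around here and saw an answer that uses Unix/Linux commands, but I am on Windows, so I decided to solve it in Python. I came up with a working solution, but I would like to know whether there is a better way.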

import re

# Read both input files up front and close them right away.
with open('D:/Oxford_English_Dictionary-orig.txt', 'r') as dict_file:
    definitions = dict_file.readlines()
with open('D:/Words.txt', 'r') as word_file:
    words = word_file.readlines()

with open('D:/words_and_definitions.txt', 'w') as fo:
    for word in words:
        found = False
        # The trailing space keeps "Ambit" from matching "Ambition".
        word = word.strip() + ' '
        for definition in definitions:
            # re.escape guards against regex metacharacters in the word.
            if re.match(re.escape(word), definition):
                fo.write(definition)
                found = True
                break
        if not found:
            fo.write(word + ' ****************no definition' + '\n')
print("all done")

word_file is not sorted alphabetically; dict_file is sorted alphabetically.
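
Since dict_file is sorted, one direction that could replace the inner linear scan is a binary search over the headwords. The following is only a minimal sketch using the standard `bisect` module. The helper names `build_index` and `lookup` are made up for illustration, and it assumes the file's order agrees with plain case-folded string comparison of the first token on each line, which may not hold for every dictionary.

```python
import bisect

def build_index(lines):
    """Split each dictionary line into a case-folded headword and the full line."""
    heads, entries = [], []
    for line in lines:
        parts = line.split(None, 1)  # first whitespace-delimited token is the headword
        if parts:
            heads.append(parts[0].casefold())
            entries.append(line.rstrip("\n"))
    return heads, entries

def lookup(word, heads, entries):
    """Binary-search the (assumed sorted) headword list; return the line or None."""
    key = word.strip().casefold()
    i = bisect.bisect_left(heads, key)
    if i < len(heads) and heads[i] == key:
        return entries[i]
    return None

# A few lines taken from the dict_file sample, shortened for illustration.
sample = [
    "Ambit  n. Scope, extent, or bounds.",
    "Ambition  n. 1 determination to succeed.",
    "Ambivalent adj. having opposing feelings, undecided",
    "Amble  v. (-ling) move at an easy pace.",
]
heads, entries = build_index(sample)
print(lookup("Ambivalent", heads, entries))
print(lookup("Inane", heads, entries))
```

Each lookup then takes a logarithmic number of comparisons instead of scanning every definition line, at the cost of relying on the sort order.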

Sample from word_file

Inane
Relevant
Impetuous
Ambivalent
Dejected
Postmortem
Incriminate

Sample from dict_file

Ambiguity -n. the condition of admitting of two or more meanings, of being understood in more than one way, or of referring to two or more things at the same time 
Ambiguous  adj. 1 having an obscure or double meaning. 2 difficult to classify.  ambiguity n. (pl. -ies). [latin ambi- both ways, ago drive]
Ambit  n. Scope, extent, or bounds. [latin: related to *ambience]
Ambition  n. 1 determination to succeed. 2 object of this. [latin, = canvassing: related to *ambience]
Ambitious  adj. 1 full of ambition or high aims. 2 (foll. By of, or to + infin.) Strongly determined.
Ambivalence  n. Coexistence of opposing feelings.  ambivalent adj. [latin ambo both, *equivalent]
Ambivalent adj. having opposing feelings, undecided
Amble  —v. (-ling) move at an easy pace. —n. Such a pace. [latin ambulo walk]

1 Answer

Stack Overflow user

Accepted answer

Posted on 2018-12-31 02:35:04

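Have you tried using a Python dict to look up the definitions? Of course, if your definitions file is too large you may run into memory problems, but in your case it is probably sufficient. This gives a simple solution: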

import re

definition_finder = re.compile(r'^(\w+)\s+(.*)$')

with open('Oxford_English_Dictionary-orig.txt') as dict_file:
    definitions = {}
    for line in dict_file:
        definition_found = definition_finder.match(line)
        if definition_found:
            definitions[definition_found.group(1)] = definition_found.group(2)

with open('Words.txt') as word_file:
    with open('words_and_definitions.txt', 'w') as fo:
        input_lines = (line.strip("\n") for line in word_file)
        for line in input_lines:
            fo.write(f"{line} {definitions.get(line, '****************no definition')}\n")

You can build the definitions dict in a more concise way. That gives:

import re

definition_finder = re.compile(r'^(\w+)\s+(.*)$')

with open('Oxford_English_Dictionary-orig.txt') as dict_file:
    definitions_found = (definition_finder.match(line) for line in dict_file) 
    definitions = dict(definition_found.groups() for definition_found
                       in definitions_found if definition_found)

with open('Words.txt') as word_file:
    with open('words_and_definitions.txt', 'w') as fo:
        input_lines = (line.strip("\n") for line in word_file)
        for line in input_lines:
            fo.write(f"{line} {definitions.get(line, '****************no definition')}\n")
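
To see what the pattern above captures, here is a small illustrative check on a few lines. The first two come from the question's dict_file sample; the hyphenated "Post-mortem" line is a made-up entry, added only to show that `\w+` followed by `\s+` does not match a headword containing a hyphen, so such a line would be skipped.

```python
import re

definition_finder = re.compile(r'^(\w+)\s+(.*)$')

samples = [
    "Ambit  n. Scope, extent, or bounds. [latin: related to *ambience]\n",
    "Ambivalent adj. having opposing feelings, undecided\n",
    "Post-mortem n. hypothetical hyphenated entry\n",  # invented for illustration
]

parsed = {}
skipped = []
for line in samples:
    found = definition_finder.match(line)
    if found:
        # group(1) is the headword, group(2) is the rest of the line
        parsed[found.group(1)] = found.group(2)
    else:
        skipped.append(line.strip())

print(parsed)
print(skipped)
```

If the real dictionary file contains hyphenated or multi-word headwords, the pattern would need to be adapted to them.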

If your definitions file really is too large, you could consider using a database, for example through the sqlite3 module.
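
A rough sketch of what that could look like, using only the standard-library `sqlite3` module. The table name `definitions`, its columns, and the helper functions `load_definitions` and `find_definition` are assumptions chosen for illustration, and the in-memory database stands in for a file on disk.

```python
import re
import sqlite3

definition_finder = re.compile(r'^(\w+)\s+(.*)$')

def load_definitions(conn, lines):
    """Parse dictionary lines and store (word, definition) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS definitions "
        "(word TEXT PRIMARY KEY, definition TEXT)"
    )
    matches = (definition_finder.match(line) for line in lines)
    rows = (m.groups() for m in matches if m)
    # OR REPLACE mirrors the dict behaviour: a later duplicate headword wins.
    conn.executemany("INSERT OR REPLACE INTO definitions VALUES (?, ?)", rows)
    conn.commit()

def find_definition(conn, word):
    """Return the stored definition for word, or None if there is none."""
    row = conn.execute(
        "SELECT definition FROM definitions WHERE word = ?", (word,)
    ).fetchone()
    return row[0] if row else None

# ":memory:" keeps the example self-contained; pass a file path to persist.
conn = sqlite3.connect(":memory:")
load_definitions(conn, [
    "Ambit  n. Scope, extent, or bounds.\n",
    "Ambivalent adj. having opposing feelings, undecided\n",
])
print(find_definition(conn, "Ambit"))
print(find_definition(conn, "Inane"))
```

In the real script, `load_definitions` could be fed the open dictionary file directly, since it only iterates over the lines, and the lookup loop would call `find_definition` in place of `definitions.get`.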

Votes: 1
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/53926482
