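I have been developing a feature that uses the following snippet to get synonyms from a dataset or sample. I want to print each concept and its associated synonyms, but the actual output is printed twice, which makes it hard to read.
Code: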
    for rt in self.raw_tokens:
        concept = None
        if rt.startswith('<'):
            # if it's a concept (entity):
            if 'value' in ElementTree.fromstring(rt).attrib:
                string_token = ElementTree.fromstring(rt).text
                concept = ElementTree.fromstring(rt).tag
            # if it's a synonym, just use the text and strip the tag from the sample:
            else:
                string_token = ElementTree.fromstring(rt).text
            #print('Token: [' + rt + ']')
            print("Concepts & Synonyms are present in the sample :<%s> = %s" % (ElementTree.fromstring(rt).tag, string_token))
            #print('CONCEPT: ' + concept)
        else:
            string_token = rt
        toks.append((concept, string_token))

Expected output:
Concepts & Synonyms are present in the sample :<<I_LOVE>I like</I_LOVE>> = I like
Concepts & Synonyms are present in the sample :<<I_WANTS>NEED</I_WANTS>> = NEED
Concepts & Synonyms are present in the sample :<<NEW_YORK>NEW YORK</NEW_YORK>> = NEW YORK
Concepts & Synonyms are present in the sample :<<I_WANTS>wish</I_WANTS>> = wish
Concepts & Synonyms are present in the sample :<<NEW_YORK>BIG APPLE</NEW_YORK>> = BIG APPLE

Current output:
Concepts & Synonyms are present in the sample :<<I_LOVE>I like</I_LOVE>> = I like
Concepts & Synonyms are present in the sample :<<I_WANTS>NEED</I_WANTS>> = NEED
Concepts & Synonyms are present in the sample :<<NEW_YORK>NEW YORK</NEW_YORK>> = NEW YORK
Concepts & Synonyms are present in the sample :<<I_WANTS>wish</I_WANTS>> = wish
Concepts & Synonyms are present in the sample :<<NEW_YORK>BIG APPLE</NEW_YORK>> = BIG APPLE
Concepts & Synonyms are present in the sample :<<I_LOVE>I like</I_LOVE>> = I like
Concepts & Synonyms are present in the sample :<<I_WANTS>NEED</I_WANTS>> = NEED
Concepts & Synonyms are present in the sample :<<NEW_YORK>NEW YORK</NEW_YORK>> = NEW YORK
Concepts & Synonyms are present in the sample :<<I_WANTS>wish</I_WANTS>> = wish
Concepts & Synonyms are present in the sample :<<NEW_YORK>BIG APPLE</NEW_YORK>> = BIG APPLE

Any suggestions on how to make it print only the unique samples?
Posted on 2021-11-23 09:51:44
This is what I meant by using a set:
    toks = set()
    for rt in self.raw_tokens:
        concept = None
        if rt.startswith('<'):
            # if it's a concept (entity):
            if 'value' in ElementTree.fromstring(rt).attrib:
                string_token = ElementTree.fromstring(rt).text
                concept = ElementTree.fromstring(rt).tag
            # if it's a synonym, just use the text and strip the tag from the sample:
            else:
                string_token = ElementTree.fromstring(rt).text
        else:
            string_token = rt
        # print each (concept, token) pair only the first time it is seen
        if (concept, string_token) not in toks:
            print("Concepts & Synonyms are present in the sample :<%s> = %s" % (concept, string_token))
            toks.add((concept, string_token))

https://stackoverflow.com/questions/70078291
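
For anyone who wants to try the set idea outside the original class, here is a minimal standalone sketch. The function name `unique_concept_tokens` and the sample list are hypothetical, not from the question, and I am assuming each tagged token is a small well-formed XML fragment. Unlike the snippet above, it skips plain words and always reports the tag, so synonyms do not show up as `None`; it also keeps a list next to the set so that first-seen order is preserved.

```python
from xml.etree import ElementTree


def unique_concept_tokens(raw_tokens):
    """Return (tag, text) pairs for tagged tokens, first-seen order, no duplicates."""
    seen = set()   # fast membership test for pairs already reported
    unique = []    # keeps the order in which pairs first appeared
    for rt in raw_tokens:
        if not rt.startswith('<'):
            continue  # plain words carry no concept or synonym tag
        element = ElementTree.fromstring(rt)  # parse once and reuse the result
        pair = (element.tag, element.text)
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Hypothetical sample data; the "value" attribute marks a concept, as in the question.
sample = [
    '<I_LOVE>I like</I_LOVE>',
    'to',
    '<NEW_YORK value="NYC">NEW YORK</NEW_YORK>',
    '<I_LOVE>I like</I_LOVE>',
    '<NEW_YORK>BIG APPLE</NEW_YORK>',
]

for tag, text in unique_concept_tokens(sample):
    print("Concepts & Synonyms are present in the sample :<%s> = %s" % (tag, text))
```

If the duplicates come from the loop itself being run twice rather than from repeated tokens, the set has to live outside the loop's enclosing call (for example as an attribute on the object) for this to help.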