I'm new to Hugging Face and I'm working with the Flair (NER) module, which gives me the output below:
from flair.data import Sentence
from flair.models import SequenceTagger
# load tagger
tagger = SequenceTagger.load("flair/ner-german-large")
# make example sentence
sentence = Sentence("George Washington ging nach Washington")
# predict NER tags
tagger.predict(sentence)
# print sentence
print(sentence)
# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)

Output:
Span [1,2]: "George Washington" [− Labels: PER (1.0)]
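Span [5]: "Washington" [− Labels: LOC (1.0)]

How can I convert this output into a DataFrame, ideally with the columns "Token" (the entity text) and "Token_Type" (e.g. "ORG" or "PER")?
The resulting sentence is of type flair.data.Sentence.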
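Posted on 2022-11-21 10:18:25

Each entity in your loop is a flair.data.Span, which exposes many attributes you can use (see the source of the Span class at https://github.com/flairNLP/flair/blob/master/flair/data.py). The snippet below collects each span's text, tag and score into a list of dicts and builds a DataFrame from that list: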
import pandas as pd
entities = []
for entity in sentence.get_spans('ner'):
    entities.append({
        'text': entity.text,
        'type': entity.tag,
        'score': entity.score
    })
print(pd.DataFrame(entities))
Output:
                text  type     score
0  George Washington   PER  0.999997
1         Washington   LOC  0.999996

Source: https://stackoverflow.com/questions/74508539
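If you want the exact column names from the question ("Token" and "Token_Type"), the same idea can be wrapped in two small helpers. This is only a sketch under a few assumptions: `FakeSpan`, `spans_to_records` and `records_to_dataframe` are hypothetical names made up for illustration, the extra "Score" column is optional, and the stand-in spans reuse the values printed in the answer's output instead of running a real model.

```python
from collections import namedtuple

# Hypothetical stand-in for flair.data.Span, exposing only the three
# attributes the answer reads: text, tag and score.
FakeSpan = namedtuple("FakeSpan", ["text", "tag", "score"])


def spans_to_records(spans):
    """Turn span-like objects into row dicts using the question's column names."""
    return [
        {"Token": span.text, "Token_Type": span.tag, "Score": span.score}
        for span in spans
    ]


def records_to_dataframe(records):
    """Build a DataFrame from the row dicts (pandas is imported only here)."""
    import pandas as pd

    # Passing columns= fixes the column order and keeps the headers
    # even when no entities were found.
    return pd.DataFrame(records, columns=["Token", "Token_Type", "Score"])


# Values copied from the answer's printed output, not produced by a real model.
spans = [
    FakeSpan("George Washington", "PER", 0.999997),
    FakeSpan("Washington", "LOC", 0.999996),
]
records = spans_to_records(spans)
```

With a real tagger you would pass `sentence.get_spans('ner')` instead of the stand-in list, for example `records_to_dataframe(spans_to_records(sentence.get_spans('ner')))`, assuming your Flair version exposes `text`, `tag` and `score` on its spans as the answer does.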