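I'm doing part-of-speech tagging and I'm new to spaCy. I'm getting this error:
AttributeError: 'spacy.tokens.doc.Doc' object has no attribute 'pos_'
I've checked that the data type is string, so the code should work. Where am I going wrong?
The full code is below.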
    import pandas as pd
    df = pd.read_excel('combined_file.xlsx', engine='openpyxl', index_col=None)
    import spacy
    df['body_string'] = df.body.astype('string')
    sp = spacy.load('en_core_web_sm')
    # apply() runs the pipeline on every row, so doc is a Series of Doc objects
    doc = df["body_string"].apply(sp)
    for word in doc:
        # the AttributeError is raised here: each word is a whole Doc, not a token
        print(word.text, word.pos_, word.dep_)

The link to the Excel file is here: https://www.dropbox.com/scl/fi/43nu0yf45obbyzprzc86n/combined_file.xlsx?dl=0&rlkey=7j959kz0urjxflf6r536brppt
Posted on 2021-03-18 04:01:15
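You need to loop over each individual doc, not over the Series of docs. apply() returns a pandas Series whose elements are spaCy Doc objects, and pos_ is an attribute of the tokens inside a Doc, not of the Doc itself. For example: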
    import pandas as pd
    import spacy

    nlp = spacy.load('en_core_web_sm')
    df = pd.read_excel(r"<location of xlsx>")
    # one Doc per row of the body column
    docs = df['body'].apply(nlp)
    # iterate over the tokens of the first Doc
    for token in docs[0]:
        print(token.text, token.pos_, token.dep_)

Output for doc 0:
I PRON nsubj
love VERB ROOT
ememis ADV advmod
but CCONJ cc
... PUNCT punct
this DET nsubj
is AUX ROOT
probably ADV advmod
the DET det
worst ADJ amod
and CCONJ cc
most ADV advmod
useless ADJ conj
eye NOUN compound
serum NOUN attr
i PRON nsubj
ve AUX aux
ever ADV advmod
used VERB relcl
. PUNCT punct
Ever ADV advmod
a DET det
cheap ADJ amod
£ SYM quantmod
5 NUM compound
one NUM nsubj
from ADP prep
boots NOUN pobj
is AUX ROOT
better ADJ acomp

If you want to print a different doc (say, the second one):
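    for token in docs[1]:
        print(token.text, token.pos_, token.dep_)

Basically, docs is a Series holding the Doc that spaCy produced for each row. If you want to print every token of every doc, you can do this (though I don't recommend it):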
    for doc in docs:
        for token in doc:
            print(token.text, token.pos_, token.dep_)

Source: https://stackoverflow.com/questions/66680209
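
To make the Series-of-Docs point concrete, here is a minimal sketch that checks the types involved and then collects the tokens into a table instead of printing them all. It rests on a few assumptions that are not in the original post: it uses spacy.blank("en") (tokenizer only) so it runs without downloading a model, which means pos_ and dep_ come back as empty strings, and the two-row DataFrame is a hypothetical stand-in for the spreadsheet. Swapping in spacy.load('en_core_web_sm') should give real tags.

```python
import pandas as pd
import spacy
from spacy.tokens import Doc, Token

# Assumption: a blank English pipeline (tokenizer only), so no model download
# is needed. With spacy.load("en_core_web_sm") the tags would be filled in.
nlp = spacy.blank("en")

# Hypothetical stand-in for the "body" column of the spreadsheet.
df = pd.DataFrame({"body": ["I love this serum.", "Boots is better."]})

docs = df["body"].apply(nlp)  # a pandas Series, one Doc per row

first = docs.iloc[0]
print(type(docs).__name__)           # Series
print(isinstance(first, Doc))        # True
print(hasattr(first, "pos_"))        # False: the cause of the AttributeError
print(isinstance(first[0], Token))   # True
print(hasattr(first[0], "pos_"))     # True: tokens carry pos_

# Flatten into one row per token rather than printing every token.
rows = [
    {"row": idx, "text": tok.text, "pos": tok.pos_, "dep": tok.dep_}
    for idx, doc in docs.items()
    for tok in doc
]
tokens = pd.DataFrame(rows)
print(tokens.head())
```

A table with one row per token is usually easier to filter or group (for example by the row column) than a long stream of printed lines.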