I am trying to train the en_core_web_sm model to recognize a new entity, EMAIL, using the following code:

import random

import spacy

LABEL = "EMAIL"
TRAIN_DATA = [
    (
        "My email address is XXXX@gmail.com",
        {"entities": [(20, 37, LABEL)]},
    ),
    ("you can email me @ XXXXX@ai.xXx.com?", {"entities": [(19, 36, LABEL)]}),
    (
        "contact me @ XXXX@ai.xXX.com",
        {"entities": [(13, 31, LABEL)]},
    ),
    ("you can contact me at xxXX@xxXXX.com", {"entities": [(22, 56, LABEL)]}),
]

def main(model="en_core_web_sm", new_model_name="en_core_web_sm", output_dir="D:/Train_ai", n_iter=8):
    random.seed(0)
    if model is not None:
        nlp = spacy.load(model)  # load the existing model by name
        print("Loaded model '%s'" % model)
    else:
        nlp = spacy.blank("en")  # create a blank English model
        print("Created blank 'en' model")
    # Add the entity recognizer if the pipeline does not have one yet
    if "ner" not in nlp.pipe_names:
        ner = nlp.create_pipe("ner")
        nlp.add_pipe(ner)
    else:
        ner = nlp.get_pipe("ner")
    ner.add_label(LABEL)
    ner.add_label("VEGETABLE")
    if model is None:
        optimizer = nlp.begin_training()
    else:
        optimizer = nlp.resume_training()  # this line raises the error

The error I get is:

AttributeError: 'English' object has no attribute 'resume_training'

It is raised at the line optimizer = nlp.resume_training().
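
One thing worth checking in the training data, separate from the error: the end offsets as posted do not line up with the placeholder addresses (the last span ends at 56 in a 36-character string), possibly because the addresses were masked after the offsets were computed. A minimal sketch of deriving the spans from the text instead of counting them by hand, assuming a deliberately simple regex; EMAIL_RE, email_spans, and make_example are hypothetical names, not part of spaCy:

```python
import re

LABEL = "EMAIL"

# Deliberately simple pattern for illustration only; it is an assumption,
# not a complete email-address grammar.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def email_spans(text, label=LABEL):
    """Return (start, end, label) character spans for each address found in text."""
    return [(m.start(), m.end(), label) for m in EMAIL_RE.finditer(text)]


def make_example(text):
    """Build one training example in the (text, {"entities": [...]}) format."""
    return (text, {"entities": email_spans(text)})


TEXTS = [
    "My email address is XXXX@gmail.com",
    "you can email me @ XXXXX@ai.xXx.com?",
    "contact me @ XXXX@ai.xXX.com",
    "you can contact me at xxXX@xxXXX.com",
]
TRAIN_DATA = [make_example(text) for text in TEXTS]
```

Each span's end offset is then guaranteed to fall inside its text, since it comes from the match itself.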

Question posted on 2019-04-01 14:25:52

Answer posted on 2020-08-24 01:30:33

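As @Ines pointed out, spaCy 2.0.x does not support resume_training. However, you can still resume training from a checkpoint by replacing this line:

optimizer = nlp.resume_training()

with this line:

optimizer = nlp.entity.create_optimizer()

Then pass this optimizer as the sgd parameter when you call nlp.update(), at the point where training actually starts, like this:
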
nlp.update(
    texts,  # batch of texts
    annotations,  # batch of annotations
    sgd=optimizer,
    drop=0,  # dropout - make it harder to memorise data
    losses=losses,
)

Resuming training can be useful for tasks such as fine-tuning, adding new entities, or restarting training after an interruption.
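
The choice of optimizer can be kept in one place so the rest of the training loop does not care which spaCy 2.x release is installed. A minimal sketch, assuming the calls named in this thread (nlp.begin_training(), nlp.resume_training() where the installed version provides it, and the nlp.entity.create_optimizer() fallback); get_optimizer is a hypothetical helper name, not a spaCy function:

```python
def get_optimizer(nlp, resume=True):
    """Pick an optimizer for a spaCy 2.x style pipeline object.

    resume=False: start from scratch with nlp.begin_training().
    resume=True:  use nlp.resume_training() if the installed version has it,
                  otherwise fall back to the NER component's own optimizer.
    """
    if not resume:
        return nlp.begin_training()
    if hasattr(nlp, "resume_training"):
        return nlp.resume_training()
    # Fallback for versions without resume_training (the workaround above).
    return nlp.entity.create_optimizer()
```

In the question's main(), this would be called as optimizer = get_optimizer(nlp, resume=model is not None), and the result passed to nlp.update() through sgd=optimizer.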
https://stackoverflow.com/questions/55456861