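I have the following problem. I have been trying to reproduce the example code from this source: GitHub
I am using a JupyterLab environment on Linux with spaCy 3.1.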
# $ pip install spacy-transformers
# $ python -m spacy download en_trf_bertbaseuncased_lg
import spacy
nlp = spacy.load("en_trf_bertbaseuncased_lg")
apple1 = nlp("Apple shares rose on the news.")
apple2 = nlp("Apple sold fewer iPhones this quarter.")
apple3 = nlp("Apple pie is delicious.")
# sentence similarity
print(apple1.similarity(apple2)) #0.69861203
print(apple1.similarity(apple3)) #0.5404963
# sentence embeddings
apple1.vector # or apple1.tensor.sum(axis=0)
Since I am using spaCy 3.1, I changed
python -m spacy download en_trf_bertbaseuncased_lg
to
python -m spacy download en_core_web_trf
and I replaced
nlp = spacy.load("en_trf_bertbaseuncased_lg")
with
nlp = spacy.load("en_core_web_trf")
The complete code now looks like this:
import spacy
nlp = spacy.load("en_core_web_trf")
apple1 = nlp("Apple shares rose on the news.")
apple2 = nlp("Apple sold fewer iPhones this quarter.")
apple3 = nlp("Apple pie is delicious.")
# sentence similarity
print(apple1.similarity(apple2)) #0.69861203
print(apple1.similarity(apple3)) #0.5404963
# sentence embeddings
apple1.vector # or apple1.tensor.sum(axis=0)
However, when I run the code, instead of the expected output
#0.69861203 #0.5404963
I simply get
#0.0 #0.0
I also get the following UserWarning:
<ipython-input-30-ed0c29210d4e>:8: UserWarning: [W007] The model you're using has no word vectors loaded, so the result of the Doc.similarity method will be based on the tagger, parser and NER, which may not give useful similarity judgements. This may happen if you're using one of the small models, e.g. `en_core_web_sm`, which don't ship with word vectors and only use context-sensitive tensors. You can always add your own word vectors, or use one of the larger models instead if available.
print(apple1.similarity(apple2)) #0.69861203
<ipython-input-30-ed0c29210d4e>:8: UserWarning: [W008] Evaluating Doc.similarity based on empty vectors.
print(apple1.similarity(apple2)) #0.69861203
<ipython-input-30-ed0c29210d4e>:9: UserWarning: [W007] The model you're using has no word vectors loaded, so the result of the Doc.similarity method will be based on the tagger, parser and NER, which may not give useful similarity judgements. This may happen if you're using one of the small models, e.g. `en_core_web_sm`, which don't ship with word vectors and only use context-sensitive tensors. You can always add your own word vectors, or use one of the larger models instead if available.
print(apple1.similarity(apple3)) #0.5404963
<ipython-input-30-ed0c29210d4e>:9: UserWarning: [W008] Evaluating Doc.similarity based on empty vectors.
print(apple1.similarity(apple3)) #0.5404963
Does anyone know how to fix this code so that the similarity is computed correctly?
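For context, here is my rough understanding of where the 0.0 comes from. It is a sketch, not spaCy's actual implementation. As far as I can tell from the documentation, Doc.similarity defaults to a cosine similarity over the document vectors, and W008 says those vectors are empty for this model. The helper name cosine_similarity and the fall-back to 0.0 are my own assumptions for illustration:

```python
import math


def cosine_similarity(a, b):
    """Hypothetical helper: cosine similarity of two plain Python sequences.

    Assumption: an empty or all-zero vector yields 0.0 instead of an error,
    which mirrors the 0.0 I see together with warning W008.
    """
    if len(a) == 0 or len(b) == 0:
        return 0.0
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


# Empty vectors, as reported by W008 for en_core_web_trf
print(cosine_similarity([], []))

# Made-up non-empty vectors, only to show the non-degenerate case
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
```

If this reading is right, the calls themselves are fine and the model simply gives Doc.similarity no vectors to compare.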
https://stackoverflow.com/questions/71977955