
Knowledge-based question answering system does not give the most appropriate answers

Stack Overflow user
Asked 2012-03-23 21:35:24
Answers: 2 · Views: 1.2K · Followers: 0 · Votes: 3

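I am working on a project that is basically a knowledge-based question answering system. My system takes a query from the user, downloads the relevant documents from Wikipedia, strips all the HTML tags, and extracts the plain text. It then tokenizes the documents into sentences and builds a term-document (TD) matrix (the query is also passed in as a sentence). The TD matrix is then fed to the Probabilistic Latent Semantic Analysis (PLSA) algorithm. Next, the cosine similarity between the document (sentence) vectors and the query vector is computed, and the sentences most similar to the query vector are displayed as the answers. (Stemming is also done when the TD matrix is built.) The problem is that it does show results, but not the most relevant ones. Where am I going wrong? Is the strategy I am following correct, or is there some other algorithm that might help me? Below are some questions and the answers my system returned: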

What is photosynthesis?
ANSWER  1 :   The stroma contains stacks (grana) of thylakoids, which are the site of photosynthesis 

ANSWER  2 :   Factors leaf is the primary site of photosynthesis in plants 

ANSWER  3 :   Samuel Ruben and Martin Kamen used radioactive isotopes to determine that the oxygen liberated in photosynthesis came from the water 

ANSWER  4 :   In plants, algae and cyanobacteria, photosynthesis releases oxygen 

Another question

What is Artificial Intelligence?
ANSWER  1 :   the problem of creating 'artificial intelligence' will substantially be solved" 

ANSWER  2 :   37 The leading-edge definition of artificial intelligence research is changing over time 

ANSWER  3 :   Stories of these creatures and their fates discuss many of the same hopes, fears and ethical concerns that are presented by artificial intelligence 

ANSWER  4 :   History of artificial intelligence and Timeline of artificial intelligence Thinking machines and artificial beings appear in Greek myths , such as Talos of Crete , the bronze robot of Hephaestus , and Pygmalion's Galatea 13 Human likenesses believed to have intelligence were built in every major civilization 

Another question

Who is a hacker?

ANSWER  1 :   19 Hackers (short stories) Helba from the  

ANSWER  2 :   16 Rafael Núñez aka RaFa was a notorious most wanted hacker by the FBI since 2001 

ANSWER  3 :   Often, this type of 'white hat' hacker is called an ethical hacker 
ANSWER  4 :   Hackers also commonly use port scanners  

Another run

What is biology?
ANSWER  1 :   Molecular biology is the study of biology at a molecular level 

ANSWER  2 :   molecular biology studies the complex interactions of systems of biological molecules 

ANSWER  3 :   The similarities and differences between cell types are particularly relevant to molecular biology 

ANSWER  4 :   Contents History Foundations of modern biology 2 
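
For reference, the ranking step described in the question (one term vector per sentence, then cosine similarity against the query) can be sketched roughly as follows. This is a simplified stand-in and not the actual system: it uses smoothed TF-IDF weights instead of PLSA, skips stemming and the Wikipedia download, scores the query separately instead of adding it as a row of the matrix, and every function name here is made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (no stemming in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())


def build_tfidf(sentences):
    """Return one sparse TF-IDF vector (a dict) per sentence, plus the IDF table."""
    token_lists = [tokenize(s) for s in sentences]
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    n = len(sentences)
    # Smoothed IDF, so a term that occurs in every sentence keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(query, sentences, top_k=4):
    """Return up to top_k (score, sentence) pairs, most similar first."""
    vectors, idf = build_tfidf(sentences)
    # Query terms that never occur in the sentences are dropped: they cannot match.
    query_vec = {
        t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf
    }
    scored = [(cosine(query_vec, v), s) for v, s in zip(vectors, sentences)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    sample = [
        "In plants, algae and cyanobacteria, photosynthesis releases oxygen",
        "Hackers also commonly use port scanners",
        "Molecular biology is the study of biology at a molecular level",
    ]
    for score, sentence in rank_sentences("What is photosynthesis?", sample):
        print(round(score, 3), sentence)
```

In this toy run the biology sentence also receives a non-zero score purely because it shares the word "is" with the query, so stop-word handling may be one thing worth checking in a pipeline like this.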

2 Answers

Stack Overflow user

Accepted answer

Answered 2012-03-23 22:03:04

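I think that if you stick to a purely statistical approach, it will be hard to improve your system. From a statistical NLP point of view, you are indeed doing the right thing. What you can do now is fine-tune some parameters. To do that, you have to build a training corpus that tells the system which answer is the right one, and then see which values the parameters must take to produce that answer.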

That said, I don't think fine-tuning the parameters will improve your accuracy by more than 20-30%.
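
The tuning loop described above can be sketched as a small grid search. This is only an illustration under assumptions: `length_penalty` is a hypothetical stand-in for whatever parameters the real system exposes (for example the number of PLSA topics), the scorer is a toy term-overlap function rather than PLSA, and the labeled examples with their gold answers are invented for the demonstration.

```python
import re


def tokenize(text):
    """Lowercase and split into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score(query, sentence, length_penalty):
    """Toy scorer: count of query terms in the sentence, damped by its length."""
    query_terms = set(tokenize(query))
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    overlap = sum(1 for t in tokens if t in query_terms)
    return overlap / (len(tokens) ** length_penalty)


def top_answer(query, candidates, length_penalty):
    """Index of the highest-scoring candidate sentence."""
    return max(
        range(len(candidates)),
        key=lambda i: score(query, candidates[i], length_penalty),
    )


def accuracy(labeled, length_penalty):
    """Fraction of labeled questions whose top-ranked sentence is the gold one."""
    hits = sum(
        1
        for query, candidates, gold in labeled
        if top_answer(query, candidates, length_penalty) == gold
    )
    return hits / len(labeled)


def tune(labeled, grid):
    """Return the grid value with the best accuracy on the labeled corpus."""
    return max(grid, key=lambda p: accuracy(labeled, p))


# Invented training corpus: (question, candidate sentences, index of gold answer).
LABELED = [
    (
        "What is photosynthesis?",
        [
            "Photosynthesis is the process plants use to turn light into chemical energy",
            "Photosynthesis releases oxygen gas",
        ],
        0,
    ),
    (
        "Who is a hacker?",
        [
            "A hacker breaks security",
            "Often this type of white hat hacker is called an ethical hacker",
        ],
        0,
    ),
]

if __name__ == "__main__":
    for p in [0.0, 0.5, 1.0]:
        print(p, accuracy(LABELED, p))
    print("best:", tune(LABELED, [0.0, 0.5, 1.0]))
```

With a real system, the same loop would wrap the actual ranker, and the labeled corpus would need to be far larger and kept separate from the questions used for the final evaluation.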

If you want to go further, you need a more semantic approach, with knowledge represented symbolically. See, for instance, http://www.jfsowa.com/
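
As a very rough illustration of what representing knowledge symbolically could look like, the sketch below stores facts as (subject, relation, object) triples and answers "What is X?" by lookup instead of by sentence similarity. It is a toy under assumptions: the `TripleStore` class, the `is_a` relation name, and the hand-entered fact are all hypothetical, and it is not taken from the linked site.

```python
import re
from collections import defaultdict


class TripleStore:
    """Minimal in-memory store of (subject, relation, object) facts."""

    def __init__(self):
        self._facts = defaultdict(list)

    def add(self, subject, relation, obj):
        self._facts[(subject.lower(), relation)].append(obj)

    def lookup(self, subject, relation):
        # .get() avoids creating empty entries for unknown subjects.
        return list(self._facts.get((subject.lower(), relation), []))


# Matches questions of the form "What is X?" and captures X.
WHAT_IS = re.compile(r"^\s*what\s+is\s+(.+?)\s*\?*\s*$", re.IGNORECASE)


def answer(store, question):
    """Answer 'What is X?' from stored is_a facts; return [] for anything else."""
    match = WHAT_IS.match(question)
    if not match:
        return []
    return store.lookup(match.group(1), "is_a")


if __name__ == "__main__":
    store = TripleStore()
    # Hand-entered example fact; a real system would have to extract these.
    store.add(
        "photosynthesis",
        "is_a",
        "process that converts light energy into chemical energy",
    )
    print(answer(store, "What is photosynthesis?"))
```

The hard part, which this sketch leaves out entirely, is extracting such facts from text and mapping free-form questions onto relations.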

Votes: 1

Stack Overflow user

Answered 2012-03-23 22:51:23

This is a well-studied problem known as question answering (QA). I have already given a summary of QA in another answer.

Votes: 2
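The original page content is provided by Stack Overflow. Translation support was provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: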

https://stackoverflow.com/questions/9840136
