I have to find the similarity between a reference document and a set of documents in a repository.
Method:
1. I build the term-document matrix for all the documents, including the reference document.
2. I calculate the SVD of this matrix.
3. I take the V array (the third result).
4. I transpose this matrix so that each row represents a document.
5. The first row represents the reference document.
6. I find the cosine similarity between this row and the rest of the rows.

My doubt is:
I am writing this code in Java, and I use the Jama package to compute the SVD.
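Steps 5 and 6 can be sketched in plain Java as below. This is only a sketch under stated assumptions, not the asker's code: the class and method names are hypothetical, and `docVectors` is assumed to be a `double[][]` with one row per document (with Jama it would presumably come from something like `matrix.svd().getV().getArray()`). Jama documents its decomposition as A = U*S*V', so when the columns of the input matrix are documents, the rows of the matrix returned by `getV()` should already correspond to documents; whether the transpose in step 4 is needed depends on how the matrix was laid out. The `k` parameter is also an assumption: LSA normally keeps only the first k dimensions, and if all columns of a square orthogonal V are used, its rows are mutually orthogonal, so every cosine comes out as 0 up to rounding.

```java
// Sketch only: plain Java with no Jama dependency. All names here are hypothetical.
final class LsaCosineSketch {

    /** Cosine similarity of a and b over their first k coordinates; 0 if either is all zeros. */
    static double cosine(double[] a, double[] b, int k) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < k; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // avoid dividing by zero for an empty document vector
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /**
     * docVectors holds one row per document, with row 0 as the reference document.
     * Returns the similarity of row 0 to each of rows 1..n-1, using the first k columns only.
     */
    static double[] similaritiesToReference(double[][] docVectors, int k) {
        double[] result = new double[docVectors.length - 1];
        for (int d = 1; d < docVectors.length; d++) {
            result[d - 1] = cosine(docVectors[0], docVectors[d], k);
        }
        return result;
    }
}
```

Calling `LsaCosineSketch.similaritiesToReference(docVectors, k)` returns one score per remaining document, in row order. Many LSA write-ups also scale each of the k columns by its singular value before taking the cosine; that step is left out here to keep the sketch short.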
Posted on 2012-01-27 05:33:30
You can read an LSA example here.
https://stackoverflow.com/questions/9028417