Q: Extracting all fields from a Lucene 8 index

Stack Overflow user

Asked 2020-06-01 16:40:48

1 answer, 126 views, 0 followers, 0 votes

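Given an index created with Lucene 8, but with no knowledge of the fields used, how can I programmatically extract all the fields? (I know the Luke browser can be used interactively; thanks to @andrewjames, see "Examples for using latest version of Lucene".) The scenario is that, during a development phase, I have to read indexes that have no prescribed schema. I am using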

IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(index)));
IndexSearcher searcher = new IndexSearcher(reader);

The reader has the following method:

reader.getDocCount(field);

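But this requires knowing the field name in advance.

I understand that documents in the index may have been indexed with different fields; I am quite prepared to iterate over all the documents and extract the fields on a regular basis (these indexes are not very large).

I am using Lucene 8.5.*, so posts and tutorials based on earlier Lucene versions may not work.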

java

lucene

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-06-03 02:16:41

You can get basic field information as follows:

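import java.io.IOException;
import java.nio.file.Paths;
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexableField;
import org.apache.lucene.index.MultiBits;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Bits;

public class IndexDataExplorer {

    private static final String INDEX_PATH = "/path/to/index/directory";

    public static void doSearch() throws IOException {
        // try-with-resources closes the reader when the loop finishes
        try (IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(INDEX_PATH)))) {
            // null means the index contains no deleted documents
            Bits liveDocs = MultiBits.getLiveDocs(reader);
            // doc IDs run from 0 to maxDoc() - 1; numDocs() excludes deleted
            // documents, so using it as the bound could skip live documents
            for (int i = 0; i < reader.maxDoc(); i++) {
                if (liveDocs != null && !liveDocs.get(i)) {
                    continue; // skip deleted documents
                }
                // document(i) loads only the stored fields of the document
                Document doc = reader.document(i);
                List<IndexableField> fields = doc.getFields();
                for (IndexableField field : fields) {
                    // use these to get field-related data:
                    //field.name();
                    //field.fieldType().toString();
                }
            }
        }
    }
}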
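
If only the field names are needed, it may not be necessary to walk every document. The following is a minimal sketch, assuming Lucene 8.x, that reads the names from the index's merged field metadata via FieldInfos.getMergedFieldInfos and then passes each name to the reader.getDocCount(field) call mentioned in the question. The class name IndexFieldLister and the helper listFieldNames are hypothetical names chosen for illustration, and the index path is a placeholder.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.FieldInfos;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.FSDirectory;

// Hypothetical helper class, not part of Lucene.
public class IndexFieldLister {

    // Collects the name of every field recorded in the index metadata,
    // merged across all segments and returned in sorted order.
    public static Set<String> listFieldNames(IndexReader reader) {
        Set<String> names = new TreeSet<>();
        for (FieldInfo info : FieldInfos.getMergedFieldInfos(reader)) {
            names.add(info.name);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path; pass the real index directory as the first argument.
        String indexPath = args.length > 0 ? args[0] : "/path/to/index/directory";
        try (IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(indexPath)))) {
            for (String name : listFieldNames(reader)) {
                // getDocCount counts documents with at least one indexed term
                // for the field, so a stored-only field reports 0 here.
                System.out.println(name + " docCount=" + reader.getDocCount(name));
            }
        }
    }
}
```

Unlike the document loop, this reports fields that are indexed but not stored as well, since it does not depend on stored field values.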

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/62128466
