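I have set up a local environment for text annotation and want to use the INCEpTION application developed here: https://github.com/inception-project/inception/blob/main/CONTRIBUTORS.txt
When connecting to my repository, I can connect and find documents using the example given here: https://inception-project.github.io/releases/22.1/docs/user-guide.html#sect_external-search-repos
However, when I try to connect to a repository that was created and indexed with FSCrawler, I cannot get the search to work.
The mapping of their example is: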
{
"mappings": {
"_doc": {
"properties": {
"doc": {
"properties": {
"text": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"metadata": {
"properties": {
"language": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"timestamp": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"uri": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
My index mapping is:
{
"mappings": {
"_doc": {
"dynamic_templates": [
{
"raw_as_text": {
"path_match": "meta.raw.*",
"mapping": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
}
}
}
],
"properties": {
"attachment": {
"type": "binary"
},
"attributes": {
"properties": {
"group": {
"type": "keyword"
},
"owner": {
"type": "keyword"
}
}
},
"content": {
"type": "text"
},
"file": {
"properties": {
"checksum": {
"type": "keyword"
},
"content_type": {
"type": "keyword"
},
"created": {
"type": "date",
"format": "dateOptionalTime"
},
"extension": {
"type": "keyword"
},
"filename": {
"type": "keyword",
"store": true
},
"filesize": {
"type": "long"
},
"indexed_chars": {
"type": "long"
},
"indexing_date": {
"type": "date",
"format": "dateOptionalTime"
},
"last_accessed": {
"type": "date",
"format": "dateOptionalTime"
},
"last_modified": {
"type": "date",
"format": "dateOptionalTime"
},
"url": {
"type": "keyword",
"index": false
}
}
},
"meta": {
"properties": {
"altitude": {
"type": "text"
},
"author": {
"type": "text"
},
"comments": {
"type": "text"
},
"contributor": {
"type": "text"
},
"coverage": {
"type": "text"
},
"created": {
"type": "date",
"format": "dateOptionalTime"
},
"creator_tool": {
"type": "keyword"
},
"date": {
"type": "date",
"format": "dateOptionalTime"
},
"description": {
"type": "text"
},
"format": {
"type": "text"
},
"identifier": {
"type": "text"
},
"keywords": {
"type": "text"
},
"language": {
"type": "keyword"
},
"latitude": {
"type": "text"
},
"longitude": {
"type": "text"
},
"metadata_date": {
"type": "date",
"format": "dateOptionalTime"
},
"modifier": {
"type": "text"
},
"print_date": {
"type": "date",
"format": "dateOptionalTime"
},
"publisher": {
"type": "text"
},
"rating": {
"type": "byte"
},
"relation": {
"type": "text"
},
"rights": {
"type": "text"
},
"source": {
"type": "text"
},
"title": {
"type": "text"
},
"type": {
"type": "text"
}
}
},
"path": {
"properties": {
"real": {
"type": "keyword",
"fields": {
"fulltext": {
"type": "text"
},
"tree": {
"type": "text",
"analyzer": "fscrawler_path",
"fielddata": true
}
}
},
"root": {
"type": "keyword"
},
"virtual": {
"type": "keyword",
"fields": {
"fulltext": {
"type": "text"
},
"tree": {
"type": "text",
"analyzer": "fscrawler_path",
"fielddata": true
}
}
}
}
}
}
}
}
}
I can search both repositories without problems from anywhere else using the standard _search endpoint and matching on the "content" field.
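
As a rough illustration of the kind of request meant here, the sketch below builds a standard Elasticsearch match query body for each of the two layouts. The helper name build_match_query and the search term "test" are made up for this example; the field names "content" and "doc.text" come from the two mappings above. It only constructs the JSON body and does not contact a cluster.

```python
import json


def build_match_query(field: str, text: str) -> dict:
    """Build a query-DSL body that full-text matches `text` in `field`."""
    return {"query": {"match": {field: text}}}


# FSCrawler index: the extracted text lives in the top-level "content" field.
fscrawler_query = build_match_query("content", "test")

# INCEpTION example index: the text lives in the nested "doc.text" field.
example_query = build_match_query("doc.text", "test")

print(json.dumps(fscrawler_query))
print(json.dumps(example_query))
```

Either body could be sent as the payload of a POST to the index's _search endpoint.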

{
"metadata": {
"language": "en",
"source": "My favourite document collection",
"timestamp": "2011/11/11 11:11",
"uri": "http://the.internet.com/my/document/collection/document1.txt",
"title": "Cool Document Title"
},
"doc": {
"text": "This is a test Document"
}
}
The query works for this example, even when the example is moved up one level:
{
"metadata": {
"language": "en",
"source": "My favourite document collection",
"timestamp": "2011/11/11 11:11",
"uri": "http://the.internet.com/my/document/collection/document1.txt",
"title": "Cool Document Title"
},
"doc": "This is a test Document"
}
Which object do I need to specify below in order to access the "content" field of my mapping?

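Chris
Posted on 2022-01-23 08:46:15
Unfortunately, this is a limitation of the feature.
The mapping has to be very specific for this to work, and remapping (changing the FSCrawler mapping) does not work unless the document mapping matches it very closely.
Simply changing the field names and types will not work.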
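
One possible workaround, offered only as an untested sketch, is to copy each FSCrawler document into a separate index whose documents have exactly the layout of the INCEpTION example (metadata.* plus doc.text). The function name to_inception_doc and the choice of which FSCrawler field feeds which target field are assumptions made for illustration; whether INCEpTION accepts the resulting index has not been verified here.

```python
def to_inception_doc(fs_doc: dict) -> dict:
    """Reshape one FSCrawler document into the example layout shown above.

    The field correspondences below are illustrative guesses, not a
    documented mapping between FSCrawler and INCEpTION.
    """
    meta = fs_doc.get("meta", {})
    file_info = fs_doc.get("file", {})
    return {
        "metadata": {
            "language": meta.get("language", ""),
            "source": meta.get("source", ""),
            # Assumption: use the indexing date as the timestamp.
            "timestamp": file_info.get("indexing_date", ""),
            "uri": file_info.get("url", ""),
            # Fall back to the file name when no title was extracted.
            "title": meta.get("title") or file_info.get("filename", ""),
        },
        "doc": {"text": fs_doc.get("content", "")},
    }


# Hypothetical FSCrawler document used only to exercise the function.
sample = {
    "content": "This is a test Document",
    "meta": {"language": "en"},
    "file": {
        "filename": "document1.txt",
        "url": "file:///tmp/document1.txt",
        "indexing_date": "2022-01-22T10:00:00",
    },
}
converted = to_inception_doc(sample)
print(converted["doc"]["text"])
```

The converted documents would then be indexed into a new index created with the example mapping, leaving the original FSCrawler index untouched.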
https://stackoverflow.com/questions/70810725