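We are trying to query indexed nested child documents in Solr, but the results are wrong. For example, when we query for the parent document of a particular child document (event_id: order-1), the parent document that comes back has a child document with event_id: order-5.
We set up a fresh Solr instance with Solr's example data, and queries against that data return correct results. That suggested something in our solrconfig.xml might be responsible, but after removing settings or resetting them to their defaults, the results were still incorrect.
We are now checking schema.xml to see whether the results can be corrected there.
Our current solrconfig.xml: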
<config>
<luceneMatchVersion>8.11.2</luceneMatchVersion>
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}" />
<schemaFactory class="ClassicIndexSchemaFactory"/>
<indexConfig>
<lockType>single</lockType>
<ramBufferSizeMB>256</ramBufferSizeMB>
<mergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
<str name="sort">id asc</str>
<str name="wrapped.prefix">inner</str>
<str name="inner.class">org.apache.solr.index.TieredMergePolicyFactory</str>
<int name="inner.maxMergeAtOnce">10</int>
<int name="inner.segmentsPerTier">10</int>
<int name="inner.deletesPctAllowed">20</int>
</mergePolicyFactory>
</indexConfig>
<updateHandler class="solr.DirectUpdateHandler2">
<autoCommit>
<maxDocs>1000000</maxDocs>
<maxSize>2g</maxSize>
<openSearcher>false</openSearcher>
</autoCommit>
<updateLog>
<str name="dir">${solr.data.dir:}</str>
</updateLog>
</updateHandler>
<query>
<maxBooleanClauses>102400</maxBooleanClauses>
<filterCache class="solr.CaffeineCache" maxRamMB="750" initialSize="0" autowarmCount="0" />
<queryResultCache class="solr.CaffeineCache" size="512" initialSize="0" autowarmCount="0" />
<fieldValueCache class="solr.CaffeineCache" size="1" initialSize="0" autowarmCount="0" />
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<queryResultWindowSize>0</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>2</maxWarmingSearchers>
</query>
<requestDispatcher handleSelect="false">
<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" />
<httpCaching never304="true" />
</requestDispatcher>
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">text</str>
</lst>
</requestHandler>
<requestHandler name="/update" class="solr.UpdateRequestHandler"></requestHandler>
</config>
Our current schema.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="default-config" version="1.6">
<fieldType name="_nest_path_" class="solr.NestPathField" />
<!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" docValues="true" />
<fieldType name="strings" class="solr.StrField" sortMissingLast="true" multiValued="true" docValues="true" />
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" />
<fieldType name="booleans" class="solr.BoolField" sortMissingLast="true" multiValued="true" />
<!-- Numeric field types that index values using KD-trees. Point fields don't support FieldCache, so they must have docValues="true"
if needed for sorting, faceting, functions, etc. -->
<fieldType name="pint" class="solr.IntPointField" docValues="true" />
<fieldType name="pfloat" class="solr.FloatPointField" docValues="true" />
<fieldType name="plong" class="solr.LongPointField" docValues="true" />
<fieldType name="pdouble" class="solr.DoublePointField" docValues="true" />
<fieldType name="pints" class="solr.IntPointField" docValues="true" multiValued="true" />
<fieldType name="pfloats" class="solr.FloatPointField" docValues="true" multiValued="true" />
<fieldType name="plongs" class="solr.LongPointField" docValues="true" multiValued="true" />
<fieldType name="pdoubles" class="solr.DoublePointField" docValues="true" multiValued="true" />
<!-- KD-tree versions of date fields -->
<fieldType name="pdate" class="solr.DatePointField" docValues="true" />
<fieldType name="pdates" class="solr.DatePointField" docValues="true" multiValued="true" />
<uniqueKey>id</uniqueKey>
<!-- Solr automatically populates this with the value of the top/parent ID. E.g. the profile ID. It is required. -->
<field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
<!-- Is populated by Solr automatically with the path of the document in the hierarchy for non-root documents. -->
<field name="_nest_path_" type="_nest_path_" />
<!-- Is populated by Solr automatically to store the ID of each document’s parent document (if there is one). -->
<field name="_nest_parent_" type="string" indexed="true" stored="true"/>
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<!-- docValues are enabled by default for long type so we don't need to index the version field -->
<field name="_version_" type="plong" indexed="false" stored="false" />
<field name="_indexversion_" type="pint" indexed="true" stored="false" multiValued="false" required="true"
default="4" />
<field name="timestamp" type="pdate" indexed="true" stored="false" default="NOW" />
<field name="content_type" type="string" indexed="true" stored="false" />
<!-- define system values, which are known to be single valued -->
<field name="creationdate_l" type="plong" indexed="true" stored="false" />
<field name="lastmodifieddate_l" type="plong" indexed="true" stored="false" />
<field name="firstvisit_l" type="plong" indexed="true" stored="false" />
<field name="lastvisit_l" type="plong" indexed="true" stored="false" />
<!-- behavioral properties -->
<field name="frequency_bp" type="pint" indexed="true" stored="false" />
<field name="intensity_bp" type="pint" indexed="true" stored="false" />
<field name="recent_intensity_bp" type="pfloat" indexed="true" stored="false" />
<field name="firstvisit_behavior_bp" type="pint" indexed="true" stored="false" />
<field name="lastvisit_behavior_bp" type="pint" indexed="true" stored="false" />
<!-- Profile meta data fields only have one value -->
<field name="propertycount_i" type="pint" indexed="true" stored="false" />
<field name="totalpropertycount_i" type="pint" indexed="true" stored="false" />
<field name="totalpropertysize_i" type="pint" indexed="true" stored="false" />
<field name="maxproperty_s" type="string" indexed="true" stored="false" />
<field name="maxpropertyvalues_i" type="pint" indexed="true" stored="false" />
<field name="system_has_property_s" type="strings" indexed="true" stored="false" />
<field name="sample_id_i" type="pint" indexed="true" stored="false" />
<field name="event_id" type="string" indexed="true" multiValued="false" stored="true" />
<field name="event_type_id" type="string" indexed="true" multiValued="false" stored="true" />
<field name="event_date" type="plong" indexed="true" multiValued="false" stored="true" />
<field name="event_profile_id" type="string" indexed="true" multiValued="false" stored="true" />
<dynamicField name="*_ordinal_i" type="pint" indexed="true" stored="false" />
<dynamicField name="*_i" type="pints" indexed="true" stored="false" />
<dynamicField name="*_l" type="plongs" indexed="true" stored="false" />
<dynamicField name="*_f" type="pfloats" indexed="true" stored="false" />
<dynamicField name="*_s" type="strings" indexed="true" stored="false" />
<dynamicField name="*_b" type="boolean" indexed="true" stored="false" />
<dynamicField name="momentum_bp_*" type="pint" indexed="true" stored="false" />
<dynamicField name="threshold_*" type="plong" indexed="true" multiValued="false" stored="false" />
<dynamicField name="firsttouch_*" type="plong" indexed="true" multiValued="false" stored="false" />
<dynamicField name="reentryrestricted_*" type="string" indexed="true" multiValued="false" stored="false"/>
<dynamicField name="exitentrancerestricted_*" type="string" indexed="true" multiValued="false" stored="false"/>
</schema>
The indexed documents:
{
"id":"99c75c9a-b083-428d-baa1-6a9662c6eb72",
"name_s":"Profile 1",
"description_t":"test description",
"age_is":[28,
34],
"creationdate_l":1658990989645,
"content_type":"profile",
"_version_":1739600934763233280,
"_root_":"99c75c9a-b083-428d-baa1-6a9662c6eb72",
"timeline_events":
{
"id":"dcde9bfd-97ee-4d76-97d8-5297c1b2e87d",
"event_id":"order-0",
"event_type_id":"order",
"event_date":1658990989644,
"total_revenue_f":865.0,
"_nest_path_":"/timeline_events#",
"_nest_parent_":"99c75c9a-b083-428d-baa1-6a9662c6eb72",
"content_type":"timeline_event",
"_version_":1739600934763233280,
"_root_":"99c75c9a-b083-428d-baa1-6a9662c6eb72",
"product":[
{
"id":"9dabaac8-7651-4c56-9fb4-66d56b7175c3",
"name_s":"product-0",
"promotion_s":"NO",
"listprice_f":477.0,
"quantity_i":22,
"variant_ss":["handbags",
"men"],
"pages_i":1,
"_nest_path_":"/timeline_events#/product#0",
"_nest_parent_":"dcde9bfd-97ee-4d76-97d8-5297c1b2e87d",
"content_type":"order_product",
"_version_":1739600934763233280,
"_root_":"99c75c9a-b083-428d-baa1-6a9662c6eb72"}]}},
{
"id":"c19483e2-f940-403f-bb24-03adce1bcb02",
"name_s":"Profile 2",
"description_t":"test description for profile 2",
"age_is":[25,
40],
"creationdate_l":1658990989653,
"content_type":"profile",
"_version_":1739600934766379008,
"_root_":"c19483e2-f940-403f-bb24-03adce1bcb02",
"timeline_events":
{
"id":"dcde9bfd-97ee-4d76-97d8-5297c1b2e87d",
"event_id":"order-4",
"event_type_id":"order",
"event_date":1658990989649,
"total_revenue_f":952.0,
"_nest_path_":"/timeline_events#",
"_nest_parent_":"c19483e2-f940-403f-bb24-03adce1bcb02",
"content_type":"timeline_event",
"_version_":1739600934766379008,
"_root_":"c19483e2-f940-403f-bb24-03adce1bcb02",
"product":[
{
"id":"7a143554-b5f9-4487-b182-9938b91f76b4",
"name_s":"product-4",
"promotion_s":"YES",
"listprice_f":487.0,
"quantity_i":25,
"variant_ss":["junior",
"watches"],
"pages_i":1,
"_nest_path_":"/timeline_events#/product#0",
"_nest_parent_":"dcde9bfd-97ee-4d76-97d8-5297c1b2e87d",
"content_type":"order_product",
"_version_":1739600934766379008,
"_root_":"c19483e2-f940-403f-bb24-03adce1bcb02"}]}},
{
"id":"da88463c-fcca-4405-8656-0371809ccb28",
"name_s":"Profile 3",
"description_t":"test description for profile 3",
"age_is":[34,
39],
"creationdate_l":1658990989648,
"content_type":"profile",
"_version_":1739600934768476160,
"_root_":"da88463c-fcca-4405-8656-0371809ccb28",
"timeline_events":
{
"id":"61f47b18-15f4-4a4d-bb93-a4232dd22043",
"event_id":"order-2",
"event_type_id":"order",
"event_date":1658990989647,
"total_revenue_f":838.0,
"_nest_path_":"/timeline_events#",
"_nest_parent_":"da88463c-fcca-4405-8656-0371809ccb28",
"content_type":"timeline_event",
"_version_":1739600934768476160,
"_root_":"da88463c-fcca-4405-8656-0371809ccb28",
"product":[
{
"id":"1fc4616b-2629-4cc4-8a60-7238f97c9aae",
"name_s":"product-2",
"promotion_s":"YES",
"listprice_f":403.0,
"quantity_i":26,
"variant_ss":["pants",
"women"],
"pages_i":1,
"_nest_path_":"/timeline_events#/product#0",
"_nest_parent_":"61f47b18-15f4-4a4d-bb93-a4232dd22043",
"content_type":"order_product",
"_version_":1739600934768476160,
"_root_":"da88463c-fcca-4405-8656-0371809ccb28"}]}}]
}
}
When we run either of the following queries
{!parent which="*:* -_nest_path_:*"}event_id:order-0
OR
{!parent which="content_type:profile"}event_id:order-0
For this example the two queries are equivalent, and both return the same incorrect result.
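One way to inspect what Solr considers the children of the returned parent is to request them alongside it with the `[child]` document transformer. The Python sketch below only builds the request URL; the host, port, and core name in `SOLR_SELECT` are assumptions for illustration, and `parent_query_params` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint: host, port and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/profiles/select"


def parent_query_params(parent_filter: str, child_query: str) -> dict:
    """Build /select parameters for a block join parent query.

    parent_filter must match every root (parent) document;
    child_query selects the child documents of interest.
    """
    return {
        "q": '{!parent which="%s"}%s' % (parent_filter, child_query),
        # Ask Solr to attach each returned parent's nested children.
        "fl": "*,[child]",
    }


params = parent_query_params("content_type:profile", "event_id:order-0")
url = SOLR_SELECT + "?" + urlencode(params)

# Round-trip the query string to confirm the encoding kept the query intact.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
print(url)
```

If the children attached to the returned parent do not include the requested event_id, the mismatch is in how parents and children are associated in the index rather than in the query syntax. The response actually returned for either query follows.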
{
"id":"da88463c-fcca-4405-8656-0371809ccb28",
"name_s":"Profile 3",
"description_t":"test description for profile 3",
"age_is":[34,
39],
"creationdate_l":1658990989648,
"content_type":"profile",
"_version_":1739600934768476160,
"_root_":"da88463c-fcca-4405-8656-0371809ccb28"
}
This is incorrect. The correct response is
{
"id":"99c75c9a-b083-428d-baa1-6a9662c6eb72",
"name_s":"Profile 1",
"description_t":"test description",
"age_is":[28,
34],
"creationdate_l":1658990989645,
"content_type":"profile",
"_version_":1739600934763233280,
"_root_":"99c75c9a-b083-428d-baa1-6a9662c6eb72"
}
Posted on 2022-07-29 17:02:40
After more trial and error, we found that the problem lies in
<mergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
<str name="sort">id asc</str>
<str name="wrapped.prefix">inner</str>
<str name="inner.class">org.apache.solr.index.TieredMergePolicyFactory</str>
<int name="inner.maxMergeAtOnce">10</int>
<int name="inner.segmentsPerTier">10</int>
<int name="inner.deletesPctAllowed">20</int>
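</mergePolicyFactory>
If we remove this section, the results are correct. We are still investigating to determine exactly what goes wrong, and will keep updating this thread as we find more details.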
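A possible explanation, offered as an assumption rather than something this thread confirms: the `{!parent}` block join parser relies on each parent and its children being stored as one contiguous block in the index, with the parent last. A merge policy that re-sorts documents by `id` can break that adjacency, so a child ends up in front of a different parent. The Python sketch below is a toy model of that lookup, not Solr code; the ids (`c3`, `a1`, and so on) and the `block_join_parents` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Doc:
    id: str
    content_type: str  # "profile" (parent) or "timeline_event" (child)
    name: str = ""
    event_id: Optional[str] = None


def block_join_parents(segment: List[Doc], event_id: str) -> List[str]:
    """Toy version of {!parent which="content_type:profile"}event_id:<event_id>.

    For each matching child, the parent is assumed to be the first profile
    stored at a later position. That only holds while every block is kept
    contiguous with its parent last.
    """
    parents = []
    for pos, doc in enumerate(segment):
        if doc.content_type != "profile" and doc.event_id == event_id:
            for later in segment[pos + 1:]:
                if later.content_type == "profile":
                    parents.append(later.name)
                    break
    return parents


# Documents in the order they were indexed: child first, then its parent.
indexed = [
    Doc("c3", "timeline_event", event_id="order-0"),
    Doc("a1", "profile", name="Profile 1"),
    Doc("a7", "timeline_event", event_id="order-1"),
    Doc("b2", "profile", name="Profile 2"),
    Doc("b9", "timeline_event", event_id="order-2"),
    Doc("d4", "profile", name="Profile 3"),
]

# What a sort on "id asc" would do to the same documents.
resorted = sorted(indexed, key=lambda d: d.id)

print(block_join_parents(indexed, "order-0"))   # ['Profile 1']
print(block_join_parents(resorted, "order-0"))  # ['Profile 3']
```

In this toy data the re-sorted order places the `order-0` child directly before Profile 3, which resembles the symptom in the question. Whether this is exactly what happens in the reported setup would still need to be verified against the real index.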
https://stackoverflow.com/questions/73165803