我正在经历一些我无法理解的奇怪的行为。随附下列“模式”文件的集合:
{
tag : ["t:someTag", "A", "B", "C"]
msg : "some message"
timestamp : ISODate(...)
someIntField: 1
}数组中的标记,以元素“t: string”开头,后面跟着任意数量的字符串标记。收集统计数据:
db.perf_multikey.stats()
{
"ns" : "test.perf_multikey",
"count" : 36239306,
"size" : 22124848112,
"avgObjSize" : 610,
"storageSize" : 24330923904,
"numExtents" : 32,
"nindexes" : 4,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 17494579648,
"indexSizes" : {
"_id_" : 1177303120,
"tag_1" : 12851094032,
"timestamp_1" : 1800706768,
"level_1" : 1665475728
},
"ok" : 1
}我正在执行以下查询:
db.perf_multikey.find({tag: {$all:["t:a", "J"]}})正如预期的那样,它命中索引并返回几行:
db.perf_multikey.find({tag: {$all:["t:a", "J"]}}).explain()
{
"cursor" : "BtreeCursor tag_1",
"isMultiKey" : true,
"n" : 6,
"nscannedObjects" : 10,
"nscanned" : 10,
"nscannedObjectsAllPlans" : 10,
"nscannedAllPlans" : 22,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 7,
"indexBounds" : {
"tag" : [
[
"t:a",
"t:a"
]
]
},
"server" : "somefancyserver:27017",
"filterSet" : false
}但是,仅按标记数组中元素的顺序不同的查询。
db.perf_multikey.find({tag: {$all:["J","t:a"]}})似乎没有使用索引
db.perf_multikey.find({tag: {$all:["J","t:a"]}}).explain()
{
"cursor" : "Complex Plan",
"n" : 6,
"nscannedObjects" : 0,
"nscanned" : 7866684,
"nscannedObjectsAllPlans" : 7827833,
"nscannedAllPlans" : 15694517,
"nYields" : 139716,
"nChunkSkips" : 0,
"millis" : 118102,
"server" : "samefancyserver:27017",
"filterSet" : false
}看到上述结果,我使用MongoDB 2.6.9,我搞不懂MongoDB多键索引是如何工作的。为什么使用数组的查询如此依赖于顺序?
编辑:
升级到MongoDB 3.0.2之后,我重新生成了数据集(其大小足够大以至于索引不适合内存)并重新运行测试。不幸的是,我仍然遇到同样的结果(请注意,标记字段遵循某种'schema‘-数组的第一个元素是任意字符串,然后是标记的一些排列-从有限的值宇宙,例如"A“- "J")。
这是我的研究结果:
闪电快:
> db.perf_multikey.find({tag : {$all : ["a", "J"]}}).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.perf_multikey",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"tag" : {
"$eq" : "a"
}
},
{
"tag" : {
"$eq" : "J"
}
}
]
},
"winningPlan" : {
"stage" : "KEEP_MUTATIONS",
"inputStage" : {
"stage" : "FETCH",
"filter" : {
"tag" : {
"$eq" : "J"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"tag" : 1
},
"indexName" : "tag_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tag" : [
"[\"a\", \"a\"]"
]
}
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"inputStage" : {
"stage" : "KEEP_MUTATIONS",
"inputStage" : {
"stage" : "AND_SORTED",
"inputStages" : [
{
"stage" : "IXSCAN",
"keyPattern" : {
"tag" : 1
},
"indexName" : "tag_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tag" : [
"[\"a\", \"a\"]"
]
}
},
{
"stage" : "IXSCAN",
"keyPattern" : {
"tag" : 1
},
"indexName" : "tag_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tag" : [
"[\"J\", \"J\"]"
]
}
}
]
}
}
}
]
},
"serverInfo" : {
"host" : "fancyhost",
"port" : 27017,
"version" : "3.0.2",
"gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
},
"ok" : 1
}慢一点:
> db.perf_multikey.find({tag : {$all : ["J", "a"]}}).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.perf_multikey",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"tag" : {
"$eq" : "J"
}
},
{
"tag" : {
"$eq" : "a"
}
}
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "KEEP_MUTATIONS",
"inputStage" : {
"stage" : "AND_SORTED",
"inputStages" : [
{
"stage" : "IXSCAN",
"keyPattern" : {
"tag" : 1
},
"indexName" : "tag_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tag" : [
"[\"J\", \"J\"]"
]
}
},
{
"stage" : "IXSCAN",
"keyPattern" : {
"tag" : 1
},
"indexName" : "tag_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tag" : [
"[\"a\", \"a\"]"
]
}
}
]
}
}
},
"rejectedPlans" : [
{
"stage" : "KEEP_MUTATIONS",
"inputStage" : {
"stage" : "FETCH",
"filter" : {
"tag" : {
"$eq" : "a"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"tag" : 1
},
"indexName" : "tag_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tag" : [
"[\"J\", \"J\"]"
]
}
}
}
}
]
},
"serverInfo" : {
"host" : "fancyhost",
"port" : 27017,
"version" : "3.0.2",
"gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
},
"ok" : 1
}我认为http://docs.mongodb.org/manual/reference/operator/query/all/#performance可能是答案。
毕竟,通过“随机字符串”查询,"A“使用”随机字符串“将潜在的结果集缩小到非常小的大小,从而便于扫描(?)或进一步穿越)。另一方面,查询通过"A",“随机字符串”应该是缓慢的,因为"A“将返回巨大的集进一步扫描.但质疑"A“、”随机不存在的字符串“是闪电般的速度.这让我很困惑。
发布于 2015-05-06 10:54:05
我强烈建议升级。我在2.6.1和3.0.0上测试了这一点,但我不了解这种行为。
例如,这里是2.6.1:
> db.t.find({tags:{$all:['t', 'tags']}}).explain()
{
"cursor" : "BtreeCursor tags_1",
"isMultiKey" : true,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 1,
"nscannedAllPlans" : 3,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 7,
"indexBounds" : {
"tags" : [
[
"t",
"t"
]
]
},
"server" : "ubuntu:27017",
"filterSet" : false
}
> db.t.find({tags:{$all:['t', 'tags']}})
{ "_id" : ObjectId("5549f186450548aed9ad4273"), "tags" : [ "tags", "t" ] }即使有一个不存在的价值,首先:
> db.t.find({tags:{$all:['f', 'tags']}}).explain()
{
"cursor" : "BtreeCursor tags_1",
"isMultiKey" : true,
"n" : 0,
"nscannedObjects" : 0,
"nscanned" : 0,
"nscannedObjectsAllPlans" : 0,
"nscannedAllPlans" : 0,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"tags" : [
[
"f",
"f"
]
]
},
"server" : "ubuntu:27017",
"filterSet" : false
}关于3.0.0:
> db.t.find({tags:{$all:['g','t']}}).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.t",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"tags" : {
"$eq" : "g"
}
},
{
"tags" : {
"$eq" : "t"
}
}
]
},
"winningPlan" : {
"stage" : "KEEP_MUTATIONS",
"inputStage" : {
"stage" : "FETCH",
"filter" : {
"tags" : {
"$eq" : "t"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"tags" : 1
},
"indexName" : "tags_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tags" : [
"[\"g\", \"g\"]"
]
}
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"inputStage" : {
"stage" : "KEEP_MUTATIONS",
"inputStage" : {
"stage" : "AND_SORTED",
"inputStages" : [
{
"stage" : "IXSCAN",
"keyPattern" : {
"tags" : 1
},
"indexName" : "tags_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tags" : [
"[\"g\", \"g\"]"
]
}
},
{
"stage" : "IXSCAN",
"keyPattern" : {
"tags" : 1
},
"indexName" : "tags_1",
"isMultiKey" : true,
"direction" : "forward",
"indexBounds" : {
"tags" : [
"[\"t\", \"t\"]"
]
}
}
]
}
}
}
]
},
"serverInfo" : {
"host" : "ip-172-30-0-35",
"port" : 27017,
"version" : "3.0.0",
"gitVersion" : "a841fd6394365954886924a35076691b4d149168"
},
"ok" : 1
}当然,我没有在2.6.9上进行测试,但是我已经测试了越来越低的版本,并且我不能复制这种行为。
https://stackoverflow.com/questions/30072641
复制相似问题