我有一个完整的集合,其中包含一个created_date属性的文档。我想通过聚合管道发送这些文档,以便对它们做一些工作。理想情况下,我希望在对它们做任何其他工作之前使用$match对它们进行过滤,以便能够利用索引,但是我不知道如何在$match表达式中使用新的$year/$month/$dayOfMonth运算符。
有一些关于如何在$project操作中使用运算符的示例,但是我担心,将$project作为管道中的第一步,那么我就失去了对索引的访问(MongoDB文档指出,第一个表达式必须是$match才能利用索引)。
示例数据:
{
post_body: 'This is the body of test post 1',
created_date: ISODate('2012-09-29T05:23:41Z')
comments: 48
}
{
post_body: 'This is the body of test post 2',
created_date: ISODate('2012-09-24T12:34:13Z')
comments: 10
}
{
post_body: 'This is the body of test post 3',
created_date: ISODate('2012-08-16T12:34:13Z')
comments: 10
}我想通过一个聚合管道来获取9月份发表的所有帖子的总评论
{
aggregate: 'posts',
pipeline: [
{$match:
/*Can I use the $year/$month operators here to match Sept 2012?
$year:created_date : 2012,
$month:created_date : 9
*/
/*or does this have to be
created_date :
{$gte:{$date:'2012-09-01T04:00:00Z'},
$lt: {$date:'2012-10-01T04:00:00Z'} }
*/
},
{$group:
{_id: '0',
totalComments:{$sum:'$comments'}
}
}
]
}这是可行的,但是匹配失去了对更复杂查询的任何索引的访问权:
{
aggregate: 'posts',
pipeline: [
{$project:
{
month : {$month:'$created_date'},
year : {$year:'$created_date'}
}
},
{$match:
{
month:9,
year: 2012
}
},
{$group:
{_id: '0',
totalComments:{$sum:'$comments'}
}
}
]
}发布于 2012-10-03 02:04:07
正如您已经发现的,您不能对不在文档中的字段进行查找(它的工作方式与$match的工作方式完全相同),如果您首先使用$project,那么您将失去使用索引的能力。
你可以做的是结合你的努力,如下所示:
{
aggregate: 'posts',
pipeline: [
{$match: {
created_date :
{$gte:{$date:'2012-09-01T04:00:00Z'},
$lt: {date:'2012-10-01T04:00:00Z'}
}}
}
},
{$group:
{_id: '0',
totalComments:{$sum:'$comments'}
}
}
]
}上面只给出了9月份的汇总,如果您想汇总多个月,例如:
{
aggregate: 'posts',
pipeline: [
{$match: {
created_date :
{ $gte:'2012-07-01T04:00:00Z',
$lt: '2012-10-01T04:00:00Z'
}
},
{$project: {
comments: 1,
new_created: {
"yr" : {"$year" : "$created_date"},
"mo" : {"$month" : "$created_date"}
}
}
},
{$group:
{_id: "$new_created",
totalComments:{$sum:'$comments'}
}
}
]
}然后你会得到类似这样的结果:
{
"result" : [
{
"_id" : {
"yr" : 2012,
"mo" : 7
},
"totalComments" : 5
},
{
"_id" : {
"yr" : 2012,
"mo" : 8
},
"totalComments" : 19
},
{
"_id" : {
"yr" : 2012,
"mo" : 9
},
"totalComments" : 21
}
],
"ok" : 1
}发布于 2012-10-03 01:29:20
试试这个;
db.createCollection("so");
db.so.remove();
db.so.insert([
{
post_body: 'This is the body of test post 1',
created_date: ISODate('2012-09-29T05:23:41Z'),
comments: 48
},
{
post_body: 'This is the body of test post 2',
created_date: ISODate('2012-09-24T12:34:13Z'),
comments: 10
},
{
post_body: 'This is the body of test post 3',
created_date: ISODate('2012-08-16T12:34:13Z'),
comments: 10
}
]);
//db.so.find();
db.so.ensureIndex({"created_date":1});
db.runCommand({
aggregate:"so",
pipeline:[
{
$match: { // filter only those posts in september
created_date: { $gte: ISODate('2012-09-01'), $lt: ISODate('2012-10-01') }
}
},
{
$group: {
_id: null, // no shared key
comments: { $sum: "$comments" } // total comments for all the posts in the pipeline
}
},
]
//,explain:true
});结果是;
{ "result" : [ { "_id" : null, "comments" : 58 } ], "ok" : 1 }所以你也可以修改你之前的例子来做这件事,尽管我不确定你为什么要这样做,除非你打算做一些其他的事情,比如在管道中的月和年;
{
aggregate: 'posts',
pipeline: [
{$match: { created_date: { $gte: ISODate('2012-09-01'), $lt: ISODate('2012-10-01') } } },
{$project:
{
month : {$month:'$created_date'},
year : {$year:'$created_date'}
}
},
{$match:
{
month:9,
year: 2012
}
},
{$group:
{_id: '0',
totalComments:{$sum:'$comments'}
}
}
]
}https://stackoverflow.com/questions/12694490
复制相似问题