Q: MongoDB aggregation / map-reduce

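Stack Overflow user

Asked 2016-03-01 13:54:33

Answers: 3 · Views: 699 · Followers: 0 · Votes: 2

I'm new to MongoDB and need to do an aggregation that seems quite difficult to me. A document looks like this: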

{ 
 "_id" : ObjectId("568192aef8bd6b0cd0f649c6"), 
 "conference" : "IEEE International Conference on Acoustics, Speech and Signal Processing", 
 "prism:aggregationType" : "Conference Proceeding", 
 "children-id" : [
    "SCOPUS_ID:84948148564", 
    "SCOPUS_ID:84927603733", 
    "SCOPUS_ID:84943521758", 
    "SCOPUS_ID:84905234683", 
    "SCOPUS_ID:84876113709"
 ], 
 "dc:identifier" : "SCOPUS_ID:84867598678"
}

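The example contains only the fields I need for the aggregation. prism:aggregationType can take 5 different values (Conference Proceeding, Book, Journal, etc.). children-id means that this document is cited by a set of other documents (SCOPUS_ID is the unique ID of each document). What I want is to group by conference and, for each conference, know how many citing documents there are for each prism:aggregationType (only counts $gt 0).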

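For example, say there are 100 documents with the conference above. These 100 documents are cited by 250 documents. Of those 250 documents, I want to know how many have "prism:aggregationType" : "Conference Proceeding", how many have "prism:aggregationType" : "Journal", and so on. The output could look like this: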

{  
 "conference" : "IEEE International Conference on Acoustics, Speech and Signal Processing", 
 "aggregationTypes" : [{"Conference Proceeding" : 50} , {"Journal" : 200}]
}

It doesn't matter whether this is done with the aggregation pipeline or with map-reduce.

Edit

Is there any way to combine these two into a single aggregation:

db.articles.aggregate([
 { $match:{
    conference : {$ne : null}
 }},
 {$unwind:'$children-id'},
 {$group: {
   _id: {conference: '$conference'},
  'cited-by':{$push:{'dc:identifier':"$children-id"}}
 }}
 ]);
db.articles.find( { 'dc:identifier': { $in: [ 'SCOPUS_ID:84943302953', 'SCOPUS_ID:84927603733'] } }, {'prism:aggregationType':1} );

In the find query, I want to replace the array in $in with the array created by $push.

aggregation-framework

mongodb

mapreduce

3 Answers

Stack Overflow user

Accepted answer

Answered 2016-03-01 21:46:35

The code I wrote in the edit section is also what I ended up with (slightly modified).

db.articles.aggregate([
{ $match:{
  conference : {$ne : null}
}},
{$unwind:'$children-id'},
{$group: {
  _id: '$conference',
 'cited-by':{$push:"$children-id"}
}}
]);
db.articles.find( { 'dc:identifier': { $in: [ 'SCOPUS_ID:84943302953', 'SCOPUS_ID:84927603733'] } }, {'prism:aggregationType':1} );

The result for each conference looks like this:

{ 
"_id" : "Annual Conference on Privacy, Security and Trust", 
"cited-by" : [
    "SCOPUS_ID:84942789431", 
    "SCOPUS_ID:84928151617", 
    "SCOPUS_ID:84939229259", 
    "SCOPUS_ID:84946407175", 
    "SCOPUS_ID:84933039513", 
    "SCOPUS_ID:84942789431", 
    "SCOPUS_ID:84942607254", 
    "SCOPUS_ID:84948165954", 
    "SCOPUS_ID:84926379258", 
    "SCOPUS_ID:84946771354", 
    "SCOPUS_ID:84944223683", 
    "SCOPUS_ID:84942789431", 
    "SCOPUS_ID:84939169499", 
    "SCOPUS_ID:84947104346", 
    "SCOPUS_ID:84948764343", 
    "SCOPUS_ID:84938075139", 
    "SCOPUS_ID:84946196118", 
    "SCOPUS_ID:84930820238", 
    "SCOPUS_ID:84947785321", 
    "SCOPUS_ID:84933496680", 
    "SCOPUS_ID:84942789431"
]
}

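I iterate over all the documents I get back (about 250) and use each cited-by array in $in. I have an index on dc:identifier, so it runs instantly. $lookup might be an alternative way to do this inside the aggregation pipeline, but the package in R does not support versions above 2.6. Anyway, thanks for your time :)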
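
The loop described above could be sketched roughly as follows. This is only an illustration of the two-step idea, not the code actually used: `countCitingTypes` and its `findByIds` callback are hypothetical names, and the callback stands in for the `find` with `$in`. Note that `$push` keeps repeated identifiers (SCOPUS_ID:84942789431 appears several times in the result above), while `$in` returns each matching document once, so the sketch walks the original array to count once per citation.

```javascript
// Sketch only: `findByIds` is a hypothetical callback. Against a real database it
// could wrap something like
//   db.articles.find({'dc:identifier': {$in: ids}},
//                    {'dc:identifier': 1, 'prism:aggregationType': 1}).toArray()
// (the projection must include dc:identifier so results can be matched back).
function countCitingTypes(groups, findByIds) {
  return groups.map(function (group) {
    var ids = group['cited-by'];

    // Map each citing document's identifier to its aggregation type.
    var typeById = {};
    findByIds(ids).forEach(function (doc) {
      typeById[doc['dc:identifier']] = doc['prism:aggregationType'];
    });

    // Walk the original array so repeated identifiers count once per citation.
    var counts = {};
    ids.forEach(function (id) {
      var type = typeById[id];
      if (type === undefined) return; // citing document not present in the collection
      counts[type] = (counts[type] || 0) + 1;
    });

    return { conference: group._id, aggregationTypes: counts };
  });
}
```

Here `groups` would be the array of documents returned by the aggregation above (each with `_id` and `cited-by`). If each citing document should count only once per conference, using `$addToSet` instead of `$push` in the `$group` stage would remove the duplicates before this step.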

Votes: 0

Stack Overflow user

Answered 2016-03-01 15:00:13

Please try this with aggregation:

db.articles.aggregate([
    // 1. Get the size of the `children-id` array through $project
    {$project: {
        conference: 1,
        'prism:aggregationType': 1,
        'children-id': {$size: '$children-id'}
    }},
    // 2. Group by `conference` and `prism:aggregationType` and sum the sizes of `children-id`
    {$group: {
        _id: {
            conference: '$conference',
            aggregationType: '$prism:aggregationType'
        },
        ids: {$sum: '$children-id'}
    }},
    // 3. Group by `conference` and collect one {type, count} pair per aggregation type.
    //    An accumulator such as $push has to be the top-level operator of a $group
    //    field, so it cannot be nested inside $cond.
    {$group: {
        _id: '$_id.conference',
        aggregationTypes: {$push: {type: '$_id.aggregationType', count: '$ids'}}
    }}
]);

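Votes: 0

Stack Overflow user

Answered 2016-03-01 16:20:13

As we discussed in chat:

Unfortunately, using $lookup in the aggregation pipeline requires MongoDB 3.2, which is not an option here, because the R driver only works with MongoDB 2.6 and the source documents are spread across multiple collections.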
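
For readers who are on MongoDB 3.2 or later and whose citing documents live in the same collection, the $lookup route could look roughly like the pipeline below. This is a sketch under those assumptions and does not come from the original answers: it assumes a single `articles` collection, and the names `pipeline`, `citing`, `type`, and `count` are chosen purely for illustration.

```javascript
// Sketch only: assumes MongoDB >= 3.2 and that cited and citing documents
// are all stored in one collection named `articles`.
const pipeline = [
  // Keep only documents that belong to a conference.
  { $match: { conference: { $ne: null } } },
  // One output document per (cited document, citing identifier) pair.
  { $unwind: '$children-id' },
  // Fetch the citing document by its identifier.
  { $lookup: {
      from: 'articles',
      localField: 'children-id',
      foreignField: 'dc:identifier',
      as: 'citing'
  } },
  // Drop pairs whose citing document was not found; flatten the rest.
  { $unwind: '$citing' },
  // Count citations per conference and per aggregation type of the citing document.
  { $group: {
      _id: { conference: '$conference', type: '$citing.prism:aggregationType' },
      count: { $sum: 1 }
  } },
  // Collect the per-type counts into one array per conference.
  { $group: {
      _id: '$_id.conference',
      aggregationTypes: { $push: { type: '$_id.type', count: '$count' } }
  } }
];
```

In the shell this would be run as db.articles.aggregate(pipeline). An index on dc:identifier, as mentioned in the accepted answer, should help the $lookup stage as well.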

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/35724817

