I am using MongoDB with PyMongo, and my data is structured as follows.
[
{
"position": 367,
"entropy": 0.1327801096975522,
"variants_flattened": [
"GFRHQNSEG",
"GFRHQNSEG",
"GFRHQNSEG",
"GFRHQNAEG"
],
"supports": 51,
"sequences": [
{
"position": 367,
"sequence": "GFRHQNSEG",
"count": 50,
"conservation": 98.03921568627452,
"motif_short": "I",
"motif_long": "Index",
"id": [
"APQ31289.1",
"ASU55526.1",
"ASU55528.1",
"APQ31291.1"
],
"strain": [
"Influenza A virus A/Xiamen/s200/2016",
"Influenza A virus A/Shandong-Zhifu/164/2016",
"Influenza A virus A/Shandong-Zhifu/1185/2016",
"Influenza A virus A/Xiamen/s228/2016"
],
"country": [
"HA Hemagglutinin",
"HA Hemagglutinin",
"HA Hemagglutinin",
"HA Hemagglutinin"
],
"host": [
"Influenza A virus A/Xiamen/s200/2016",
"Influenza A virus A/Shandong-Zhifu/164/2016",
"Influenza A virus A/Shandong-Zhifu/1185/2016",
"Influenza A virus A/Xiamen/s228/2016"
]
},
{
"position": 367,
"sequence": "GFRHQNAEG",
"count": 1,
"conservation": 1.9607843137254902,
"motif_short": "Ma",
"motif_long": "Major",
"id": [
"QBM69728.1"
],
"strain": [
"Influenza A virus A/China/70793/2016"
],
"country": [
"HA Hemagglutinin"
],
"host": [
"Influenza A virus A/China/70793/2016"
]
}
],
"variants": 2
}
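]
The root-level list contains multiple objects with a similar structure.
What I need is to get the instances where "motif_short" equals "I" (only the matching objects from the "sequences" list, not the whole parent document).
The expected output is shown below (in this particular example there is only one output object, but a single instance can contain several objects that meet this condition):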
{
"position": 367,
"sequence": "GFRHQNSEG",
"count": 50,
"conservation": 98.03921568627452,
"motif_short": "I",
"motif_long": "Index",
"id": [
"APQ31289.1",
"ASU55526.1",
"ASU55528.1",
"APQ31291.1"
],
"strain": [
"Influenza A virus A/Xiamen/s200/2016",
"Influenza A virus A/Shandong-Zhifu/164/2016",
"Influenza A virus A/Shandong-Zhifu/1185/2016",
"Influenza A virus A/Xiamen/s228/2016"
],
"country": [
"HA Hemagglutinin",
"HA Hemagglutinin",
"HA Hemagglutinin",
"HA Hemagglutinin"
],
"host": [
"Influenza A virus A/Xiamen/s200/2016",
"Influenza A virus A/Shandong-Zhifu/164/2016",
"Influenza A virus A/Shandong-Zhifu/1185/2016",
"Influenza A virus A/Xiamen/s228/2016"
]
}
I am new to MongoDB and have tried a few options, such as aggregation, but I am only just getting started. Please help.
Thanks in advance!
Posted on 2020-10-10 00:55:57
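You can solve this with an aggregation that combines $project and $filter. Try the following script for this particular problem:
# Assuming col is our PyMongo collection object
result = col.aggregate([
    {'$project': {'sequences': {'$filter': {
        'input': '$sequences',
        'as': 's',
        'cond': {'$eq': ['$$s.motif_short', 'I']},
    }}}}
])
This query projects the sequences field and filters it down to the elements whose motif_short equals "I". Note that a document with no matching element is still returned, with an empty sequences array. You will get a result like this: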
{
"_id":"xyz",
"sequences":[
{
"position":367,
"sequence":"GFRHQNSEG",
"count":50,
"conservation":98.03921568627452,
"motif_short":"I",
"motif_long":"Index",
"id":[
"APQ31289.1",
"ASU55526.1",
"ASU55528.1",
"APQ31291.1"
],
"strain":[
"Influenza A virus A/Xiamen/s200/2016",
"Influenza A virus A/Shandong-Zhifu/164/2016",
"Influenza A virus A/Shandong-Zhifu/1185/2016",
"Influenza A virus A/Xiamen/s228/2016"
],
"country":[
"HA Hemagglutinin",
"HA Hemagglutinin",
"HA Hemagglutinin",
"HA Hemagglutinin"
],
"host":[
"Influenza A virus A/Xiamen/s200/2016",
"Influenza A virus A/Shandong-Zhifu/164/2016",
"Influenza A virus A/Shandong-Zhifu/1185/2016",
"Influenza A virus A/Xiamen/s228/2016"
]
}
]
}
https://stackoverflow.com/questions/64279691
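The $filter projection in the answer still wraps the matches in their parent document (with "_id" and a "sequences" array), while the question asks for the sequence objects themselves. One possible variation, sketched here rather than taken from the answer, is to unwind the array and promote each matching element with $replaceRoot (available in MongoDB 3.4 and later). The helper names `build_motif_pipeline` and `select_sequences` are hypothetical; the second is only a rough pure-Python reference for checking the expected shape without a running server.

```python
def build_motif_pipeline(motif_short):
    """Return a pipeline that yields one output document per matching sequence."""
    return [
        # Skip documents that contain no matching sequence at all.
        {"$match": {"sequences.motif_short": motif_short}},
        # Emit one document per element of the sequences array.
        {"$unwind": "$sequences"},
        # Keep only the elements whose motif_short matches.
        {"$match": {"sequences.motif_short": motif_short}},
        # Promote the sequence sub-document to the top level.
        {"$replaceRoot": {"newRoot": "$sequences"}},
    ]


def select_sequences(documents, motif_short):
    """Rough pure-Python reference for the pipeline above (hypothetical helper)."""
    return [
        seq
        for doc in documents
        for seq in doc.get("sequences", [])
        if seq.get("motif_short") == motif_short
    ]


# Trimmed-down sample shaped like the data in the question.
sample = [
    {
        "position": 367,
        "sequences": [
            {"sequence": "GFRHQNSEG", "count": 50, "motif_short": "I"},
            {"sequence": "GFRHQNAEG", "count": 1, "motif_short": "Ma"},
        ],
    }
]

matches = select_sequences(sample, "I")
print(matches)

# With a live collection (col, as in the answer above):
# results = list(col.aggregate(build_motif_pipeline("I")))
```

The leading $match is optional; it only avoids unwinding documents that cannot contribute a result. If the parent wrapper is acceptable, the single $project/$filter stage from the answer is the simpler choice.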