
Extracting the first element from every list of tuples, row by row, in a pandas DataFrame

Stack Overflow user
Asked 2022-04-01 15:59:39
Answers: 1, Views: 101, Followers: 0, Votes: 0

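I have a list of DataFrames, and the ninth column in each of them holds a list of tuples. I am trying to extract the first element of every tuple in that list. If, after removing the second element from all the tuples, fewer than 5 elements remain in the list, I want to drop that row.

However, at the moment I cannot get just the first element of every tuple in the list. I have looked through various answers on Stack Overflow, but none of them solved it.

Part of my data and the code I have tried are shown below: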

>>> type(motifs[0])
<class 'pandas.core.frame.DataFrame'>
>>> len(motifs)
100

>>> motifs[0]
                                                   Enrichment  ...                                                   
                                                          AUC  ...                                        TargetGenes
TF     MotifID                                                 ...                                                   
Arid3a tfdimers__MD00454                             0.074115  ...  [(Hmgb1, 1.1106060045583808), (Slc44a2, 0.4323...
Atf1   dbcorrdb__JUND__ENCSR000EGN_1__m1             0.079926  ...  [(Coq8b, 0.4451942964830318), (Tagln2, 0.56984...
Atf3   taipale_cyt_meth__JDP2_NRTGAYGTCAYN_FL_meth   0.058592  ...  [(Map1lc3a, 3.488720958149637), (Ccl4, 0.55845...
       taipale_cyt_meth__XBP1_NRTGACGTCAYN_FL        0.059979  ...  [(Map1lc3a, 3.488720958149637), (Dusp1, 0.5584...
       dbcorrdb__JUND__ENCSR000EGN_1__m1             0.059945  ...  [(Kdm6b, 3.488720958149637), (Junb, 0.55845389...
...                                                       ...  ...                                                ...
Zmiz1  dbcorrdb__POLR2A__ENCSR000BMR_1__m1           0.084186  ...  [(Egr1, 0.2689079225312428), (Sumo1, 0.2982820...
       dbcorrdb__HCFC1__ENCSR000ECH_1__m3            0.088241  ...  [(Egr1, 0.2689079225312428), (Sumo1, 0.2982820...
       dbcorrdb__GABPA__ENCSR000BIW_1__m1            0.082741  ...  [(Egr1, 0.2689079225312428), (Vps52, 0.2982820...
       dbcorrdb__GABPA__ENCSR000BLO_1__m1            0.081011  ...  [(Vps52, 0.2689079225312428), (Egr1, 0.2982820...
       dbcorrdb__POLR2A__ENCSR000EAY_1__m1           0.083258  ...  [(Sumo1, 0.2689079225312428), (Leprotl1, 0.298...

[15263 rows x 8 columns]
>>> motifs[1]
                                             Enrichment  ...                                                   
                                                    AUC  ...                                        TargetGenes
TF       MotifID                                         ...                                                   
AU041133 transfac_pro__M06033                  0.061555  ...  [(Topors, 0.9542964293512636), (Tm9sf3, 0.8081...
Arid3a   tfdimers__MD00454                     0.055638  ...  [(Hmgb1, 1.0336516736519146), (Zfp771, 1.24306...
Atf1     tfdimers__MD00439                     0.078748  ...  [(Mef2c, 0.4349350423233438), (Hcfc1, 1.0), (M...
Atf3     dbcorrdb__JUN__ENCSR000EGH_1__m1      0.065025  ...  [(Smox, 0.7721842224335954), (Junb, 3.41419581...
         dbcorrdb__JUND__ENCSR000EGN_1__m1     0.074146  ...  [(Kdm6b, 0.7721842224335954), (Smox, 3.4141958...
...                                                 ...  ...                                                ...
Zmiz1    dbcorrdb__POLR2A__ENCSR000BMR_1__m1   0.085257  ...  [(Egr1, 0.000962868898130634), (Sumo1, 0.39039...
         dbcorrdb__HCFC1__ENCSR000ECH_1__m3    0.093355  ...  [(Lypla2, 0.000962868898130634), (Egr1, 0.3903...
         dbcorrdb__GABPA__ENCSR000BIW_1__m1    0.089414  ...  [(Egr1, 0.000962868898130634), (Vps52, 0.39039...
         dbcorrdb__GABPA__ENCSR000BLO_1__m1    0.085608  ...  [(Lypla2, 0.000962868898130634), (Mon1b, 0.390...
         dbcorrdb__POLR2A__ENCSR000EAY_1__m1   0.078761  ...  [(Sumo1, 0.000962868898130634), (Lypla2, 0.390...

[15442 rows x 8 columns]
# removing multi_index for the list of dataframes
>>> [df.reset_index(inplace=True) for df in motifs]
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
# The head of the columns containing list of tuples
>>> motifs[1][('Enrichment',           'TargetGenes')].head()
0    [(Topors, 0.9542964293512636), (Tm9sf3, 0.8081...
1    [(Hmgb1, 1.0336516736519146), (Zfp771, 1.24306...
2    [(Mef2c, 0.4349350423233438), (Hcfc1, 1.0), (M...
3    [(Smox, 0.7721842224335954), (Junb, 3.41419581...
4    [(Kdm6b, 0.7721842224335954), (Smox, 3.4141958...
Name: (Enrichment, TargetGenes), dtype: object
# If I try to get the first element from the list of tuples with the code below, I get the first tuple of each row instead:
>>> motifs[1][('Enrichment',           'TargetGenes')] = [ seq[0] for seq in motifs[1][('Enrichment',           'TargetGenes')] ]
>>> motifs[1][('Enrichment',           'TargetGenes')].head()
0    (Topors, 0.9542964293512636)
1     (Hmgb1, 1.0336516736519146)
2     (Mef2c, 0.4349350423233438)
3      (Smox, 0.7721842224335954)
4     (Kdm6b, 0.7721842224335954)
Name: (Enrichment, TargetGenes), dtype: object


# If I try another method using the same column for fifth dataframe then I get the following result as given below,

>>> motifs[5][('Enrichment',           'TargetGenes')] = [(tup[0],) for tup in motifs[5][('Enrichment',           'TargetGenes')] ]
>>> motifs[5][('Enrichment',           'TargetGenes')].head()
0    ((Tagln2, 2.9989559716809815),)
1     ((Kdm6b, 2.9989559716809815),)
2     ((Kdm6b, 2.9989559716809815),)
3      ((Junb, 2.9989559716809815),)
4     ((Kdm6b, 2.9989559716809815),)
Name: (Enrichment, TargetGenes), dtype: object
>>> 
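
Both attempts above appear to index one level too shallow: iterating over the column yields each row's whole list, so `seq[0]` is the first tuple of that row rather than the first element of every tuple. A minimal sketch with a made-up two-row Series (the gene names and weights here are illustrative only, not taken from the real data) shows the difference:

```python
import pandas as pd

# Hypothetical stand-in for the TargetGenes column:
# each cell is a list of (gene, weight) tuples.
target_genes = pd.Series([
    [("Hmgb1", 1.03), ("Zfp771", 1.24)],
    [("Smox", 0.77), ("Junb", 3.41), ("Kdm6b", 0.56)],
])

# Indexing the cell itself returns the first tuple of each row.
first_tuples = [cell[0] for cell in target_genes]

# A nested comprehension visits every tuple in the cell
# and keeps only its first element.
first_elements = [[gene for gene, _weight in cell] for cell in target_genes]

print(first_tuples)
print(first_elements)
```

The inner loop over the tuples of each cell is the piece missing from the two attempts.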

The desired output is as follows:

>>> motifs[5][('Enrichment',           'TargetGenes')].head()
0    ['Slc39a9', 'Arpc2', 'Arpc2', 'Arpc2', 'Phrf1']
1    ['Slc39a9', 'Arpc2', 'Arpc2', 'Slc39a9', 'Arpc2', 'Arpc2', 'Arpc2', 'Phrf1', 'Pafah1b1', 'Arpc2']
2    ['Supt16', 'Polr2m', 'Zfp668', 'Abl1', 'Thap1', 'Tia1', 'Cenpl'] 

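So, is it possible to extract, from the list of tuples in the column named TargetGenes of every DataFrame, a list containing only the first elements, as shown in the desired output?

Update 1

Here is the output of df.head(5).to_dict() for a few of the DataFrames: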

>>> motifs[9].head(5).to_dict()
{('TF', ''): {0: 'Arid3a', 1: 'Arnt', 2: 'Arnt', 3: 'Arnt', 4: 'Arnt'}, ('MotifID', ''): {0: 'tfdimers__MD00454', 1: 'taipale_cyt_meth__SREBF1_NTCACGTGAN_eDBD', 2: 'cisbp__M4597', 3: 'hocomoco__ATF3_HUMAN.H11MO.0.A', 4: 'cisbp__M4552'}, ('Enrichment', 'AUC'): {0: 0.06471430725162068, 1: 0.06095155535454042, 2: 0.07011658877330519, 3: 0.06705738981858385, 4: 0.06247801397055128}, ('Enrichment', 'Annotation'): {0: 'motif is annotated for orthologous gene ENSG00000116017 in H. sapiens (identity = 80%)', 1: "motif similar to transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 3.13e-05) which is directly annotated", 2: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000799)", 3: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000575)", 4: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000358)"}, ('Enrichment', 'Context'): {0: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 1: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 2: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 3: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 4: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'})}, ('Enrichment', 'MotifSimilarityQvalue'): {0: 0.0, 1: 3.1e-05, 2: 0.000799, 3: 0.000575, 4: 0.00035800000000000003}, ('Enrichment', 'NES'): {0: 3.326402558504723, 1: 3.1209030910033024, 2: 3.922071066278296, 3: 3.654648993653949, 4: 3.2543395666659647}, ('Enrichment', 'OrthologousIdentity'): {0: 0.8094439999999999, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}, ('Enrichment', 'RankAtMax'): {0: 1185, 1: 298, 2: 901, 3: 865, 4: 4637}, ('Enrichment', 'TargetGenes'): {0: [('Hmgb1', 0.745314226221018), ('Zfp771', 0.6764829824966149), ('Irgc1', 1.9951670755270587), ('Bcl11a', 0.4856052689262107), 
('Sh3kbp1', 0.5933072140052049), ('Traf3', 2.7600863350248512), ('Mars', 0.4505749371997108), ('Slc6a6', 1.0), ('Mlec', 0.39775865366894697), ('Rps6kb1', 0.40770958455266104), ('Slc12a4', 0.8671975714781245), ('Clic4', 0.7094675790094807), ('Lat2', 0.40522588119023456), ('Mcl1', 0.4268571683991914), ('Ptprj', 0.9892910773852126), ('Med27', 0.3965364187198045), ('Eif3a', 0.5472475711288725)], 1: [('Clcn6', 0.5838135470801639), ('Ptprs', 2.580731143355787), ('Erp29', 0.4427625162377926), ('Lin52', 0.4446103752969262), ('Smndc1', 0.5501206802490346), ('Scarb1', 1.038675980787723), ('Rnf146', 0.8398798839169821)], 2: [('Ptprs', 0.5838135470801639), ('Clcn6', 2.580731143355787), ('Pde7a', 0.4427625162377926), ('Smndc1', 0.4446103752969262), ('Ppp2r2a', 0.5501206802490346), ('Gzf1', 1.038675980787723), ('Paf1', 0.8398798839169821), ('Erp29', 0.9122832235342808), ('Ywhah', 1.0), ('Lin52', 0.6065115546339283), ('Atg10', 0.7179666115646837), ('Rnf146', 0.4719188766630129), ('Hlx', 0.4350102779899021), ('Mafk', 0.7611670711498808), ('Atg5', 1.5656437019255856)], 3: [('Ptprs', 0.5838135470801639), ('Clcn6', 2.580731143355787), ('Pde7a', 0.4427625162377926), ('Smndc1', 0.4446103752969262), ('Gzf1', 0.5501206802490346), ('Atg10', 1.038675980787723), ('Erp29', 0.8398798839169821), ('Paf1', 0.9122832235342808), ('Mff', 1.0), ('Ppp2r2a', 0.6065115546339283), ('Atg5', 0.7179666115646837), ('Rab1a', 0.4719188766630129), ('Rnf146', 0.4350102779899021), ('Mafk', 0.7611670711498808), ('Lin52', 1.5656437019255856), ('Hlx', 0.5914337023692341)], 4: [('Clcn6', 0.5838135470801639), ('Ptprs', 2.580731143355787), ('Lin52', 0.4427625162377926), ('Erp29', 0.4446103752969262), ('Smndc1', 0.5501206802490346), ('Rnf146', 1.038675980787723), ('Mff', 0.8398798839169821), ('Pde7a', 0.9122832235342808), ('Atg5', 1.0), ('Atg10', 0.6065115546339283), ('Hlx', 0.7179666115646837), ('Mlx', 0.4719188766630129), ('Ppp2r2a', 0.4350102779899021), ('Atp1a1', 0.7611670711498808), ('Mcmbp', 1.5656437019255856), 
('Paf1', 0.5914337023692341), ('Mafk', 1.8757251159707784), ('Ywhah', 0.4148168160950648), ('Ykt6', 0.8740363421300391), ('Gzf1', 1.6749018097542459), ('Itpr1', 0.6244407603393514), ('Sec24c', 0.8125260569274086), ('Atp1b1', 1.3433579468658023), ('Cracr2a', 1.9825295293378795), ('Rabl6', 1.6060242452401532), ('Glo1', 4.075255658782804), ('Kat7', 2.1993521341931785), ('Mxd4', 1.546869996844828), ('Rab1a', 4.052034183647333), ('Taok3', 1.4156879591756044), ('Lonp2', 3.866232617909616), ('Bmp2k', 0.5805201605958586), ('Kcnn4', 0.7230752540573253), ('Nrip1', 0.4565406766743578), ('Hexb', 0.8850971245380614), ('Slc31a1', 5.410182658990805), ('Oat', 2.4192511357615585)]}}
>>> motifs[10].head(5).to_dict()
/*
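* Note: the hosting site automatically wrapped this overly long line in a comment and did not highlight it; the comment markers are not part of the output.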
* {('TF', ''): {0: 'Atf3', 1: 'Atf3', 2: 'Atf3', 3: 'Atf3', 4: 'Atf3'}, ('MotifID', ''): {0: 'dbcorrdb__JUN__ENCSR000EGH_1__m1', 1: 'dbcorrdb__JUND__ENCSR000EGN_1__m1', 2: 'cisbp__M5050', 3: 'dbcorrdb__eGFP-JUNB__ENCSR000DJY_1__m1', 4: 'dbcorrdb__FOSL1__ENCSR000BMV_1__m1'}, ('Enrichment', 'AUC'): {0: 0.06847185815248727, 1: 0.07298037887028418, 2: 0.05903279302412667, 3: 0.07423158995940253, 4: 0.07630307245136325}, ('Enrichment', 'Annotation'): {0: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 0.000773)", 1: "gene is orthologous to ENSG00000162772 in H. sapiens (identity = 95%) which is annotated for similar motif homer__DATGASTCATHN_Atf3 ('Atf3(bZIP)/GBM-ATF3-ChIP-Seq(GSE33912)/Homer'; q-value = 4.47e-05)", 2: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 0.000608)", 3: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 6.26e-06)", 4: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 3.66e-06)"}, ('Enrichment', 'Context'): {0: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 1: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 2: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 3: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 4: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'})}, ('Enrichment', 'MotifSimilarityQvalue'): {0: 0.000773, 1: 4.5e-05, 2: 0.000608, 3: 6e-06, 4: 4e-06}, ('Enrichment', 'NES'): {0: 4.024298594227186, 1: 4.467018476489827, 2: 3.0974176805382267, 3: 4.589882728587765, 4: 4.793294566384112}, ('Enrichment', 'OrthologousIdentity'): {0: 1.0, 1: 0.950276, 2: 1.0, 3: 1.0, 4: 1.0}, ('Enrichment', 'RankAtMax'): {0: 481, 1: 1112, 2: 829, 3: 634, 4: 762}, ('Enrichment', 'TargetGenes'): {0: 
[('Tagln2', 1.6868254779790988), ('Junb', 2.131165507779861), ('Pim1', 0.5626962771519949), ('Mir155hg', 4.215511908233003), ('Kdm6b', 1.3831692473783712), ('Vcpip1', 0.4655884981482655), ('Ptp4a2', 0.4608012609224432), ('Lgals3', 1.2986893071734795), ('Dusp1', 5.525777129178691), ('Akt3', 2.302534028919806), ('Isg20', 1.565796075237834), ('Sec11c', 2.5799875669226298), ('Gpx1', 0.7797457421907137), ('Pmepa1', 1.0), ('Diaph2', 0.4567503652363437), ('Gadd45b', 0.4041840201626749), ('Traf1', 2.0641638640138207), ('Tnfaip8', 0.4166028876535105), ('Fam110a', 0.5565365664603831), ('Smim3', 4.4918400769026645)], 1: [('Kdm6b', 1.6868254779790988), ('Junb', 2.131165507779861), ('Tagln2', 0.5626962771519949), ('Dusp1', 4.215511908233003), ('Mir155hg', 1.3831692473783712), ('Sec11c', 0.4655884981482655), ('Ccnd2', 0.4608012609224432), ('Lgals3', 1.2986893071734795), ('Bach1', 5.525777129178691), ('Vcpip1', 2.302534028919806), ('Pim1', 1.565796075237834), ('Cdkn1a', 2.5799875669226298), ('Gadd45b', 0.7797457421907137), ('Akt3', 1.0), ('Diaph2', 0.4567503652363437), ('Zfp710', 0.4041840201626749), ('Ncoa3', 2.0641638640138207), ('Ptp4a2', 0.4166028876535105), ('Atf3', 0.5565365664603831), ('Traf1', 4.4918400769026645), ('Pkib', 0.6208941839583779), ('Isg20', 7.928134177072506), ('Abr', 21.31142622147593), ('Tnfaip8', 6.271477001021822), ('Ccr9', 1.7224099621172309), ('Klf6', 2.934167135195324), ('Cdc42ep4', 0.5109519744748661), ('Ncf2', 18.859900155945674), ('Psap', 0.7982368206818751), ('Txndc5', 24.13078778816305), ('Rps6ka1', 9.17079179660625), ('Sipa1l1', 2.302124705475682), ('Smim3', 6.291659684538216), ('Tgif1', 3.5504062994628045)], 2: [('Junb', 1.6868254779790988), ('Oser1', 2.131165507779861), ('Tagln2', 0.5626962771519949), ('Lgals3', 4.215511908233003), ('Bach1', 1.3831692473783712), ('Csrnp1', 0.4655884981482655), ('Kdm6b', 0.4608012609224432), ('Vcpip1', 1.2986893071734795), ('Gpx1', 5.525777129178691), ('Akt3', 2.302534028919806), ('Pim1', 1.565796075237834), 
('Cdkn1a', 2.5799875669226298), ('Prnp', 0.7797457421907137), ('Klf6', 1.0), ('Ptp4a2', 0.4567503652363437), ('Rab8b', 0.4041840201626749), ('Pfn1', 2.0641638640138207), ('Mir155hg', 0.4166028876535105), ('Pmepa1', 0.5565365664603831), ('Dusp1', 4.4918400769026645), ('Abr', 0.6208941839583779), ('Fyb', 7.928134177072506), ('Tgif1', 21.31142622147593), ('Isg20', 6.271477001021822)], 3: [('Kdm6b', 1.6868254779790988), ('Junb', 2.131165507779861), ('Ptp4a2', 0.5626962771519949), ('Sec11c', 4.215511908233003), ('Lgals3', 1.3831692473783712), ('Pim1', 0.4655884981482655), ('Tagln2', 0.4608012609224432), ('Diaph2', 1.2986893071734795), ('Vcpip1', 5.525777129178691), ('Akt3', 2.302534028919806), ('Cdkn1a', 1.565796075237834), ('Mir155hg', 2.5799875669226298), ('Isg20', 0.7797457421907137), ('Gpx1', 1.0), ('Bach1', 0.4567503652363437), ('Txndc5', 0.4041840201626749), ('Ncf2', 2.0641638640138207), ('Dusp1', 0.4166028876535105), ('Pmepa1', 0.5565365664603831), ('Oser1', 4.4918400769026645), ('Fam110a', 0.6208941839583779), ('Rps6ka1', 7.928134177072506), ('Klf6', 21.31142622147593), ('Zfp710', 6.271477001021822), ('Bhlhe40', 1.7224099621172309), ('Tgif1', 2.934167135195324)], 4: [('Junb', 1.6868254779790988), ('Ptp4a2', 2.131165507779861), ('Pim1', 0.5626962771519949), ('Kdm6b', 4.215511908233003), ('Sec11c', 1.3831692473783712), ('Vcpip1', 0.4655884981482655), ('Diaph2', 0.4608012609224432), ('Mir155hg', 1.2986893071734795), ('Lgals3', 5.525777129178691), ('Bach1', 2.302534028919806), ('Akt3', 1.565796075237834), ('Tagln2', 2.5799875669226298), ('Isg20', 0.7797457421907137), ('Cdkn1a', 1.0), ('Bhlhe40', 0.4567503652363437), ('Gadd45b', 0.4041840201626749), ('Pmepa1', 2.0641638640138207), ('Gpx1', 0.4166028876535105), ('Txndc5', 0.5565365664603831), ('Ncf2', 4.4918400769026645), ('Csrnp1', 0.6208941839583779), ('Sipa1l1', 7.928134177072506), ('Klf6', 21.31142622147593), ('Zfp710', 6.271477001021822), ('Fam110a', 1.7224099621172309), ('Atf3', 2.934167135195324), ('Smim3', 
0.5109519744748661), ('Ncoa3', 18.859900155945674)]}}
*/
>>> motifs[11].head(5).to_dict()
{('TF', ''): {0: 'Arid3a', 1: 'Arid3a', 2: 'Arid3a', 3: 'Arnt', 4: 'Arnt'}, ('MotifID', ''): {0: 'cisbp__M1879', 1: 'swissregulon__hs__FOXA2.p3', 2: 'homer__AAAGTAAACA_FOXA1_GSE26831', 3: 'cisbp__M5633', 4: 'cisbp__M5866'}, ('Enrichment', 'AUC'): {0: 0.0668223211428239, 1: 0.06646591603386576, 2: 0.06737511274039161, 3: 0.06968363311646894, 4: 0.06836969001148106}, ('Enrichment', 'Annotation'): {0: "gene is orthologous to ENSG00000116017 in H. sapiens (identity = 79%) which is annotated for similar motif dbcorrdb__ARID3A__ENCSR000EDP_1__m1 ('ARID3A (ENCSR000EDP-1, motif 1)'; q-value = 0.00023)", 1: "gene is orthologous to ENSG00000116017 in H. sapiens (identity = 79%) which is annotated for similar motif dbcorrdb__ARID3A__ENCSR000EDP_1__m1 ('ARID3A (ENCSR000EDP-1, motif 1)'; q-value = 0.00023)", 2: "motif similar to dbcorrdb__ARID3A__ENCSR000EDP_1__m1 ('ARID3A (ENCSR000EDP-1, motif 1)'; q-value = 8.23e-06) which is annotated for orthologous gene ENSG00000116017 in H. sapiens (identity = 80%)", 3: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000417)", 4: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000133)"}, ('Enrichment', 'Context'): {0: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 1: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 2: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 3: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 4: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'})}, ('Enrichment', 'MotifSimilarityQvalue'): {0: 0.00023, 1: 0.00023, 2: 8e-06, 3: 0.000417, 4: 0.000133}, ('Enrichment', 'NES'): {0: 3.048011522275345, 1: 3.020345923942252, 2: 3.090921429894018, 3: 3.726068046226016, 4: 3.6125945303641087}, ('Enrichment', 'OrthologousIdentity'): {0: 0.798669, 1: 0.798669, 2: 
0.8094439999999999, 3: 1.0, 4: 1.0}, ('Enrichment', 'RankAtMax'): {0: 414, 1: 352, 2: 398, 3: 945, 4: 499}, ('Enrichment', 'TargetGenes'): {0: [('Arid3a', 1.0814455429211889), ('Pogz', 0.6244276987659271), ('Ago4', 0.9664526956346918), ('Taf1b', 0.44722261016464504), ('Itpr1', 0.8313950646937135), ('Hmgb1', 1.9945139689034008), ('Sh3kbp1', 1.0), ('Cd180', 0.6042623259696077)], 1: [('Arid3a', 1.0814455429211889), ('Pogz', 0.6244276987659271), ('Ago4', 0.9664526956346918), ('Taf1b', 0.44722261016464504), ('Itpr1', 0.8313950646937135), ('Hmgb1', 1.9945139689034008), ('Sh3kbp1', 1.0), ('Cd180', 0.6042623259696077)], 2: [('Pogz', 1.0814455429211889), ('Itpr1', 0.6244276987659271), ('Arid3a', 0.9664526956346918), ('Sh3kbp1', 0.44722261016464504), ('Hmgb1', 0.8313950646937135), ('Clpx', 1.9945139689034008), ('Med27', 1.0), ('Fgfr1op2', 0.6042623259696077)], 3: [('Clcn6', 0.8095717882553205), ('Kansl1', 1.5834902996396047), ('Lin52', 1.7464790457683428), ('Ptprs', 1.18907063271503), ('Asna1', 0.4109458104482189), ('Ccdc91', 1.064875281051844), ('Erp29', 1.2944573975829907), ('Zfas1', 1.0), ('Rnf146', 0.5426386495200634), ('Smndc1', 1.3368937988306546), ('Pabpc1', 1.4072285212815487), ('Cracr2a', 2.3364193374078432), ('Mafk', 0.6752603264576597), ('Mcmbp', 0.3974384266129632)], 4: [('Clcn6', 0.8095717882553205), ('Lin52', 1.5834902996396047), ('Ptprs', 1.7464790457683428), ('Kansl1', 1.18907063271503), ('Asna1', 0.4109458104482189), ('Smndc1', 1.064875281051844), ('Rnf146', 1.2944573975829907), ('Dnajc13', 1.0), ('Ccdc91', 0.5426386495200634), ('Erp29', 1.3368937988306546)]}}
>>>

Thanks,

1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-04-03 17:16:16

IIUC, you can use a list comprehension:

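for df in motifs:
    # After reset_index, position 9 is ('Enrichment', 'TargetGenes');
    # keep only the first item (the gene name) of every tuple in each row's list.
    df['first_elements'] = df.iloc[:, 9].apply(lambda li: [x[0] for x in li])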

Output:

[       TF                                   MotifID Enrichment  \
                                                           AUC   
0  Arid3a                         tfdimers__MD00454   0.064714   
1    Arnt  taipale_cyt_meth__SREBF1_NTCACGTGAN_eDBD   0.060952   
2    Arnt                              cisbp__M4597   0.070117   
3    Arnt            hocomoco__ATF3_HUMAN.H11MO.0.A   0.067057   
4    Arnt                              cisbp__M4552   0.062478   

                                                      \
                                          Annotation   
0  motif is annotated for orthologous gene ENSG00...   
1  motif similar to transfac_public__M00539 ('V$A...   
2  gene is annotated for similar motif transfac_p...   
3  gene is annotated for similar motif transfac_p...   
4  gene is annotated for similar motif transfac_p...   

                                                                            \
                                             Context MotifSimilarityQvalue   
0  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000000   
1  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000031   
2  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000799   
3  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000575   
4  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000358   

                                           \
        NES OrthologousIdentity RankAtMax   
0  3.326403            0.809444      1185   
1  3.120903            1.000000       298   
2  3.922071            1.000000       901   
3  3.654649            1.000000       865   
4  3.254340            1.000000      4637   

                                                      \
                                         TargetGenes   
0  [(Hmgb1, 0.745314226221018), (Zfp771, 0.676482...   
1  [(Clcn6, 0.5838135470801639), (Ptprs, 2.580731...   
2  [(Ptprs, 0.5838135470801639), (Clcn6, 2.580731...   
3  [(Ptprs, 0.5838135470801639), (Clcn6, 2.580731...   
4  [(Clcn6, 0.5838135470801639), (Ptprs, 2.580731...   

                                      first_elements  
                                                      
0  [Hmgb1, Zfp771, Irgc1, Bcl11a, Sh3kbp1, Traf3,...  
1  [Clcn6, Ptprs, Erp29, Lin52, Smndc1, Scarb1, R...  
2  [Ptprs, Clcn6, Pde7a, Smndc1, Ppp2r2a, Gzf1, P...  
3  [Ptprs, Clcn6, Pde7a, Smndc1, Gzf1, Atg10, Erp...  
4  [Clcn6, Ptprs, Lin52, Erp29, Smndc1, Rnf146, M...  ,      TF                                 MotifID Enrichment  \
                                                       AUC   
0  Atf3        dbcorrdb__JUN__ENCSR000EGH_1__m1   0.068472   
1  Atf3       dbcorrdb__JUND__ENCSR000EGN_1__m1   0.072980   
2  Atf3                            cisbp__M5050   0.059033   
3  Atf3  dbcorrdb__eGFP-JUNB__ENCSR000DJY_1__m1   0.074232   
4  Atf3      dbcorrdb__FOSL1__ENCSR000BMV_1__m1   0.076303   

                                                      \
                                          Annotation   
0  gene is annotated for similar motif hocomoco__...   
1  gene is orthologous to ENSG00000162772 in H. s...   
2  gene is annotated for similar motif hocomoco__...   
3  gene is annotated for similar motif hocomoco__...   
4  gene is annotated for similar motif hocomoco__...   

                                                                            \
                                             Context MotifSimilarityQvalue   
0  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000773   
1  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000045   
2  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000608   
3  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000006   
4  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000004   

                                           \
        NES OrthologousIdentity RankAtMax   
0  4.024299            1.000000       481   
1  4.467018            0.950276      1112   
2  3.097418            1.000000       829   
3  4.589883            1.000000       634   
4  4.793295            1.000000       762   

                                                      \
                                         TargetGenes   
0  [(Tagln2, 1.6868254779790988), (Junb, 2.131165...   
1  [(Kdm6b, 1.6868254779790988), (Junb, 2.1311655...   
2  [(Junb, 1.6868254779790988), (Oser1, 2.1311655...   
3  [(Kdm6b, 1.6868254779790988), (Junb, 2.1311655...   
4  [(Junb, 1.6868254779790988), (Ptp4a2, 2.131165...   

                                      first_elements  
                                                      
0  [Tagln2, Junb, Pim1, Mir155hg, Kdm6b, Vcpip1, ...  
1  [Kdm6b, Junb, Tagln2, Dusp1, Mir155hg, Sec11c,...  
2  [Junb, Oser1, Tagln2, Lgals3, Bach1, Csrnp1, K...  
3  [Kdm6b, Junb, Ptp4a2, Sec11c, Lgals3, Pim1, Ta...  
4  [Junb, Ptp4a2, Pim1, Kdm6b, Sec11c, Vcpip1, Di...  ,        TF                           MotifID Enrichment  \
                                                   AUC   
0  Arid3a                      cisbp__M1879   0.066822   
1  Arid3a        swissregulon__hs__FOXA2.p3   0.066466   
2  Arid3a  homer__AAAGTAAACA_FOXA1_GSE26831   0.067375   
3    Arnt                      cisbp__M5633   0.069684   
4    Arnt                      cisbp__M5866   0.068370   

                                                      \
                                          Annotation   
0  gene is orthologous to ENSG00000116017 in H. s...   
1  gene is orthologous to ENSG00000116017 in H. s...   
2  motif similar to dbcorrdb__ARID3A__ENCSR000EDP...   
3  gene is annotated for similar motif transfac_p...   
4  gene is annotated for similar motif transfac_p...   

                                                                            \
                                             Context MotifSimilarityQvalue   
0  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000230   
1  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000230   
2  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000008   
3  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000417   
4  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000133   

                                           \
        NES OrthologousIdentity RankAtMax   
0  3.048012            0.798669       414   
1  3.020346            0.798669       352   
2  3.090921            0.809444       398   
3  3.726068            1.000000       945   
4  3.612595            1.000000       499   

                                                      \
                                         TargetGenes   
0  [(Arid3a, 1.0814455429211889), (Pogz, 0.624427...   
1  [(Arid3a, 1.0814455429211889), (Pogz, 0.624427...   
2  [(Pogz, 1.0814455429211889), (Itpr1, 0.6244276...   
3  [(Clcn6, 0.8095717882553205), (Kansl1, 1.58349...   
4  [(Clcn6, 0.8095717882553205), (Lin52, 1.583490...   

                                      first_elements  
                                                      
0  [Arid3a, Pogz, Ago4, Taf1b, Itpr1, Hmgb1, Sh3k...  
1  [Arid3a, Pogz, Ago4, Taf1b, Itpr1, Hmgb1, Sh3k...  
2  [Pogz, Itpr1, Arid3a, Sh3kbp1, Hmgb1, Clpx, Me...  
3  [Clcn6, Kansl1, Lin52, Ptprs, Asna1, Ccdc91, E...  
4  [Clcn6, Lin52, Ptprs, Kansl1, Asna1, Smndc1, R...  ]
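
The question also asks to drop rows whose extracted list has fewer than 5 elements, which the accepted answer's loop does not cover. A hedged sketch of both steps follows. It assumes the index has already been reset, selects the column by its `('Enrichment', 'TargetGenes')` label rather than by position, and runs on a tiny made-up frame. The helper name `extract_and_filter`, the `min_genes` parameter, and the `motifs_demo` list are hypothetical and not part of the original post.

```python
import pandas as pd

TARGET_COL = ("Enrichment", "TargetGenes")  # column label as shown in the question


def extract_and_filter(df, min_genes=5):
    """Replace each list of (gene, weight) tuples with a list of gene names,
    then drop rows that have fewer than `min_genes` names."""
    out = df.copy()
    out[TARGET_COL] = out[TARGET_COL].apply(
        lambda pairs: [gene for gene, _weight in pairs]
    )
    keep = out[TARGET_COL].map(len) >= min_genes
    return out[keep].reset_index(drop=True)


# Made-up example with the same two-level column layout as the question's frames.
columns = pd.MultiIndex.from_tuples([("TF", ""), TARGET_COL])
demo = pd.DataFrame(
    [
        ["Arnt", [("Clcn6", 0.58), ("Ptprs", 2.58), ("Erp29", 0.44)]],
        ["Atf3", [("Junb", 1.69), ("Pim1", 0.56), ("Kdm6b", 1.38),
                  ("Dusp1", 5.53), ("Akt3", 2.30)]],
    ],
    columns=columns,
)

motifs_demo = [demo]  # stands in for the question's list of DataFrames
filtered = [extract_and_filter(df) for df in motifs_demo]
print(filtered[0][TARGET_COL].tolist())
```

Returning a new frame instead of modifying in place keeps the original tuples available in case the weights are needed later; whether that trade-off suits the real 100-frame list is a judgment call.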
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/71709787
