Q: Extracting the first element from every list of tuples, row-wise, in a pandas DataFrame

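Stack Overflow user

Asked 2022-04-01 15:59:39

Answers: 1, Views: 101, Followers: 0, Votes: 0

I have a list of DataFrames, and in every one of them the ninth column holds a list of tuples. I am trying to extract the first element of each tuple in that list. If, once the second element of every tuple has been dropped, fewer than 5 elements remain, I want to delete that row.

At the moment, however, I cannot even get just the first element of every tuple in the list. I have gone through various answers on Stack Overflow without finding a solution.

Part of the data and the code I have tried are shown below: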

>>> type(motifs[0])
<class 'pandas.core.frame.DataFrame'>
>>> len(motifs)
100

>>> motifs[0]
                                                   Enrichment  ...                                                   
                                                          AUC  ...                                        TargetGenes
TF     MotifID                                                 ...                                                   
Arid3a tfdimers__MD00454                             0.074115  ...  [(Hmgb1, 1.1106060045583808), (Slc44a2, 0.4323...
Atf1   dbcorrdb__JUND__ENCSR000EGN_1__m1             0.079926  ...  [(Coq8b, 0.4451942964830318), (Tagln2, 0.56984...
Atf3   taipale_cyt_meth__JDP2_NRTGAYGTCAYN_FL_meth   0.058592  ...  [(Map1lc3a, 3.488720958149637), (Ccl4, 0.55845...
       taipale_cyt_meth__XBP1_NRTGACGTCAYN_FL        0.059979  ...  [(Map1lc3a, 3.488720958149637), (Dusp1, 0.5584...
       dbcorrdb__JUND__ENCSR000EGN_1__m1             0.059945  ...  [(Kdm6b, 3.488720958149637), (Junb, 0.55845389...
...                                                       ...  ...                                                ...
Zmiz1  dbcorrdb__POLR2A__ENCSR000BMR_1__m1           0.084186  ...  [(Egr1, 0.2689079225312428), (Sumo1, 0.2982820...
       dbcorrdb__HCFC1__ENCSR000ECH_1__m3            0.088241  ...  [(Egr1, 0.2689079225312428), (Sumo1, 0.2982820...
       dbcorrdb__GABPA__ENCSR000BIW_1__m1            0.082741  ...  [(Egr1, 0.2689079225312428), (Vps52, 0.2982820...
       dbcorrdb__GABPA__ENCSR000BLO_1__m1            0.081011  ...  [(Vps52, 0.2689079225312428), (Egr1, 0.2982820...
       dbcorrdb__POLR2A__ENCSR000EAY_1__m1           0.083258  ...  [(Sumo1, 0.2689079225312428), (Leprotl1, 0.298...

[15263 rows x 8 columns]
>>> motifs[1]
                                             Enrichment  ...                                                   
                                                    AUC  ...                                        TargetGenes
TF       MotifID                                         ...                                                   
AU041133 transfac_pro__M06033                  0.061555  ...  [(Topors, 0.9542964293512636), (Tm9sf3, 0.8081...
Arid3a   tfdimers__MD00454                     0.055638  ...  [(Hmgb1, 1.0336516736519146), (Zfp771, 1.24306...
Atf1     tfdimers__MD00439                     0.078748  ...  [(Mef2c, 0.4349350423233438), (Hcfc1, 1.0), (M...
Atf3     dbcorrdb__JUN__ENCSR000EGH_1__m1      0.065025  ...  [(Smox, 0.7721842224335954), (Junb, 3.41419581...
         dbcorrdb__JUND__ENCSR000EGN_1__m1     0.074146  ...  [(Kdm6b, 0.7721842224335954), (Smox, 3.4141958...
...                                                 ...  ...                                                ...
Zmiz1    dbcorrdb__POLR2A__ENCSR000BMR_1__m1   0.085257  ...  [(Egr1, 0.000962868898130634), (Sumo1, 0.39039...
         dbcorrdb__HCFC1__ENCSR000ECH_1__m3    0.093355  ...  [(Lypla2, 0.000962868898130634), (Egr1, 0.3903...
         dbcorrdb__GABPA__ENCSR000BIW_1__m1    0.089414  ...  [(Egr1, 0.000962868898130634), (Vps52, 0.39039...
         dbcorrdb__GABPA__ENCSR000BLO_1__m1    0.085608  ...  [(Lypla2, 0.000962868898130634), (Mon1b, 0.390...
         dbcorrdb__POLR2A__ENCSR000EAY_1__m1   0.078761  ...  [(Sumo1, 0.000962868898130634), (Lypla2, 0.390...

[15442 rows x 8 columns]
# removing multi_index for the list of dataframes
>>> [df.reset_index(inplace=True) for df in motifs]
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
# The head of the columns containing list of tuples
>>> motifs[1][('Enrichment',           'TargetGenes')].head()
0    [(Topors, 0.9542964293512636), (Tm9sf3, 0.8081...
1    [(Hmgb1, 1.0336516736519146), (Zfp771, 1.24306...
2    [(Mef2c, 0.4349350423233438), (Hcfc1, 1.0), (M...
3    [(Smox, 0.7721842224335954), (Junb, 3.41419581...
4    [(Kdm6b, 0.7721842224335954), (Smox, 3.4141958...
Name: (Enrichment, TargetGenes), dtype: object
# If I try to get the first element of each tuple with the code below, I get the first tuple of each list instead:
>>> motifs[1][('Enrichment',           'TargetGenes')] = [ seq[0] for seq in motifs[1][('Enrichment',           'TargetGenes')] ]
>>> motifs[1][('Enrichment',           'TargetGenes')].head()
0    (Topors, 0.9542964293512636)
1     (Hmgb1, 1.0336516736519146)
2     (Mef2c, 0.4349350423233438)
3      (Smox, 0.7721842224335954)
4     (Kdm6b, 0.7721842224335954)
Name: (Enrichment, TargetGenes), dtype: object


# If I try another method on the same column of the fifth DataFrame, I get the following result:

>>> motifs[5][('Enrichment',           'TargetGenes')] = [(tup[0],) for tup in motifs[5][('Enrichment',           'TargetGenes')] ]
>>> motifs[5][('Enrichment',           'TargetGenes')].head()
0    ((Tagln2, 2.9989559716809815),)
1     ((Kdm6b, 2.9989559716809815),)
2     ((Kdm6b, 2.9989559716809815),)
3      ((Junb, 2.9989559716809815),)
4     ((Kdm6b, 2.9989559716809815),)
Name: (Enrichment, TargetGenes), dtype: object
>>>
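
For what it is worth, both attempts above index one level too shallow: iterating over the column yields one list per row, so `seq[0]` (or `tup[0]`) is the first tuple of that row's list, not the first element of each tuple. A minimal sketch with made-up gene names and scores (the `rows` variable is hypothetical and stands in for the TargetGenes column) shows the difference and the nested comprehension that reaches inside each tuple:

```python
# Hypothetical stand-in for the TargetGenes column:
# one list of (gene, score) tuples per row.
rows = [
    [("GeneA", 1.0), ("GeneB", 0.5)],
    [("GeneC", 2.0), ("GeneD", 0.3), ("GeneE", 0.7)],
]

# What the attempts do: take element 0 of each row's list, i.e. the first tuple.
first_tuples = [seq[0] for seq in rows]

# What is wanted: for every row, element 0 of every tuple in that row's list.
first_elements = [[tup[0] for tup in seq] for seq in rows]

print(first_tuples)    # [('GeneA', 1.0), ('GeneC', 2.0)]
print(first_elements)  # [['GeneA', 'GeneB'], ['GeneC', 'GeneD', 'GeneE']]
```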

The desired output is as follows:

>>> motifs[5][('Enrichment',           'TargetGenes')].head()
0    ['Slc39a9', 'Arpc2', 'Arpc2', 'Arpc2', 'Phrf1']
1    ['Slc39a9', 'Arpc2', 'Arpc2', 'Slc39a9', 'Arpc2', 'Arpc2', 'Arpc2', 'Phrf1', 'Pafah1b1', 'Arpc2']
2    ['Supt16', 'Polr2m', 'Zfp668', 'Abl1', 'Thap1', 'Tia1', 'Cenpl']

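So, is it possible to extract, from the column named TargetGenes in every DataFrame, a list of the first elements of the tuples, as shown in the desired output?

Update 1

Here is the output of df.head(5).to_dict() for a few of the DataFrames: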

>>> motifs[9].head(5).to_dict()
{('TF', ''): {0: 'Arid3a', 1: 'Arnt', 2: 'Arnt', 3: 'Arnt', 4: 'Arnt'}, ('MotifID', ''): {0: 'tfdimers__MD00454', 1: 'taipale_cyt_meth__SREBF1_NTCACGTGAN_eDBD', 2: 'cisbp__M4597', 3: 'hocomoco__ATF3_HUMAN.H11MO.0.A', 4: 'cisbp__M4552'}, ('Enrichment', 'AUC'): {0: 0.06471430725162068, 1: 0.06095155535454042, 2: 0.07011658877330519, 3: 0.06705738981858385, 4: 0.06247801397055128}, ('Enrichment', 'Annotation'): {0: 'motif is annotated for orthologous gene ENSG00000116017 in H. sapiens (identity = 80%)', 1: "motif similar to transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 3.13e-05) which is directly annotated", 2: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000799)", 3: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000575)", 4: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000358)"}, ('Enrichment', 'Context'): {0: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 1: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 2: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 3: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 4: frozenset({'weight>75.0%', 'activating', 'mm10__refseq-r80__10kb_up_and_down_tss'})}, ('Enrichment', 'MotifSimilarityQvalue'): {0: 0.0, 1: 3.1e-05, 2: 0.000799, 3: 0.000575, 4: 0.00035800000000000003}, ('Enrichment', 'NES'): {0: 3.326402558504723, 1: 3.1209030910033024, 2: 3.922071066278296, 3: 3.654648993653949, 4: 3.2543395666659647}, ('Enrichment', 'OrthologousIdentity'): {0: 0.8094439999999999, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}, ('Enrichment', 'RankAtMax'): {0: 1185, 1: 298, 2: 901, 3: 865, 4: 4637}, ('Enrichment', 'TargetGenes'): {0: [('Hmgb1', 0.745314226221018), ('Zfp771', 0.6764829824966149), ('Irgc1', 1.9951670755270587), ('Bcl11a', 0.4856052689262107), 
('Sh3kbp1', 0.5933072140052049), ('Traf3', 2.7600863350248512), ('Mars', 0.4505749371997108), ('Slc6a6', 1.0), ('Mlec', 0.39775865366894697), ('Rps6kb1', 0.40770958455266104), ('Slc12a4', 0.8671975714781245), ('Clic4', 0.7094675790094807), ('Lat2', 0.40522588119023456), ('Mcl1', 0.4268571683991914), ('Ptprj', 0.9892910773852126), ('Med27', 0.3965364187198045), ('Eif3a', 0.5472475711288725)], 1: [('Clcn6', 0.5838135470801639), ('Ptprs', 2.580731143355787), ('Erp29', 0.4427625162377926), ('Lin52', 0.4446103752969262), ('Smndc1', 0.5501206802490346), ('Scarb1', 1.038675980787723), ('Rnf146', 0.8398798839169821)], 2: [('Ptprs', 0.5838135470801639), ('Clcn6', 2.580731143355787), ('Pde7a', 0.4427625162377926), ('Smndc1', 0.4446103752969262), ('Ppp2r2a', 0.5501206802490346), ('Gzf1', 1.038675980787723), ('Paf1', 0.8398798839169821), ('Erp29', 0.9122832235342808), ('Ywhah', 1.0), ('Lin52', 0.6065115546339283), ('Atg10', 0.7179666115646837), ('Rnf146', 0.4719188766630129), ('Hlx', 0.4350102779899021), ('Mafk', 0.7611670711498808), ('Atg5', 1.5656437019255856)], 3: [('Ptprs', 0.5838135470801639), ('Clcn6', 2.580731143355787), ('Pde7a', 0.4427625162377926), ('Smndc1', 0.4446103752969262), ('Gzf1', 0.5501206802490346), ('Atg10', 1.038675980787723), ('Erp29', 0.8398798839169821), ('Paf1', 0.9122832235342808), ('Mff', 1.0), ('Ppp2r2a', 0.6065115546339283), ('Atg5', 0.7179666115646837), ('Rab1a', 0.4719188766630129), ('Rnf146', 0.4350102779899021), ('Mafk', 0.7611670711498808), ('Lin52', 1.5656437019255856), ('Hlx', 0.5914337023692341)], 4: [('Clcn6', 0.5838135470801639), ('Ptprs', 2.580731143355787), ('Lin52', 0.4427625162377926), ('Erp29', 0.4446103752969262), ('Smndc1', 0.5501206802490346), ('Rnf146', 1.038675980787723), ('Mff', 0.8398798839169821), ('Pde7a', 0.9122832235342808), ('Atg5', 1.0), ('Atg10', 0.6065115546339283), ('Hlx', 0.7179666115646837), ('Mlx', 0.4719188766630129), ('Ppp2r2a', 0.4350102779899021), ('Atp1a1', 0.7611670711498808), ('Mcmbp', 1.5656437019255856), 
('Paf1', 0.5914337023692341), ('Mafk', 1.8757251159707784), ('Ywhah', 0.4148168160950648), ('Ykt6', 0.8740363421300391), ('Gzf1', 1.6749018097542459), ('Itpr1', 0.6244407603393514), ('Sec24c', 0.8125260569274086), ('Atp1b1', 1.3433579468658023), ('Cracr2a', 1.9825295293378795), ('Rabl6', 1.6060242452401532), ('Glo1', 4.075255658782804), ('Kat7', 2.1993521341931785), ('Mxd4', 1.546869996844828), ('Rab1a', 4.052034183647333), ('Taok3', 1.4156879591756044), ('Lonp2', 3.866232617909616), ('Bmp2k', 0.5805201605958586), ('Kcnn4', 0.7230752540573253), ('Nrip1', 0.4565406766743578), ('Hexb', 0.8850971245380614), ('Slc31a1', 5.410182658990805), ('Oat', 2.4192511357615585)]}}
>>> motifs[10].head(5).to_dict()
{('TF', ''): {0: 'Atf3', 1: 'Atf3', 2: 'Atf3', 3: 'Atf3', 4: 'Atf3'}, ('MotifID', ''): {0: 'dbcorrdb__JUN__ENCSR000EGH_1__m1', 1: 'dbcorrdb__JUND__ENCSR000EGN_1__m1', 2: 'cisbp__M5050', 3: 'dbcorrdb__eGFP-JUNB__ENCSR000DJY_1__m1', 4: 'dbcorrdb__FOSL1__ENCSR000BMV_1__m1'}, ('Enrichment', 'AUC'): {0: 0.06847185815248727, 1: 0.07298037887028418, 2: 0.05903279302412667, 3: 0.07423158995940253, 4: 0.07630307245136325}, ('Enrichment', 'Annotation'): {0: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 0.000773)", 1: "gene is orthologous to ENSG00000162772 in H. sapiens (identity = 95%) which is annotated for similar motif homer__DATGASTCATHN_Atf3 ('Atf3(bZIP)/GBM-ATF3-ChIP-Seq(GSE33912)/Homer'; q-value = 4.47e-05)", 2: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 0.000608)", 3: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 6.26e-06)", 4: "gene is annotated for similar motif hocomoco__ATF3_MOUSE.H11MO.0.A ('ATF3_MOUSE'; q-value = 3.66e-06)"}, ('Enrichment', 'Context'): {0: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 1: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 2: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 3: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 4: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'})}, ('Enrichment', 'MotifSimilarityQvalue'): {0: 0.000773, 1: 4.5e-05, 2: 0.000608, 3: 6e-06, 4: 4e-06}, ('Enrichment', 'NES'): {0: 4.024298594227186, 1: 4.467018476489827, 2: 3.0974176805382267, 3: 4.589882728587765, 4: 4.793294566384112}, ('Enrichment', 'OrthologousIdentity'): {0: 1.0, 1: 0.950276, 2: 1.0, 3: 1.0, 4: 1.0}, ('Enrichment', 'RankAtMax'): {0: 481, 1: 1112, 2: 829, 3: 634, 4: 762}, ('Enrichment', 'TargetGenes'): {0: 
[('Tagln2', 1.6868254779790988), ('Junb', 2.131165507779861), ('Pim1', 0.5626962771519949), ('Mir155hg', 4.215511908233003), ('Kdm6b', 1.3831692473783712), ('Vcpip1', 0.4655884981482655), ('Ptp4a2', 0.4608012609224432), ('Lgals3', 1.2986893071734795), ('Dusp1', 5.525777129178691), ('Akt3', 2.302534028919806), ('Isg20', 1.565796075237834), ('Sec11c', 2.5799875669226298), ('Gpx1', 0.7797457421907137), ('Pmepa1', 1.0), ('Diaph2', 0.4567503652363437), ('Gadd45b', 0.4041840201626749), ('Traf1', 2.0641638640138207), ('Tnfaip8', 0.4166028876535105), ('Fam110a', 0.5565365664603831), ('Smim3', 4.4918400769026645)], 1: [('Kdm6b', 1.6868254779790988), ('Junb', 2.131165507779861), ('Tagln2', 0.5626962771519949), ('Dusp1', 4.215511908233003), ('Mir155hg', 1.3831692473783712), ('Sec11c', 0.4655884981482655), ('Ccnd2', 0.4608012609224432), ('Lgals3', 1.2986893071734795), ('Bach1', 5.525777129178691), ('Vcpip1', 2.302534028919806), ('Pim1', 1.565796075237834), ('Cdkn1a', 2.5799875669226298), ('Gadd45b', 0.7797457421907137), ('Akt3', 1.0), ('Diaph2', 0.4567503652363437), ('Zfp710', 0.4041840201626749), ('Ncoa3', 2.0641638640138207), ('Ptp4a2', 0.4166028876535105), ('Atf3', 0.5565365664603831), ('Traf1', 4.4918400769026645), ('Pkib', 0.6208941839583779), ('Isg20', 7.928134177072506), ('Abr', 21.31142622147593), ('Tnfaip8', 6.271477001021822), ('Ccr9', 1.7224099621172309), ('Klf6', 2.934167135195324), ('Cdc42ep4', 0.5109519744748661), ('Ncf2', 18.859900155945674), ('Psap', 0.7982368206818751), ('Txndc5', 24.13078778816305), ('Rps6ka1', 9.17079179660625), ('Sipa1l1', 2.302124705475682), ('Smim3', 6.291659684538216), ('Tgif1', 3.5504062994628045)], 2: [('Junb', 1.6868254779790988), ('Oser1', 2.131165507779861), ('Tagln2', 0.5626962771519949), ('Lgals3', 4.215511908233003), ('Bach1', 1.3831692473783712), ('Csrnp1', 0.4655884981482655), ('Kdm6b', 0.4608012609224432), ('Vcpip1', 1.2986893071734795), ('Gpx1', 5.525777129178691), ('Akt3', 2.302534028919806), ('Pim1', 1.565796075237834), 
('Cdkn1a', 2.5799875669226298), ('Prnp', 0.7797457421907137), ('Klf6', 1.0), ('Ptp4a2', 0.4567503652363437), ('Rab8b', 0.4041840201626749), ('Pfn1', 2.0641638640138207), ('Mir155hg', 0.4166028876535105), ('Pmepa1', 0.5565365664603831), ('Dusp1', 4.4918400769026645), ('Abr', 0.6208941839583779), ('Fyb', 7.928134177072506), ('Tgif1', 21.31142622147593), ('Isg20', 6.271477001021822)], 3: [('Kdm6b', 1.6868254779790988), ('Junb', 2.131165507779861), ('Ptp4a2', 0.5626962771519949), ('Sec11c', 4.215511908233003), ('Lgals3', 1.3831692473783712), ('Pim1', 0.4655884981482655), ('Tagln2', 0.4608012609224432), ('Diaph2', 1.2986893071734795), ('Vcpip1', 5.525777129178691), ('Akt3', 2.302534028919806), ('Cdkn1a', 1.565796075237834), ('Mir155hg', 2.5799875669226298), ('Isg20', 0.7797457421907137), ('Gpx1', 1.0), ('Bach1', 0.4567503652363437), ('Txndc5', 0.4041840201626749), ('Ncf2', 2.0641638640138207), ('Dusp1', 0.4166028876535105), ('Pmepa1', 0.5565365664603831), ('Oser1', 4.4918400769026645), ('Fam110a', 0.6208941839583779), ('Rps6ka1', 7.928134177072506), ('Klf6', 21.31142622147593), ('Zfp710', 6.271477001021822), ('Bhlhe40', 1.7224099621172309), ('Tgif1', 2.934167135195324)], 4: [('Junb', 1.6868254779790988), ('Ptp4a2', 2.131165507779861), ('Pim1', 0.5626962771519949), ('Kdm6b', 4.215511908233003), ('Sec11c', 1.3831692473783712), ('Vcpip1', 0.4655884981482655), ('Diaph2', 0.4608012609224432), ('Mir155hg', 1.2986893071734795), ('Lgals3', 5.525777129178691), ('Bach1', 2.302534028919806), ('Akt3', 1.565796075237834), ('Tagln2', 2.5799875669226298), ('Isg20', 0.7797457421907137), ('Cdkn1a', 1.0), ('Bhlhe40', 0.4567503652363437), ('Gadd45b', 0.4041840201626749), ('Pmepa1', 2.0641638640138207), ('Gpx1', 0.4166028876535105), ('Txndc5', 0.5565365664603831), ('Ncf2', 4.4918400769026645), ('Csrnp1', 0.6208941839583779), ('Sipa1l1', 7.928134177072506), ('Klf6', 21.31142622147593), ('Zfp710', 6.271477001021822), ('Fam110a', 1.7224099621172309), ('Atf3', 2.934167135195324), ('Smim3', 
0.5109519744748661), ('Ncoa3', 18.859900155945674)]}}
>>> motifs[11].head(5).to_dict()
{('TF', ''): {0: 'Arid3a', 1: 'Arid3a', 2: 'Arid3a', 3: 'Arnt', 4: 'Arnt'}, ('MotifID', ''): {0: 'cisbp__M1879', 1: 'swissregulon__hs__FOXA2.p3', 2: 'homer__AAAGTAAACA_FOXA1_GSE26831', 3: 'cisbp__M5633', 4: 'cisbp__M5866'}, ('Enrichment', 'AUC'): {0: 0.0668223211428239, 1: 0.06646591603386576, 2: 0.06737511274039161, 3: 0.06968363311646894, 4: 0.06836969001148106}, ('Enrichment', 'Annotation'): {0: "gene is orthologous to ENSG00000116017 in H. sapiens (identity = 79%) which is annotated for similar motif dbcorrdb__ARID3A__ENCSR000EDP_1__m1 ('ARID3A (ENCSR000EDP-1, motif 1)'; q-value = 0.00023)", 1: "gene is orthologous to ENSG00000116017 in H. sapiens (identity = 79%) which is annotated for similar motif dbcorrdb__ARID3A__ENCSR000EDP_1__m1 ('ARID3A (ENCSR000EDP-1, motif 1)'; q-value = 0.00023)", 2: "motif similar to dbcorrdb__ARID3A__ENCSR000EDP_1__m1 ('ARID3A (ENCSR000EDP-1, motif 1)'; q-value = 8.23e-06) which is annotated for orthologous gene ENSG00000116017 in H. sapiens (identity = 80%)", 3: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000417)", 4: "gene is annotated for similar motif transfac_public__M00539 ('V$ARNT_02: Arnt'; q-value = 0.000133)"}, ('Enrichment', 'Context'): {0: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 1: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 2: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 3: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'}), 4: frozenset({'activating', 'weight>75.0%', 'mm10__refseq-r80__10kb_up_and_down_tss'})}, ('Enrichment', 'MotifSimilarityQvalue'): {0: 0.00023, 1: 0.00023, 2: 8e-06, 3: 0.000417, 4: 0.000133}, ('Enrichment', 'NES'): {0: 3.048011522275345, 1: 3.020345923942252, 2: 3.090921429894018, 3: 3.726068046226016, 4: 3.6125945303641087}, ('Enrichment', 'OrthologousIdentity'): {0: 0.798669, 1: 0.798669, 2: 
0.8094439999999999, 3: 1.0, 4: 1.0}, ('Enrichment', 'RankAtMax'): {0: 414, 1: 352, 2: 398, 3: 945, 4: 499}, ('Enrichment', 'TargetGenes'): {0: [('Arid3a', 1.0814455429211889), ('Pogz', 0.6244276987659271), ('Ago4', 0.9664526956346918), ('Taf1b', 0.44722261016464504), ('Itpr1', 0.8313950646937135), ('Hmgb1', 1.9945139689034008), ('Sh3kbp1', 1.0), ('Cd180', 0.6042623259696077)], 1: [('Arid3a', 1.0814455429211889), ('Pogz', 0.6244276987659271), ('Ago4', 0.9664526956346918), ('Taf1b', 0.44722261016464504), ('Itpr1', 0.8313950646937135), ('Hmgb1', 1.9945139689034008), ('Sh3kbp1', 1.0), ('Cd180', 0.6042623259696077)], 2: [('Pogz', 1.0814455429211889), ('Itpr1', 0.6244276987659271), ('Arid3a', 0.9664526956346918), ('Sh3kbp1', 0.44722261016464504), ('Hmgb1', 0.8313950646937135), ('Clpx', 1.9945139689034008), ('Med27', 1.0), ('Fgfr1op2', 0.6042623259696077)], 3: [('Clcn6', 0.8095717882553205), ('Kansl1', 1.5834902996396047), ('Lin52', 1.7464790457683428), ('Ptprs', 1.18907063271503), ('Asna1', 0.4109458104482189), ('Ccdc91', 1.064875281051844), ('Erp29', 1.2944573975829907), ('Zfas1', 1.0), ('Rnf146', 0.5426386495200634), ('Smndc1', 1.3368937988306546), ('Pabpc1', 1.4072285212815487), ('Cracr2a', 2.3364193374078432), ('Mafk', 0.6752603264576597), ('Mcmbp', 0.3974384266129632)], 4: [('Clcn6', 0.8095717882553205), ('Lin52', 1.5834902996396047), ('Ptprs', 1.7464790457683428), ('Kansl1', 1.18907063271503), ('Asna1', 0.4109458104482189), ('Smndc1', 1.064875281051844), ('Rnf146', 1.2944573975829907), ('Dnajc13', 1.0), ('Ccdc91', 0.5426386495200634), ('Erp29', 1.3368937988306546)]}}
>>>

Thanks,

python-3.x

pandas

1 Answer

Stack Overflow user

Accepted answer

Posted 2022-04-03 17:16:16

IIUC, you can use a list comprehension:

for df in motifs:
    # After reset_index, position 9 (0-based) is the ('Enrichment', 'TargetGenes') column
    df['first_elements'] = df.iloc[:, 9].apply(lambda li: [x[0] for x in li])

Output:

[       TF                                   MotifID Enrichment  \
                                                           AUC   
0  Arid3a                         tfdimers__MD00454   0.064714   
1    Arnt  taipale_cyt_meth__SREBF1_NTCACGTGAN_eDBD   0.060952   
2    Arnt                              cisbp__M4597   0.070117   
3    Arnt            hocomoco__ATF3_HUMAN.H11MO.0.A   0.067057   
4    Arnt                              cisbp__M4552   0.062478   

                                                      \
                                          Annotation   
0  motif is annotated for orthologous gene ENSG00...   
1  motif similar to transfac_public__M00539 ('V$A...   
2  gene is annotated for similar motif transfac_p...   
3  gene is annotated for similar motif transfac_p...   
4  gene is annotated for similar motif transfac_p...   

                                                                            \
                                             Context MotifSimilarityQvalue   
0  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000000   
1  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000031   
2  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000799   
3  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000575   
4  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000358   

                                           \
        NES OrthologousIdentity RankAtMax   
0  3.326403            0.809444      1185   
1  3.120903            1.000000       298   
2  3.922071            1.000000       901   
3  3.654649            1.000000       865   
4  3.254340            1.000000      4637   

                                                      \
                                         TargetGenes   
0  [(Hmgb1, 0.745314226221018), (Zfp771, 0.676482...   
1  [(Clcn6, 0.5838135470801639), (Ptprs, 2.580731...   
2  [(Ptprs, 0.5838135470801639), (Clcn6, 2.580731...   
3  [(Ptprs, 0.5838135470801639), (Clcn6, 2.580731...   
4  [(Clcn6, 0.5838135470801639), (Ptprs, 2.580731...   

                                      first_elements  
                                                      
0  [Hmgb1, Zfp771, Irgc1, Bcl11a, Sh3kbp1, Traf3,...  
1  [Clcn6, Ptprs, Erp29, Lin52, Smndc1, Scarb1, R...  
2  [Ptprs, Clcn6, Pde7a, Smndc1, Ppp2r2a, Gzf1, P...  
3  [Ptprs, Clcn6, Pde7a, Smndc1, Gzf1, Atg10, Erp...  
4  [Clcn6, Ptprs, Lin52, Erp29, Smndc1, Rnf146, M...  ,      TF                                 MotifID Enrichment  \
                                                       AUC   
0  Atf3        dbcorrdb__JUN__ENCSR000EGH_1__m1   0.068472   
1  Atf3       dbcorrdb__JUND__ENCSR000EGN_1__m1   0.072980   
2  Atf3                            cisbp__M5050   0.059033   
3  Atf3  dbcorrdb__eGFP-JUNB__ENCSR000DJY_1__m1   0.074232   
4  Atf3      dbcorrdb__FOSL1__ENCSR000BMV_1__m1   0.076303   

                                                      \
                                          Annotation   
0  gene is annotated for similar motif hocomoco__...   
1  gene is orthologous to ENSG00000162772 in H. s...   
2  gene is annotated for similar motif hocomoco__...   
3  gene is annotated for similar motif hocomoco__...   
4  gene is annotated for similar motif hocomoco__...   

                                                                            \
                                             Context MotifSimilarityQvalue   
0  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000773   
1  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000045   
2  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000608   
3  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000006   
4  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000004   

                                           \
        NES OrthologousIdentity RankAtMax   
0  4.024299            1.000000       481   
1  4.467018            0.950276      1112   
2  3.097418            1.000000       829   
3  4.589883            1.000000       634   
4  4.793295            1.000000       762   

                                                      \
                                         TargetGenes   
0  [(Tagln2, 1.6868254779790988), (Junb, 2.131165...   
1  [(Kdm6b, 1.6868254779790988), (Junb, 2.1311655...   
2  [(Junb, 1.6868254779790988), (Oser1, 2.1311655...   
3  [(Kdm6b, 1.6868254779790988), (Junb, 2.1311655...   
4  [(Junb, 1.6868254779790988), (Ptp4a2, 2.131165...   

                                      first_elements  
                                                      
0  [Tagln2, Junb, Pim1, Mir155hg, Kdm6b, Vcpip1, ...  
1  [Kdm6b, Junb, Tagln2, Dusp1, Mir155hg, Sec11c,...  
2  [Junb, Oser1, Tagln2, Lgals3, Bach1, Csrnp1, K...  
3  [Kdm6b, Junb, Ptp4a2, Sec11c, Lgals3, Pim1, Ta...  
4  [Junb, Ptp4a2, Pim1, Kdm6b, Sec11c, Vcpip1, Di...  ,        TF                           MotifID Enrichment  \
                                                   AUC   
0  Arid3a                      cisbp__M1879   0.066822   
1  Arid3a        swissregulon__hs__FOXA2.p3   0.066466   
2  Arid3a  homer__AAAGTAAACA_FOXA1_GSE26831   0.067375   
3    Arnt                      cisbp__M5633   0.069684   
4    Arnt                      cisbp__M5866   0.068370   

                                                      \
                                          Annotation   
0  gene is orthologous to ENSG00000116017 in H. s...   
1  gene is orthologous to ENSG00000116017 in H. s...   
2  motif similar to dbcorrdb__ARID3A__ENCSR000EDP...   
3  gene is annotated for similar motif transfac_p...   
4  gene is annotated for similar motif transfac_p...   

                                                                            \
                                             Context MotifSimilarityQvalue   
0  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000230   
1  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000230   
2  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000008   
3  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000417   
4  (activating, mm10__refseq-r80__10kb_up_and_dow...              0.000133   

                                           \
        NES OrthologousIdentity RankAtMax   
0  3.048012            0.798669       414   
1  3.020346            0.798669       352   
2  3.090921            0.809444       398   
3  3.726068            1.000000       945   
4  3.612595            1.000000       499   

                                                      \
                                         TargetGenes   
0  [(Arid3a, 1.0814455429211889), (Pogz, 0.624427...   
1  [(Arid3a, 1.0814455429211889), (Pogz, 0.624427...   
2  [(Pogz, 1.0814455429211889), (Itpr1, 0.6244276...   
3  [(Clcn6, 0.8095717882553205), (Kansl1, 1.58349...   
4  [(Clcn6, 0.8095717882553205), (Lin52, 1.583490...   

                                      first_elements  
                                                      
0  [Arid3a, Pogz, Ago4, Taf1b, Itpr1, Hmgb1, Sh3k...  
1  [Arid3a, Pogz, Ago4, Taf1b, Itpr1, Hmgb1, Sh3k...  
2  [Pogz, Itpr1, Arid3a, Sh3kbp1, Hmgb1, Clpx, Me...  
3  [Clcn6, Kansl1, Lin52, Ptprs, Asna1, Ccdc91, E...  
4  [Clcn6, Lin52, Ptprs, Kansl1, Asna1, Smndc1, R...  ]
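
The question also asks to drop rows whose extracted list has fewer than 5 genes, a step the loop above does not cover. Below is a minimal sketch of one way to add it, using a small made-up frame in place of `motifs`. The helper name `extract_genes`, the `min_genes` parameter and the sample values are assumptions for illustration only. It selects the column by label rather than by position, so it does not depend on whether the index has been reset.

```python
import pandas as pd

TARGET = ("Enrichment", "TargetGenes")

def extract_genes(df, min_genes=5):
    """Return a copy of df whose TARGET column holds gene-name lists,
    keeping only rows with at least min_genes names (hypothetical helper)."""
    out = df.copy()
    # Keep element 0 (the gene name) of every (gene, score) tuple in each row.
    out[TARGET] = out[TARGET].apply(lambda pairs: [pair[0] for pair in pairs])
    # Boolean mask: True where the extracted list is long enough.
    keep = out[TARGET].map(len) >= min_genes
    return out[keep].reset_index(drop=True)

# Made-up frame with the same two-level column layout as in the question.
demo = pd.DataFrame({
    ("TF", ""): ["Tf1", "Tf2"],
    TARGET: [
        [("G1", 0.1), ("G2", 0.2), ("G3", 0.3), ("G4", 0.4), ("G5", 0.5)],
        [("G1", 0.1), ("G2", 0.2)],
    ],
})

result = extract_genes(demo)
print(result[TARGET].tolist())
```

Applied to the real data this would read `motifs = [extract_genes(df) for df in motifs]`. Returning a new frame instead of modifying each one in place keeps the original lists of tuples available in case the scores are needed later.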

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/71709787
