Question: Phyloseq, how do I get relative abundance with merge_samples?

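Stack Overflow user

Asked 2019-11-27 05:29:37

1 answer · 4.7K views · 0 followers · 1 vote

I am trying to get relative abundances using merge_samples from the phyloseq package.

When I compute the mean of each phylum across all samples, everything looks right. Using GlobalPatterns as an example (it has 26 samples), I do something like this: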

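library(phyloseq)
library(plyr)  # provides desc(), used for sorting below
data(GlobalPatterns)
# Agglomerate taxa at the phylum level
TGroup <- tax_glom(GlobalPatterns, taxrank = "Phylum")
# Convert counts to percentages within each sample
PGroup <- transform_sample_counts(TGroup, function(x) 100 * x / sum(x))
OTUg <- otu_table(PGroup)
TAXg <- tax_table(PGroup)[, "Phylum"]
# Mean percentage of each phylum across all 26 samples
AverageD <- as.data.frame(rowMeans(OTUg))
names(AverageD) <- c("Mean")
# Join phylum names and means by row name
GTable <- merge(TAXg, AverageD, by = 0, all = TRUE)
GTable$Row.names <- NULL
GTable <- GTable[order(desc(GTable$Mean)), ]
head(GTable)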

What I get is:

          Phylum     Mean
1 Proteobacteria 29.45550
2     Firmicutes 18.87905
3  Bacteroidetes 17.34374
4  Cyanobacteria 13.70639
5 Actinobacteria  8.93446
6 ... (more rows)

That looks fine to me.

But when I try to use merge_samples (grouping by SampleType):

    ps <- tax_glom(GlobalPatterns, "Phylum")
    ps0 <- transform_sample_counts(ps, function(x)100* x / sum(x))
    ps1 <- merge_samples(ps0, "SampleType")
    ps2 <- transform_sample_counts(ps1, function(x)100* x / sum(x))
    ps3 <- ps2
    otu_table(ps3) <- t(otu_table(ps3)) # transpose the matrix otus !!!
    OTUg <- otu_table(ps3)
    TAXg <- tax_table(ps3)[,"Phylum"]
    GTable <- merge(TAXg, OTUg, by=0, all=TRUE)
    GTable$Row.names = NULL
    GTable$Mean=rowMeans(GTable[,-c(1)], na.rm=TRUE)
    GTable <- GTable[order(desc(GTable$Mean)),]
    head(GTable)

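I get the same taxa, but the percentages in the Mean column are different:

  Phylum          Feces  Freshwater  Freshwater (creek)   Mock  Ocean  Sediment (estuary)   Skin   Soil  Tongue   Mean
1 Proteobacteria   1.58       16.71               18.61  20.10  38.00               71.03  31.98  32.66   44.49  30.57
2 Firmicutes      54.82        0.12                0.65  41.42   0.08                2.53  30.67   0.64   21.67  16.96
3 Bacteroidetes   35.23       11.92                5.07  24.97  31.17                7.01   9.09   9.90   12.28  16.29
4 Cyanobacteria    2.63       30.17               62.57   0.16  19.18                3.24   4.65   0.97    6.61  14.46
5 Actinobacteria   3.47       37.11                1.74   8.39   5.12                1.04  16.78   9.99    7.49  10.13

I understand that after merge_samples by SampleType each column is a sample type rather than a sample, so the per-phylum percentages in each column change (Feces, Freshwater, Freshwater (creek), ...). But I expected the overall mean of each phylum to stay the same even after merging samples, and here the means are different (Proteobacteria 30.57, Firmicutes 16.96, Bacteroidetes 16.29).

Is there a solution, or any suggestions?

Thanks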

bioinformatics

phyloseq

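1 Answer

Stack Overflow user

Accepted answer

Answered 2019-11-28 10:37:56

In the first part you are taking the mean over all samples. In the second you are taking the mean of the group means. The two are guaranteed to agree only when every group has the same number of observations.

For example: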

# equal n for each group
abundance = seq(0.1,0.6,by=0.1)
group = rep(letters[1:3],each=2)
# all.equal() compares with a tolerance; == on doubles can fail from rounding error
isTRUE(all.equal(mean(tapply(abundance,group,mean)), mean(abundance)))
[1] TRUE

# unequal n
abundance = seq(0.1,0.6,by=0.1)
group = rep(letters[1:3],1:3)
isTRUE(all.equal(mean(tapply(abundance,group,mean)), mean(abundance)))
[1] FALSE
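
As a follow-up to the toy example above, the sketch below suggests how the two quantities relate: weighting each group mean by its group size recovers the overall mean even when the groups are unequal. It uses base R only; the names `group_means`, `group_n`, `unweighted` and `weighted` are illustrative and not part of the original answer.

```r
# Same unequal-n toy data as in the example above
abundance <- seq(0.1, 0.6, by = 0.1)
group <- rep(letters[1:3], 1:3)

group_means <- tapply(abundance, group, mean)    # mean within each group
group_n     <- tapply(abundance, group, length)  # observations per group

# Unweighted mean of group means: every group counts equally
unweighted <- mean(group_means)

# Mean of group means weighted by group size: every observation counts equally
weighted <- weighted.mean(group_means, w = group_n)

isTRUE(all.equal(weighted, mean(abundance)))    # TRUE
isTRUE(all.equal(unweighted, mean(abundance)))  # FALSE
```

Which of the two is the right summary depends on whether each observation or each group should carry equal weight.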

In your data, the number of samples (n) per SampleType is not the same:

TGroup <- tax_glom(GlobalPatterns, taxrank = "Phylum")
PGroup <- transform_sample_counts(TGroup, function(x)100* x / sum(x))
SampleType = sample_data(PGroup)$SampleType
table(SampleType)

SampleType
             Feces         Freshwater Freshwater (creek)               Mock 
                 4                  2                  3                  3 
             Ocean Sediment (estuary)               Skin               Soil 
                 3                  3                  3                  3 
            Tongue 
                 2

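To get the mean abundance across sample types, compute the mean abundance within each SampleType first, and then take the mean of those group means: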

mean_PGroup = sapply(levels(SampleType),function(i){
  rowMeans(otu_table(PGroup)[,SampleType==i])
})

phy = tax_table(PGroup)[rownames(mean_PGroup ),"Phylum"]
rownames(mean_PGroup) = phy
head(sort(rowMeans(mean_PGroup),decreasing=TRUE))

 Proteobacteria      Firmicutes   Bacteroidetes   Cyanobacteria  Actinobacteria 
      30.572773       16.956254       16.293286       14.463643       10.126875 
Verrucomicrobia 
       2.774216
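
To connect this back to the numbers in the question, here is a rough check that uses only the rounded per-SampleType percentages for Proteobacteria from the question's table and the sample counts from `table(SampleType)`. The vectors `proteo` and `n` are typed in by hand for illustration, so the results are only as accurate as the rounded inputs.

```r
# Proteobacteria mean percentage per SampleType
# (rounded values copied from the table in the question)
proteo <- c("Feces" = 1.58, "Freshwater" = 16.71, "Freshwater (creek)" = 18.61,
            "Mock" = 20.10, "Ocean" = 38.00, "Sediment (estuary)" = 71.03,
            "Skin" = 31.98, "Soil" = 32.66, "Tongue" = 44.49)

# Number of samples per SampleType, in the same order, from table(SampleType)
n <- c(4, 2, 3, 3, 3, 3, 3, 3, 2)

mean(proteo)                  # unweighted mean of group means, about 30.57
weighted.mean(proteo, w = n)  # weighted by sample count, about 29.46
```

The unweighted value matches the merge_samples result, and the weighted value matches the all-sample mean from the first calculation. So the first approach weights every sample equally, while the merge_samples route weights every SampleType equally.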

Votes: 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/59063589
