Differential expression analysis with an interaction term using DESeq2

Stack Overflow user
Asked 2021-10-08 12:38:49
1 answer · 607 views · 0 followers · 0 votes

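I am a beginner with DESeq2. I am currently trying out different design formulas to analyze the data from the Bioconductor package airway.

I followed the steps in the DESeq2 vignette: RNA-seq workflow to compute the statistical results. However, when I specify an interaction term in the design formula, I get the error message below.
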
Error in checkForExperimentalReplicates(object, modelMatrix) : 

  The design matrix has the same number of samples and coefficients to fit,
  so estimation of dispersion is not possible. Treating samples
  as replicates was deprecated in v1.20 and no longer supported since v1.22.

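My problem is that the error occurs even though I specify the design formula as shown in example("results"). Why does this error occur, and how can I produce results with an interaction effect?

I would be very grateful if someone could help me solve this.

  1. Load the data from the airway package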
> # Loading data
> library("airway")
> library("DESeq2")
> data(gse)
> gse
class: RangedSummarizedExperiment 
dim: 58294 8 
metadata(6): tximetaInfo quantInfo ... txomeInfo txdbInfo
assays(3): counts abundance length
rownames(58294): ENSG00000000003.14 ENSG00000000005.5 ... ENSG00000285993.1
  ENSG00000285994.1
rowData names(1): gene_id
colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
colData names(3): names donor condition
  2. Rename and relevel the variables
> # rename variable 
> (gse$cell <- gse$donor)
[1] N61311  N61311  N052611 N052611 N080611 N080611 N061011 N061011
Levels: N052611 N061011 N080611 N61311

> (gse$dex <- gse$condition)
[1] Untreated     Dexamethasone Untreated     Dexamethasone Untreated     Dexamethasone Untreated    
[8] Dexamethasone
Levels: Untreated Dexamethasone

> levels(gse$dex) = c("untrt", "trt")
> levels(gse$dex)
[1] "untrt" "trt"  
  3. Build a DESeqDataSet with the design formula ~ cell + dex and run the analysis.
> # building DESeqDataSet
> dds <- DESeqDataSet(gse, design = ~ cell + dex)
using counts and average transcript lengths from tximeta

> dds
class: DESeqDataSet 
dim: 58294 8 
metadata(7): tximetaInfo quantInfo ... txdbInfo version
assays(3): counts abundance avgTxLength
rownames(58294): ENSG00000000003.14 ENSG00000000005.5 ... ENSG00000285993.1
  ENSG00000285994.1
rowData names(1): gene_id
colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
colData names(5): names donor condition cell dex

> # Filtering 
> keep = rowSums(counts(dds)) > 1
> dds = dds[keep,]
> dim(dds)
[1] 31604     8

> # Differential analysis
> design(dds)
~cell + dex

> dds = DESeq(dds)
estimating size factors
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

> resultsNames(dds)
[1] "Intercept"               "cell_N061011_vs_N052611" "cell_N080611_vs_N052611"
[4] "cell_N61311_vs_N052611"  "dex_trt_vs_untrt"       

> results(dds, contrast = c("dex", "untrt", "trt"))
log2 fold change (MLE): dex untrt vs trt 
Wald test p-value: dex untrt vs trt 
DataFrame with 31604 rows and 6 columns
                     baseMean log2FoldChange     lfcSE      stat      pvalue       padj
                    <numeric>      <numeric> <numeric> <numeric>   <numeric>  <numeric>
ENSG00000000003.14 739.940717      0.3611537  0.106869  3.379419 0.000726392 0.00531137
ENSG00000000419.12 511.735722     -0.2063147  0.128665 -1.603509 0.108822318 0.29318870
ENSG00000000457.13 314.194855     -0.0378308  0.158633 -0.238479 0.811509461 0.92255697
ENSG00000000460.16  79.793622      0.1152590  0.314991  0.365912 0.714430444 0.87298038
ENSG00000000938.12   0.307267      1.3691185  3.503764  0.390757 0.695977205         NA
...                       ...            ...       ...       ...         ...        ...
ENSG00000285979.1   38.353886     -0.3423657  0.359511 -0.952310    0.340940   0.600750
ENSG00000285987.1    1.562508     -0.7064145  1.547295 -0.456548    0.647996         NA
ENSG00000285990.1    0.642315     -0.3647333  3.433276 -0.106235    0.915396         NA
ENSG00000285991.1   11.276284      0.1165515  0.748601  0.155692    0.876275   0.952921
ENSG00000285994.1    3.651041      0.0960094  1.068660  0.089841    0.928414         NA
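  4. Analyze the data with the interaction term ~ cell + dex + cell:dex.

In this step, I set the design formula to include the interaction term, ~ cell + dex + cell:dex. The error occurs when I then try to run DESeq() on the dataset.

The design formula I used, ~ cell + dex + cell:dex, has the same form as the interaction design formula demonstrated in example("results").

> # Differential analysis using interaction term 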
> dds_int = dds
> design(dds_int) = formula(~ cell + dex + cell:dex)
> dds_int = DESeq(dds_int)
using pre-existing normalization factors
estimating dispersions
found already estimated dispersions, replacing these
Error in checkForExperimentalReplicates(object, modelMatrix) : 

  The design matrix has the same number of samples and coefficients to fit,
  so estimation of dispersion is not possible. Treating samples
  as replicates was deprecated in v1.20 and no longer supported since v1.22.
  5. I tried specifying the design formula when building the DESeqDataSet instead. However, the same error occurs when I run DESeq() on the dataset.
> dds_int = DESeqDataSet(gse, design = ~ cell + dex + cell:dex)
using counts and average transcript lengths from tximeta
> dim(dds_int)
[1] 58294     8
> 
> keep = rowSums(counts(dds_int)) > 1
> dds_int = dds_int[keep,]
> dim(dds_int)
[1] 31604     8
> 
> design(dds_int)
~cell + dex + cell:dex
> 
> dds_int = DESeq(dds_int)
estimating size factors
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
Error in checkForExperimentalReplicates(object, modelMatrix) : 

  The design matrix has the same number of samples and coefficients to fit,
  so estimation of dispersion is not possible. Treating samples
  as replicates was deprecated in v1.20 and no longer supported since v1.22.
  6. I tried creating a model.matrix and running the DESeq analysis with it. However, the same error still occurs.
> # model formula
> dds_int = dds
> attach(as.data.frame(colData(dds_int)))
The following objects are masked from as.data.frame(colData(dds_int)):

    cell, condition, dex, donor, names

> 
> mm = model.matrix( ~ cell + dex + cell:dex)
> design(dds_int) = mm
> design(dds_int)
  (Intercept) cellN061011 cellN080611 cellN61311 dextrt cellN061011:dextrt cellN080611:dextrt
1           1           0           0          1      0                  0                  0
2           1           0           0          1      1                  0                  0
3           1           0           0          0      0                  0                  0
4           1           0           0          0      1                  0                  0
5           1           0           1          0      0                  0                  0
6           1           0           1          0      1                  0                  1
7           1           1           0          0      0                  0                  0
8           1           1           0          0      1                  1                  0
  cellN61311:dextrt
1                 0
2                 1
3                 0
4                 0
5                 0
6                 0
7                 0
8                 0
attr(,"assign")
[1] 0 1 1 1 2 3 3 3
attr(,"contrasts")
attr(,"contrasts")$cell
[1] "contr.treatment"

attr(,"contrasts")$dex
[1] "contr.treatment"

> 
> dds_int = DESeq(dds_int, test="Wald", modelMatrixType = "standard")
using supplied model matrix
using pre-existing normalization factors
estimating dispersions
found already estimated dispersions, replacing these
Error in checkForExperimentalReplicates(object, modelMatrix) : 

  The design matrix has the same number of samples and coefficients to fit,
  so estimation of dispersion is not possible. Treating samples
  as replicates was deprecated in v1.20 and no longer supported since v1.22.

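The two interaction designs I created in the code blocks above have the same form as the one demonstrated in example("results"). I would like to know why the error keeps occurring. How can I produce results with an interaction effect?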
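The error message itself points at the cause: the design matrix has as many coefficients as there are samples. A minimal base-R sketch of that count follows. The `airway_like` data frame only mimics the cell/dex layout printed in the question, and the `replicated` layout (2 genotypes x 2 conditions, three samples each) is a hypothetical example added for comparison; neither is taken from DESeq2 itself.

```r
# Sample layout taken from the question: 4 cell lines x 2 treatments,
# one sample per cell/treatment combination (8 samples in total).
airway_like <- data.frame(
  cell = factor(rep(c("N61311", "N052611", "N080611", "N061011"), each = 2)),
  dex  = factor(rep(c("untrt", "trt"), times = 4), levels = c("untrt", "trt"))
)

# Hypothetical replicated layout (an assumption for illustration only):
# 2 genotypes x 2 conditions, three samples per combination (12 samples).
replicated <- expand.grid(
  rep       = 1:3,
  genotype  = factor(c("I", "II")),
  condition = factor(c("A", "B"))
)

mm_additive    <- model.matrix(~ cell + dex, data = airway_like)
mm_interaction <- model.matrix(~ cell + dex + cell:dex, data = airway_like)
mm_replicated  <- model.matrix(~ genotype + condition + genotype:condition,
                               data = replicated)

# Residual degrees of freedom = number of samples - number of coefficients.
layout_summary <- data.frame(
  design       = c("airway additive", "airway interaction",
                   "replicated interaction"),
  samples      = c(nrow(mm_additive), nrow(mm_interaction),
                   nrow(mm_replicated)),
  coefficients = c(ncol(mm_additive), ncol(mm_interaction),
                   ncol(mm_replicated))
)
layout_summary$residual_df <- layout_summary$samples - layout_summary$coefficients
print(layout_summary)
```

With one sample per cell/dex combination, the interaction model uses up every sample to fit a coefficient, so nothing is left over to estimate dispersion. The same formula shape works in a layout that has replicates within each combination. For the airway layout, this suggests either staying with ~ cell + dex or interacting dex with a variable that still leaves replicates in each group.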

Thank you all for your time.

1 Answer

Stack Overflow user

Answered 2022-02-15 01:59:21

I had a similar problem. I had no replicates, or more precisely, I had a case/class with only one sample.

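You can work around this by pooling some of your samples together. For example, imagine you have 6 time-series samples. You could create a factor that labels the first two as "start", the next two as "middle", and the last two as "end".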
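The pooling idea in this answer can be sketched in base R as follows. The six samples, the column names `sample`, `time` and `stage`, and the even two-per-group split are all assumptions made for illustration, not part of the original answer's data.

```r
# Six hypothetical time-series samples, listed in time order.
sample_info <- data.frame(
  sample = paste0("t", 1:6),
  time   = 1:6
)

# Pool neighbouring time points into three groups of two samples each,
# so that every group has replicates.
sample_info$stage <- factor(
  rep(c("start", "middle", "end"), each = 2),
  levels = c("start", "middle", "end")
)

# Each level should now contain more than one sample.
replicates_per_group <- table(sample_info$stage)
print(replicates_per_group)
```

A factor like `stage` could then be used in the design formula in place of the unreplicated variable. Note that this treats the pooled samples as replicates of one another, which is a modelling assumption: any real differences within a group are absorbed into the dispersion estimate.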

Votes: 0

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/69495974
