Error running R with matching data frame columns

Stack Overflow user
Asked on 2015-11-28 00:11:48
Answers: 1, Views: 170, Follows: 0, Votes: 0

I have a data frame (gwas.data) that looks like this:

              SNP CHR        BP A1 A2 zscore      P CEUmaf    MAF
1       rs1000000  12 125456933  A  G  1.441 0.1496 0.3729 0.2401
563090 rs10000010   4  21227772  T  C  0.068 0.9455  0.575 0.4934
563091 rs10000023   4  95952929  T  G  1.217 0.2236 0.5917 0.3852
563092  rs1000003   3  99825597  A  G -0.306 0.7597  0.875 0.1794
563093 rs10000033   4 139819348  T  C  1.050 0.2935 0.4917 0.4789
2      rs10000037   4  38600725  A  G  0.072 0.9428 0.2833 0.2296

I also have another data frame (correct.orientation) that looks like this:

        CHR        SNP A1 A2    MAF NCHROBS
6952148  12  rs1000000  A  G 0.2401     758
2272221   4 rs10000010  C  T 0.4934     758
2524810   4 rs10000023  G  T 0.3852     758
1838654   3  rs1000003  G  A 0.1794     758
2675630   4 rs10000033  C  T 0.4789     758
2338861   4 rs10000037  A  G 0.2296     758

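I am trying to write a program that replaces gwas.data$MAF with (1 - MAF) when A1 and A2 are switched between the two data frames. I tried to use the following code, which I borrowed from someone else: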

    flip <- gwas.data$A1 == correct.orientation$A2 & gwas.data$A2 == correct.orientation$A1
    dont.flip <- gwas.data$A1 == correct.orientation$A1 & gwas.data$A2 == correct.orientation$A2

    for ( i in 1 : nrow ( gwas.data ) ) {
        if ( flip [ i ] ) {
            gwas.data$A1 [ i ] <- correct.orientation$A1 [ i ]
            gwas.data$A2 [ i ] <- correct.orientation$A2 [ i ]
            gwas.data$zscore [ i ] <- - gwas.data$EFF [ i ]
            gwas.data$MAF [ i ] <- 1 - gwas.data$FRQ [ i ]
        } else if ( dont.flip [ i ] ) {
            #do nothing
        } else {
            stop ( "Strand Issue")      
        }
    }

I get an error on the first line, flip <- gwas.data$A1 == correct.orientation$A2 & gwas.data$A2 == correct.orientation$A1. The error is: Error in Ops.factor(gwas.data$A1, correct.orientation$A2) : level sets of factors are different. How can I fix this?
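
For reference, this error can be reproduced in isolation: R refuses to compare two factors with `==` when their level sets differ. The sketch below uses made-up toy vectors (`a1` and `a2` are hypothetical, not columns from the data above) and shows that comparing the values as character avoids the problem.

```r
# Two factors with different level sets (hypothetical toy values)
a1 <- factor(c("A", "T", "T"))   # levels: A, T
a2 <- factor(c("G", "C", "T"))   # levels: C, G, T

# Direct comparison raises an error; capture its message instead of stopping
res <- tryCatch(a1 == a2, error = function(e) conditionMessage(e))
print(res)

# Comparing as character works element by element
same <- as.character(a1) == as.character(a2)
print(same)
```

Converting with as.character() only for the comparison leaves the original factor columns unchanged.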

1 Answer

Stack Overflow user

Accepted answer

Posted on 2015-11-28 03:05:57

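Consider dropping the for loop and using base R's merge() function on the two data frames. A small amount of data management is needed first: 1) temporarily convert the factors to character (or read the data with stringsAsFactors = FALSE in read.csv() or read.table()); 2) add suffixes to the column names that occur in both data frames. Once the new MAF has been calculated with ifelse(), split the merged data frame and reset the column names and data types to the original structure: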

# CONVERT FACTORS TO CHARACTER
gwas.data[, c("A1","A2")] <- sapply(gwas.data[,c("A1","A2")],as.character)
# SUFFIXING COL NAMES TO IDENTIFY IN MERGED DF
names(gwas.data) <- paste0(names(gwas.data), "_A")

# CONVERT FACTORS TO CHARACTER
correct.orientation[, c("A1","A2")] <- sapply(correct.orientation[,c("A1","A2")],as.character)
# SUFFIXING COL NAMES TO IDENTIFY IN MERGED DF
names(correct.orientation) <- paste0(names(correct.orientation ), "_B")

# MERGE DATA FRAMES (ASSUMING SNP IS UNIQUE IDENTIFIER)
comparedf <- merge(gwas.data, correct.orientation, by.x="SNP_A", by.y="SNP_B", all=TRUE)

# ROWS WHERE A1/A2 ARE SWAPPED BETWEEN THE TWO DATA FRAMES
# (%in% TRUE TREATS NA FROM UNMATCHED SNPS AS "DO NOT FLIP")
flip <- ((comparedf$A1_A == comparedf$A2_B) &
         (comparedf$A2_A == comparedf$A1_B)) %in% TRUE

# CALCULATE NEW MAF AND ZSCORE
comparedf$MAF_A <- ifelse(flip, 1 - comparedf$MAF_A, comparedf$MAF_A)
comparedf$zscore_A <- ifelse(flip, -1 * comparedf$zscore_A, comparedf$zscore_A)

# SPLIT MERGE BACK TO ORIGINAL STRUCTURE
newgwas.data <- comparedf[,names(gwas.data)]
# REMOVE SUFFIX (ANCHORED SO ONLY A TRAILING "_A" IS STRIPPED)
names(newgwas.data) <- sub("_A$", "", names(newgwas.data))
# RESET FACTORS
newgwas.data$A1 <- as.factor(newgwas.data$A1)
newgwas.data$A2 <- as.factor(newgwas.data$A2)
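
As a point of comparison, the same flip can be sketched without a merge by aligning the reference rows with match() and updating the flipped rows with logical indexing. This is not the method above, only a minimal alternative under stated assumptions: the toy data frames below are cut-down copies of the ones in the question, the allele columns are already character, every SNP in gwas.data is assumed to occur exactly once in correct.orientation, and the two error messages are illustrative.

```r
# Toy versions of the two data frames from the question (subset of columns)
gwas.data <- data.frame(
  SNP    = c("rs1000000", "rs10000010", "rs10000023"),
  A1     = c("A", "T", "T"),
  A2     = c("G", "C", "G"),
  zscore = c(1.441, 0.068, 1.217),
  MAF    = c(0.2401, 0.4934, 0.3852),
  stringsAsFactors = FALSE
)
correct.orientation <- data.frame(
  SNP = c("rs10000023", "rs1000000", "rs10000010"),  # deliberately shuffled
  A1  = c("G", "A", "C"),
  A2  = c("T", "G", "T"),
  stringsAsFactors = FALSE
)

# Align the reference rows to the row order of gwas.data by SNP id
idx <- match(gwas.data$SNP, correct.orientation$SNP)
if (anyNA(idx)) stop("SNP missing from correct.orientation")
ref <- correct.orientation[idx, ]

# TRUE where the alleles are swapped / identical relative to the reference
flip <- gwas.data$A1 == ref$A2 & gwas.data$A2 == ref$A1
same <- gwas.data$A1 == ref$A1 & gwas.data$A2 == ref$A2

# Assumption: a row that is neither swapped nor identical is a strand problem
if (any(!(flip | same))) stop("Strand Issue")

# Update only the flipped rows: swap alleles, negate zscore, mirror MAF
gwas.data$A1[flip]     <- ref$A1[flip]
gwas.data$A2[flip]     <- ref$A2[flip]
gwas.data$zscore[flip] <- -gwas.data$zscore[flip]
gwas.data$MAF[flip]    <- 1 - gwas.data$MAF[flip]

print(gwas.data)
```

Unlike the merge-based code, this sketch also rewrites A1 and A2 for the flipped rows, as the loop in the question does, so the alleles stay consistent with the updated MAF and zscore.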

Votes: 1

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/33966298
