Using a vector as a variable in a formula

Stack Overflow user

Asked 2017-07-01 16:07:51

Answers: 1 · Views: 47 · Followers: 0 · Votes: 0

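I have a function that works across multiple columns, but I would like to adapt it so that it uses a different value of the key variable, mode, for each column. I have put a simplified example below.

My data is a cross table of frequencies, i.e. in column A01 there are 6485 counts of 13 CAGs, 35 counts of 14 CAGs, and so on. The modal value for column 1 is therefore 13.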
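To make the layout concrete, here is a minimal sketch of how such a frequency table relates to individual observations and to a mode. The names `freq`, `obs` and `modal_cag` are illustrative only and do not appear in the question; the counts are the A01 values quoted above.

```r
# Frequency table: one row per CAG length, one count column per sample
freq <- data.frame(CAG = c(13, 14, 15, 17),
                   A01 = c(6485, 35, 132, 12))

# The mode is the CAG value with the largest count
modal_cag <- freq$CAG[which.max(freq$A01)]

# rep() expands the counts into one element per observation
obs <- rep(freq$CAG, freq$A01)

modal_cag    # 13
length(obs)  # 6664, the total count in A01
```

Expanding with `rep()` is what lets ordinary functions such as `mean()` and `sd()` be applied to count data.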

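I need to calculate:

1) the skewness, using (mean - mode)/sd

2) the proportion of each column with CAG > mode

The code below does this.

However, I now need to compare each sample against the mode of a control sample, and I am a bit stuck on the code. The control that each sample should be compared against is defined in the controls table (ctrls in the code).

Could I ask for help adapting my code so that skewmode and prop are calculated using the appropriate control mode for each column?

I hope this makes sense!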

#Data set
data <- data.frame(CAG = c(13, 14, 15, 17), 
                   A01 = c(6485,35,132, 12), 
                   A02 = c(0,42,56, 4))

#Mode
mode <- data[sapply(data[2:ncol(data)], which.max), ]$CAG 

#Summary statistics
sumstats <- sapply(data[, 2:ncol(data)], function(x) {
  data_e <- rep(data$CAG, x)
  library(psych)
  data.frame(
    describe(data_e)
  )
})

sumstats <- as.data.frame(t(sumstats))

sumstats[] <- lapply(sumstats, function(x) {
  as.numeric(x)
})

# Results table
results <- data.frame(mode, sumstats)

# Skewness - I'd like to replace 'results$mode' here 
# with the relevant mode from the controls table
skewmode <- (results$mean - results$mode) / results$sd

# Proportion > mode - I'd like to replace 'mod' here 
# with the relevant mode from the controls table
prop <- lapply(data[, 2:ncol(data)], function(x) {
  mod <- data$CAG[which.max(x)]
  B <- sum(x[data$CAG >= mod])
  A <- sum(x[data$CAG <= mod])
  B/(A+B)
})

prop <- as.data.frame(prop)
prop <- t(prop)

results <- data.frame(mode, sumstats, skewmode, prop)

# Controls
ctrls <- data.frame(samples = c('A01', 'A02', 'A03', 'A04'), 
                    ctrl = c('A01','A01', 'A03', 'A03'))

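1 Answer

Stack Overflow user

Answered 2017-07-02 23:40:17

Consider Map (a wrapper around mapply), which iterates over the samples and their controls in parallel and passes each pair into a defined function, prop_skew_calc(), to calculate skewmode and prop. The result is a list of data frames that is row-bound at the end.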
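As a quick illustration of the pattern, independent of the CAG data, the sketch below shows Map walking two vectors in parallel and do.call(rbind, ...) stacking the pieces. The function `pair_summary` and its inputs are made up for this example.

```r
# Toy function of two scalars, returning a one-row data frame
pair_summary <- function(a, b) {
  data.frame(a = a, b = b, total = a + b)
}

# Map() calls pair_summary(1, 10), pair_summary(2, 20), pair_summary(3, 30)
pieces <- Map(pair_summary, c(1, 2, 3), c(10, 20, 30))

# Row-bind the list of one-row data frames into a single data frame
combined <- do.call(rbind, pieces)

combined$total  # 11 22 33
```

Because each call receives one element from each vector, the function body can be written for a single sample and a single control.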

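Note: the code below uses base R's summary() because I do not have the psych package installed. I have left a comment in the code showing where psych::describe() could be used instead; according to its docs, it returns a data frame of summary statistics useful for psychometrics.

Data (adding A03 and A04)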

#Data set
data <- data.frame(CAG = c(13, 14, 15, 17), 
                   A01 = c(6485,35,132, 12), 
                   A02 = c(0,42,56, 4),
                   A03 = c(33,5014,2221, 18),
                   A04 = c(106,89,436, 11))

#Controls
ctrls <- data.frame(samples = c('A01', 'A02', 'A03', 'A04'), 
                    ctrl = c('A01','A01', 'A03', 'A03'))

Function (all l/sapply loops are removed, since Map passes scalar values in one at a time)

prop_skew_calc <- function(x, y) {
  # x: name of the sample column; y: name of its control column

  # Mode of the sample and mode of its control
  samplemode <- data$CAG[which.max(data[[x]])]
  cntrlmode  <- data$CAG[which.max(data[[y]])]

  # Expand the frequency counts into one element per observation
  obs <- rep(data$CAG, data[[x]])

  # Summary statistics
  sumstats <- as.data.frame(t(unclass(summary(obs))))   # base R's summary()
  sumstats$sd <- sd(obs)                                # summary() does not report sd

  # With psych installed, psych::describe(obs) could be used instead;
  # its columns are named mean and sd rather than Mean

  # Skewness relative to the control mode: (mean - mode) / sd
  skewmode <- (sumstats$Mean - cntrlmode) / sumstats$sd

  # Proportion at or above the control mode (same definition as in the question)
  B <- sum(data[data$CAG >= cntrlmode, x])
  A <- sum(data[data$CAG <= cntrlmode, x])
  prop <- B / (A + B)

  data.frame(samplemode, cntrlmode, sumstats, skewmode, prop = prop)
}

Map call (calls the function above, passing the columns of the ctrls data frame)

# as.character() matters when ctrls holds factors (the default before R 4.0):
# a factor would index data by integer code rather than by column name
dfList <- Map(prop_skew_calc, as.character(ctrls$samples), as.character(ctrls$ctrl))

finaldf <- do.call(rbind, dfList)
finaldf[, c("samplemode", "cntrlmode", "Mean", "skewmode", "prop")]
# Selected columns, rounded to three decimals:
#     samplemode cntrlmode   Mean skewmode  prop
# A01         13        13 13.052    0.157 0.507
# A02         15        13 14.667    2.452 1.000
# A03         14        14 14.308    0.632 0.590
# A04         15        14 14.565    0.686 0.733
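For comparison, the control-mode lookup can also be sketched without a per-sample function, by building a named vector of modes and indexing it with the control names. This is an alternative sketch rather than part of the answer above; the names `modes`, `ctrl_modes`, `means`, `sds` and `skew_vs_ctrl` are introduced here for illustration.

```r
data <- data.frame(CAG = c(13, 14, 15, 17),
                   A01 = c(6485, 35, 132, 12),
                   A02 = c(0, 42, 56, 4),
                   A03 = c(33, 5014, 2221, 18),
                   A04 = c(106, 89, 436, 11))
ctrls <- data.frame(samples = c("A01", "A02", "A03", "A04"),
                    ctrl    = c("A01", "A01", "A03", "A03"))

samples  <- as.character(ctrls$samples)
controls <- as.character(ctrls$ctrl)

# Mode of every sample column, as a vector named by column
modes <- sapply(data[-1], function(counts) data$CAG[which.max(counts)])

# Indexing the named vector by control name gives each sample's control mode
ctrl_modes <- unname(modes[controls])

# Per-column mean and sd of the expanded observations
means <- sapply(data[-1], function(counts) weighted.mean(data$CAG, counts))
sds   <- sapply(data[-1], function(counts) sd(rep(data$CAG, counts)))

# (mean - control mode) / sd, computed for all samples at once
skew_vs_ctrl <- (means[samples] - ctrl_modes) / sds[samples]

ctrl_modes  # 13 13 14 14
```

Indexing by name keeps the lookup correct even if the rows of ctrls are in a different order from the columns of data.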

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/44858771
