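I wrote a function named getExpressionLevel, and the exercise asks me to use it to replace the numbers in my data.frame with the labels below. What do I need to do to achieve this?
The getExpressionLevel function: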
function(a) {
if (a<5) {
cat ("none")
}
if (a>=5&a<20) {
cat ("low")
}
if (a>=20&a<60) {
cat ("medium")
}
if (a>=60) {
cat ("high")
}
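}
The task is:
Create a data.frame named expression_levels with 10 rows (one per gene) and 3 columns (one per cell line). Then compute the mean expression of each gene in each cell line and use the getExpressionLevel function to label the corresponding expression level.
This is my current data.frame. Its values need to be replaced with the results of getExpressionLevel.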
genename Kc167 BG3 S2
1 Clic 7.333333 48.33333 75.00000
2 Treh 24.666667 12.66667 52.33333
3 bib 31.333333 79.33333 82.00000
4 CalpC 65.000000 69.33333 63.66667
5 tud 59.666667 81.66667 16.33333
6 cort 74.333333 50.66667 28.66667
7 S2P 72.000000 39.66667 50.66667
8 Mitofilin 38.333333 29.00000 54.66667
9 Oxp 73.666667 49.33333 42.66667
10 Ada1-2 87.333333 42.00000 28.00000
This is the expected data.frame:
Kc167 BG3 S2
Clic low medium high
Treh medium low medium
bib medium high high
CalpC high high high
tud medium high low
cort high medium medium
S2P high medium medium
Mitofilin medium medium medium
Oxp high medium medium
Ada1-2 high medium medium
Answered 2018-02-25 18:56:45
The function way. It always helps to know how to use functions.
## sample data
library(data.table)
df <- data.table(genename = c('Clic','Treh','bib','CalpC'),
                 Kc167 = c(7.333,24.666,31.3333,65),
                 BG3 = c(48.33,12.66,79.33,69.33),
                 S2 = c(75.00,52.33,82.00,63.66))
## this function maps a single value to a label based on the following criteria
get_values <- function(x)
{
  if (x < 5) return('none')
  else if ((x >= 5) && (x < 20)) return('low')
  else if ((x >= 20) && (x < 60)) return('medium')
  else if (x >= 60) return('high')
}
## creating a new data.table with the answers
## (start from a one-column table: df$genename alone is a plain character vector)
df2 <- data.table(genename = df$genename)
## get_values handles one value at a time, so sapply applies it element by element
df2$Kc167 <- sapply(df$Kc167, get_values)
df2$BG3 <- sapply(df$BG3, get_values)
df2$S2 <- sapply(df$S2, get_values)
df2
genename Kc167 BG3 S2
1: Clic low medium high
2: Treh medium low medium
3: bib medium high high
4: CalpC high high high
Answered 2018-02-25 18:42:29
bin_breaks <- c(-Inf, 5, 20, 60, Inf)
bin_labels <- c("none", "low", "medium", "high")
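# Bin every column except genename; right = FALSE makes the intervals
# left-closed, e.g. [5, 20), so a value of exactly 5 is labelled "low".
df[,-1] <- sapply(df[,-1], function(x) cut(x,
                                           breaks = bin_breaks,
                                           labels = bin_labels,
                                           right = FALSE))
df
The output is as follows: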
genename Kc167 BG3 S2
1 Clic low medium high
2 Treh medium low medium
3 bib medium high high
4 CalpC high high high
5 tud medium high low
6 cort high medium medium
7 S2P high medium medium
8 Mitofilin medium medium medium
9 Oxp high medium medium
10 Ada1-2 high medium medium
Sample data:
df <- structure(list(genename = c("Clic", "Treh", "bib", "CalpC", "tud",
"cort", "S2P", "Mitofilin", "Oxp", "Ada1-2"), Kc167 = c(7.333333,
24.666667, 31.333333, 65, 59.666667, 74.333333, 72, 38.333333,
73.666667, 87.333333), BG3 = c(48.33333, 12.66667, 79.33333,
69.33333, 81.66667, 50.66667, 39.66667, 29, 49.33333, 42), S2 = c(75,
52.33333, 82, 63.66667, 16.33333, 28.66667, 50.66667, 54.66667,
42.66667, 28)), .Names = c("genename", "Kc167", "BG3", "S2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
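The expected result in the question uses the gene names as row names and is called expression_levels, and the exercise asks for getExpressionLevel itself to be used. A minimal sketch of one way to get there, assuming a vectorized rewrite of getExpressionLevel that returns labels instead of printing them with cat() (this rewrite is an illustration, not the asker's original function), and using only the first four genes from the question to keep it short:

```r
# Hypothetical vectorized variant of getExpressionLevel(): it returns a
# character vector of labels rather than printing them with cat().
getExpressionLevel <- function(a) {
  as.character(cut(a,
                   breaks = c(-Inf, 5, 20, 60, Inf),
                   labels = c("none", "low", "medium", "high"),
                   right = FALSE))
}

# First four genes from the question's table.
df <- data.frame(genename = c("Clic", "Treh", "bib", "CalpC"),
                 Kc167 = c(7.333333, 24.666667, 31.333333, 65),
                 BG3 = c(48.33333, 12.66667, 79.33333, 69.33333),
                 S2 = c(75, 52.33333, 82, 63.66667))

# Label every numeric column and move the gene names into the row names.
expression_levels <- data.frame(lapply(df[-1], getExpressionLevel),
                                row.names = df$genename)
print(expression_levels)
```

Because the function works on whole vectors, lapply can pass each column in one call and no per-element sapply is needed.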
Edit: added the appropriate right argument to the code so that it meets the boundary conditions and the OP's requirements (credit to @drf).
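The effect of the right argument can be checked on values that sit exactly on a break. The sketch below reuses the breaks and labels from this answer; the names edge_values, left_closed and right_closed are made up for the illustration:

```r
bin_breaks <- c(-Inf, 5, 20, 60, Inf)
bin_labels <- c("none", "low", "medium", "high")
edge_values <- c(5, 20, 60)  # values lying exactly on a break

# right = FALSE: intervals are [lower, upper), so 5 falls into "low".
left_closed <- as.character(cut(edge_values, breaks = bin_breaks,
                                labels = bin_labels, right = FALSE))

# Default right = TRUE: intervals are (lower, upper], so 5 falls into "none".
right_closed <- as.character(cut(edge_values, breaks = bin_breaks,
                                 labels = bin_labels))

print(left_closed)   # "low" "medium" "high"
print(right_closed)  # "none" "low" "medium"
```

Only the left-closed version agrees with the question's conditions (a >= 5, a >= 20, a >= 60).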
https://stackoverflow.com/questions/48975833