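I have a data frame with one ID column and several numeric columns containing density measurements. To make the densities approximately normally distributed I need to log-transform them, but because some of my density values are 0, I need to add 0.5 to every density measurement so that the log transform does not produce -Inf data points. How can I do this with dplyr?
Example data: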
ID `Image Tag` `CD3 Global Den… `CD8 Global Den… `CD20 Global De… `CD3 Tumour Den… `CD8 Tumour Den…
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 IM_10 NA 608. 755. 51.0 868. 1066.
2 IM_1… NA 27.5 69.3 0.550 30.4 75.2
3 IM_1… NA 19.6 17.0 1.03 53.2 42.0
4 IM_1… NA 109. 89.0 47.7 725. 594.
5 IM_1… NA 219. 171. 0.501 531. 416.
6 IM_1… NA 4.00 0 0 5.94 0
I tried:
df1 <- df %>% group_by(ID) %>%
  summarise_all(funs(mean(., na.rm=TRUE))) %>%
  mutate_at(which(sapply(., is.numeric)), funs(sum(0.5)))
But this replaces all of my numeric columns with 0.5 instead of adding 0.5 to the original densities.
ID `Image Tag` `CD3 Global Den… `CD8 Global Den… `CD20 Global De… `CD3 Tumour Den… `CD8 Tumour Den…
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 IM_10 0.5 0.5 0.5 0.5 0.5 0.5
2 IM_1… 0.5 0.5 0.5 0.5 0.5 0.5
3 IM_1… 0.5 0.5 0.5 0.5 0.5 0.5
4 IM_1… 0.5 0.5 0.5 0.5 0.5 0.5
5 IM_1… 0.5 0.5 0.5 0.5 0.5 0.5
6 IM_1… 0.5 0.5 0.5 0.5 0.5 0.5
Do you know how to do this?
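A note on why the attempt returns 0.5 everywhere: inside funs(), the column is referred to as `.`, and sum(0.5) never mentions it, so every column is replaced by the constant 0.5. A minimal base R sketch of the difference, using a made-up vector `x` (hypothetical values, not taken from the data above):

```r
# Hypothetical density values, including a zero
x <- c(4.00, 0, 5.94)

# What the attempt computes: the sum of the single number 0.5,
# which does not depend on x at all
constant <- sum(0.5)

# What was intended: add 0.5 to every element of x
shifted <- x + 0.5

# log() of the raw values gives -Inf for the zero,
# while log() of the shifted values stays finite
log_raw     <- log(x)
log_shifted <- log(shifted)
```

Within mutate_at(), the equivalent of the intended line would be a function of `.`, for example `~ . + 0.5`.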
Answered 2019-04-12 19:41:40
I assume you want to summarise by ID and then add 0.5 to every value that is not NA. This is how I would do it:
# Sample data
df <- structure(list(ID = c("IM_10", "IM_11", "IM_12", "IM_13", "IM_14",
"IM_15"), Image_Tag = c(NA, NA, NA, NA, NA, NA), CD3_Global_Den = c(608,
27.5, 19.6, 109, 219, 4), CD8_Global_Den = c(755, 69.3, 17, 89,
171, 0), CD20_Global_De = c(51, 0.55, 1.03, 47.7, 0.501, 0),
CD3_Tumour_Den = c(868, 30.4, 53.2, 725, 531, 5.94), CD8_Tumour_Den = c(1066,
75.2, 42, 594, 416, 0)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"), .Names = c("ID", "Image_Tag", "CD3_Global_Den",
"CD8_Global_Den", "CD20_Global_De", "CD3_Tumour_Den", "CD8_Tumour_Den"
))
# Suggested code
library(hablar)
library(dplyr)
options(pillar.sigfig = 6)
df %>% group_by(ID) %>%
  summarise_all(~mean_(.)) %>%
  mutate_at(vars(-ID), ~. + 0.5)
This gives:
# A tibble: 6 x 7
ID Image_Tag CD3_Global_Den CD8_Global_Den CD20_Global_De CD3_Tumour_Den CD8_Tumour_Den
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 IM_10 NA 608.5 755.5 51.5 868.5 1066.5
2 IM_11 NA 28 69.8 1.05 30.9 75.7
3 IM_12 NA 20.1 17.5 1.53 53.7 42.5
4 IM_13 NA 109.5 89.5 48.2 725.5 594.5
5 IM_14 NA 219.5 171.5 1.00100 531.5 416.5
6 IM_15 NA 4.5 0.5 0.5 6.44 0.5
Answered 2021-07-27 01:55:01
If you just want to add 1 (map_if() is from purrr):
df %>% map_if(is.numeric, ~ . + 1)
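In dplyr 1.0.0 and later, summarise_all() and mutate_at() are superseded by across(). Below is a sketch of the same idea that also applies the log transform the question is after. It uses a small stand-in tibble, `df_demo`, with hypothetical values so that the block runs by itself; the column names are borrowed from the sample data, but the numbers are not.

```r
library(dplyr)

# Stand-in data (hypothetical values), two rows per ID
df_demo <- tibble(
  ID             = c("IM_10", "IM_10", "IM_15", "IM_15"),
  CD3_Global_Den = c(600, 616, 4, NA),
  CD8_Global_Den = c(755, 755, 0, 0)
)

result <- df_demo %>%
  group_by(ID) %>%
  # mean per ID for every numeric column, ignoring NA values
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE))) %>%
  # add 0.5 so zeros do not become -Inf, then take the natural log
  mutate(across(where(is.numeric), ~ log(.x + 0.5)))
```

Two things to keep in mind. mean(x, na.rm = TRUE) returns NaN for a column that is entirely NA within a group, so such columns may need separate handling. And purrr::map_if() returns a list rather than a data frame, so the one-liner above would need something like as_tibble() afterwards to get a tibble back.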
https://stackoverflow.com/questions/55639511