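I have searched but have not found this addressed anywhere. I have a data frame of pairwise absolute differences between project sites, which looks like this: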
x y value
1 2 1 5
2 3 1 4
3 4 1 6
4 5 1 3
5 3 2 5
6 4 2 7
7 5 2 3
8 4 3 2
9 5 3 5
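10 5 4 7

where x and y are the paired sites and value is the difference. I want the average for each site, shown separately. For example, the average over all pairs that involve site 5 (values 3, 3, 5, 7) is 4.5, so my result would look like this: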
site avg
1 4.5
2 5
3 4
4 5.5
5 4.5

Does anyone have a solution?
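
To make the requested calculation concrete, here is a minimal base R sketch that rebuilds the sample data and checks the site 5 figure by hand. The names `pairs_df` and `involves_5` are illustrative only and do not come from the original post.

```r
# Sample data from the question; `pairs_df` is a hypothetical name
pairs_df <- data.frame(
  x     = c(2, 3, 4, 5, 3, 4, 5, 4, 5, 5),
  y     = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 4),
  value = c(5, 4, 6, 3, 5, 7, 3, 2, 5, 7)
)

# A pair involves site 5 if 5 appears in either column
involves_5 <- pairs_df$x == 5 | pairs_df$y == 5

pairs_df$value[involves_5]        # 3 3 5 7
mean(pairs_df$value[involves_5])  # 4.5
```

The same test, applied to each site in turn, gives the per-site averages listed in the desired result.
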
Answered 2018-06-02 11:43:19
A solution using dplyr and mapply.
library(dplyr)
data.frame(site = unique(c(df$x, df$y))) %>%
  mutate(mean = mapply(function(v) mean(df$value[df$x == v | df$y == v]), .$site)) %>%
  arrange(site)
# site mean
# 1 1 4.5
# 2 2 5.0
# 3 3 4.0
# 4 4 5.5
# 5 5 4.5

Data:
df <- read.table(text =
" x y value
1 2 1 5
2 3 1 4
3 4 1 6
4 5 1 3
5 3 2 5
6 4 2 7
7 5 2 3
8 4 3 2
9 5 3 5
10 5 4 7",
header = TRUE, stringsAsFactors = FALSE)

Answered 2018-06-02 15:00:58
Here is another option using the tidyverse.
library(tidyverse)
df %>%
  select(x, y) %>%
  unlist %>%
  unique %>%
  sort %>%
  tibble(site = .) %>%
  mutate(avg = map_dbl(site, ~
    df %>%
      filter_at(vars(x, y), any_vars(. == .x)) %>%
      summarise(value = mean(value)) %>%
      pull(value)))
# A tibble: 5 x 2
# site avg
# <int> <dbl>
#1 1 4.5
#2 2 5
#3 3 4
#4 4 5.5
#5 5 4.5

Data:
df <- structure(list(x = c(2L, 3L, 4L, 5L, 3L, 4L, 5L, 4L, 5L, 5L),
y = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L), value = c(5L,
4L, 6L, 3L, 5L, 7L, 3L, 2L, 5L, 7L)), .Names = c("x", "y",
"value"), class = "data.frame",
row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10"))

Answered 2018-06-02 07:50:09
If we name the original example data df:
df$site_pair <- paste(df$x, df$y, sep = "-")
all_sites <- unique(c(df$x, df$y))
site_get_mean <- function(site_name) {
  # Match site_name as a whole ID, so that site 1 does not also match site 10
  yes <- grepl(paste0("(^|-)", site_name, "(-|$)"), df$site_pair)
  mean(df$value[yes])
}
df.new <- data.frame(site = all_sites,
                     avg = sapply(all_sites, site_get_mean))

Result (edited to sort by site name):
> df.new[order(df.new$site), ]
site avg
5 1 4.5
1 2 5.0
2 3 4.0
3 4 5.5
4 5 4.5

https://stackoverflow.com/questions/50654522
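
All three answers loop over the sites and select the matching rows for each one. As a possible alternative, here is a minimal base R sketch, assuming `df` has the columns `x`, `y` and `value` shown in the question: stack the two site columns so that every pair contributes its value once to each of its two sites, then average by site. The names `long` and `result` are illustrative only.

```r
# Sample data from the question
df <- data.frame(
  x     = c(2, 3, 4, 5, 3, 4, 5, 4, 5, 5),
  y     = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 4),
  value = c(5, 4, 6, 3, 5, 7, 3, 2, 5, 7)
)

# Each pair appears twice in `long`: once under its x site, once under its y site
long <- data.frame(site = c(df$x, df$y), value = rep(df$value, times = 2))

# Mean of value within each site; aggregate() returns the groups sorted by site
result <- aggregate(value ~ site, data = long, FUN = mean)
names(result)[2] <- "avg"
result
```

This sketch assumes no row pairs a site with itself (x equal to y); such a row would be counted twice for that site.
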