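I am trying to write code that counts observations based on a condition, and I am not sure whether this is possible. What I want is to count only one observation per group instead of adding them all up.
Here is the data frame: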
df <- structure(list(ID = c("P40", "P40", "P40", "P40", "P42", "P42",
"P43", "P43", "P43"), Year = c("2013", "2013", "2014", "2015", "2013", "2014", "2014", "2014", "2014"),
Meeting = c("Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes")),
class = "data.frame", row.names = c(NA, -9L))
ID Year Meeting
P40 2013 Yes
P40 2013 Yes
P40 2014 Yes
P40 2015 Yes
P42 2013 Yes
P42 2014 Yes
P43 2014 Yes
P43 2014 Yes
P43 2014 Yes

The result I want to achieve is:
ID Year Count
P40 2013 1
P40 2014 1
P40 2015 1
P42 2013 1
P42 2014 1
P43 2014 1

This is the code I have so far; it counts every observation in each group:
df %>% group_by(ID, Year) %>% summarise(Count = n())

Answered 2019-07-25 22:45:29
What you want is:
library(dplyr)
count(df %>% distinct(ID, Year), ID, Year, name = 'Count')

Output:
# A tibble: 6 x 3
ID Year Count
<chr> <chr> <int>
1 P40 2013 1
2 P40 2014 1
3 P40 2015 1
4 P42 2013 1
5 P42 2014 1
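6 P43 2014 1

Answered 2019-07-25 22:46:04
We can apply distinct to the dataset and then use count. Called without arguments, distinct drops rows that are duplicated across all columns, which is enough here because Meeting is always "Yes":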
library(dplyr)
df %>%
distinct %>%
count(ID, Year)
# A tibble: 6 x 3
# ID Year n
# <chr> <chr> <int>
#1 P40 2013 1
#2 P40 2014 1
#3 P40 2015 1
#4 P42 2013 1
#5 P42 2014 1
#6 P43 2014 1

Or with data.table:
library(data.table)
unique(setDT(df[1:2]))[, .N, .(ID, Year)]

Or with base R:
subset(as.data.frame(table(unique(df[1:2]))), Freq != 0)

Or, as an option with cbind:
cbind(unique(df[1:2]), n = 1)

Answered 2019-07-25 23:05:23
Since you only want one observation per group, isn't this simply:
transform(unique(df), count = 1)
# ID Year Meeting count
#1 P40 2013 Yes 1
#3 P40 2014 Yes 1
#4 P40 2015 Yes 1
#5 P42 2013 Yes 1
#6 P42 2014 Yes 1
#7 P43 2014 Yes 1

Or, if only selected columns should be checked:
transform(unique(df[1:2]), count = 1)

https://stackoverflow.com/questions/57204691
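
To make the difference between the question's attempt and the answers concrete, here is a minimal base R sketch. It rebuilds the question's data, counts every row per (ID, Year) group, and then compares that with deduplicating first. The names all_rows and one_per_group are illustrative only and do not come from the thread.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID = c("P40", "P40", "P40", "P40", "P42", "P42", "P43", "P43", "P43"),
  Year = c("2013", "2013", "2014", "2015", "2013", "2014", "2014", "2014", "2014"),
  Meeting = rep("Yes", 9),
  stringsAsFactors = FALSE
)

# Counting every row per (ID, Year) group, like summarise(Count = n()):
# P40/2013 gets 2 and P43/2014 gets 3 because of the repeated rows
all_rows <- aggregate(Meeting ~ ID + Year, data = df, FUN = length)
names(all_rows)[3] <- "Count"

# Dropping duplicate (ID, Year) pairs first leaves one row per group,
# so every group can simply be given a count of 1
one_per_group <- transform(unique(df[c("ID", "Year")]), Count = 1L)

print(all_rows)
print(one_per_group)
```

Selecting the columns by name, as in df[c("ID", "Year")], is equivalent to df[1:2] here but does not depend on column order.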