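I am trying to summarize a data file to create two summaries by location:
the number of orders where only QUOT or QUOG appears, and the number of orders where QUOT or QUOG appears together with other Holds. Here is the start of the code: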
library(dplyr)
dat <- data.frame(Order = c(123,123,123,145,145,189,210,210,123,123,164),
Location = c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Charlotte","Charlotte","Charlotte"),
Hold = c("QUOT","ENGR","VEND","QUOG","ENGR","QUOT","ENGR","VEND","QUOT","CUST","QUOT")
)
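test <- dat %>%
  group_by(Order, Location) %>%
  .....
I have been trying to work out whether a given order has only QUOT or QUOG, and then whether it has QUOT or QUOG together with other holds, and so on.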
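Expected output:
   Location Only Multiple
1   Chicago    1        2
2 Charlotte    1        1
So, for the expected output:
Order 123 in Chicago has QUOT and other holds (ENGR & VEND), so it counts as Multiple for Chicago.
Order 145 has QUOG and another hold (ENGR), so it counts as Multiple for Chicago.
Order 189 has QUOT and no other holds, so it counts as Only for Chicago.
Order 210 has neither QUOT nor QUOG, so it is excluded from both counts.
Order 123 in Charlotte has QUOT and another hold (CUST), so it counts as Multiple for Charlotte.
Order 164 has QUOT and no other holds, so it counts as Only for Charlotte.
Posted on 2020-02-25 22:15:38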
Here is another solution using dplyr and tidyr. This time, pivot the data first, then filter and summarise to get your result.
library(dplyr)
library(tidyr)
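dat.summary <- dat %>%
  # One row per Order/Location and one column per Hold code (1 if present, NA if absent)
  mutate(hold_count = 1) %>%
  pivot_wider(names_from = Hold, values_from = hold_count) %>%
  # only: QUOT or QUOG with no other hold; multiple: QUOT or QUOG plus at least one other hold
  mutate(only = if_else((QUOT == 1 | QUOG == 1) & is.na(ENGR) & is.na(VEND) & is.na(CUST), 1, 0),
         multiple = if_else((QUOT == 1 | QUOG == 1) & (ENGR == 1 | VEND == 1 | CUST == 1), 1, 0)) %>%
  group_by(Location) %>%
  # Comparisons against absent holds yield NA; na.rm = TRUE ignores those rows when summing
  summarise(only = sum(only, na.rm = TRUE), multiple = sum(multiple, na.rm = TRUE))
dat.summary
gives you: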
# A tibble: 2 x 3
Location only multiple
<fct> <dbl> <dbl>
1 Charlotte 1 1
2 Chicago 1 2
Data
dat <- data.frame(
Order = c(123,123,123,145,145,189,210,210,123,123,164),
Location = c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Charlotte","Charlotte","Charlotte"),
Hold = c("QUOT","ENGR","VEND","QUOG","ENGR","QUOT","ENGR","VEND","QUOT","CUST","QUOT")
)
https://stackoverflow.com/questions/60382037
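The pivoted solution names every non-quote hold (ENGR, VEND, CUST) explicitly, so a new hold code would need another condition. As a rough sketch of an alternative, not part of the original answer, the same rules could be expressed with grouped summaries only. The names quote_holds, per_order, has_quote, has_other and alt_summary are made up for this illustration.

```r
library(dplyr)

# Same sample data as in the question.
dat <- data.frame(
  Order = c(123, 123, 123, 145, 145, 189, 210, 210, 123, 123, 164),
  Location = c("Chicago", "Chicago", "Chicago", "Chicago", "Chicago", "Chicago",
               "Chicago", "Chicago", "Charlotte", "Charlotte", "Charlotte"),
  Hold = c("QUOT", "ENGR", "VEND", "QUOG", "ENGR", "QUOT",
           "ENGR", "VEND", "QUOT", "CUST", "QUOT"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the hold codes that count as "quote" holds.
quote_holds <- c("QUOT", "QUOG")

# Step 1: one row per Location/Order with two flags:
# does the order have a quote hold, and does it have any other hold?
per_order <- dat %>%
  group_by(Location, Order) %>%
  summarise(
    has_quote = any(Hold %in% quote_holds),
    has_other = any(!(Hold %in% quote_holds))
  ) %>%
  ungroup()

# Step 2: count orders per Location.
# Orders without a quote hold (e.g. 210) fall into neither column.
alt_summary <- per_order %>%
  group_by(Location) %>%
  summarise(
    only = sum(has_quote & !has_other),
    multiple = sum(has_quote & has_other)
  )

print(alt_summary)
```

On the sample data this gives the same counts as the pivoted version (Charlotte: 1 only, 1 multiple; Chicago: 1 only, 2 multiple), and the intermediate per_order table can be inspected to see how each order was classified.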