我有一个像这样的数据集
temp <- structure(list(Date = c("23/06/2002", "24/06/2002", "25/06/2002",
"25/06/2002", "26/06/2002", "02/07/2002", "03/07/2002", "24/07/2002",
"15/07/2002", "17/07/2002", "22/07/2002"), payment = c(-1000,
1000, -1000, -1000, 1000, -1000, -1000, -1000, 1200, 1200, 200
), Code = c("M567", "M567", "M567", "M567", "XYZ", "M567",
"ABX", "M567", "M567", "M567", "M300"), ID = c("187", "98",
"187", "187", "12ee", NA, NA, NA, "111", "111", "11")), class = c("data.table",
"data.frame"), row.names = c(NA, -11L), groups = structure(list(
assignment = c("ABX", "M300", "M567", "XYZ"), .rows = structure(list(
7L, 11L, c(1L, 2L, 3L, 4L, 6L, 8L, 9L, 10L), 5L), ptype = integer(0), class =
c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -4L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), .internal.selfref = <pointer: 0x0000020274961ef0>)输入:
Date payment Code ID
1: 23/06/2002 -1000 M567 187
2: 24/06/2002 1000 M567 98
3: 25/06/2002 -1000 M567 187
4: 25/06/2002 -1000 M567 187
5: 26/06/2002 1000 XYZ 12ee
6: 02/07/2002 -1000 M567 <NA>
7: 03/07/2002 -1000 ABX <NA>
8: 24/07/2002 -1000 M567 <NA>
9: 15/07/2002 1200 M567 111
10: 17/07/2002 1200 M567 111
11: 22/07/2002 200 M300 11我想删除给sum = 0用于相同的支付和代码列的行(不需要考虑ID列)
例如,: 1000和M567应该与任意-1000取消,而M567剩下的行应该是相同的。
中的+ve和-ve对
预期产出:
Date payment Code ID
1: 25/06/2002 -1000 M567 187
2: 25/06/2002 -1000 M567 187
3: 26/06/2002 1000 XYZ 12ee
4: 02/07/2002 -1000 M567 <NA>
5: 03/07/2002 -1000 ABX <NA>
6: 24/07/2002 -1000 M567 <NA>
7: 15/07/2002 1200 M567 111
8: 17/07/2002 1200 M567 111
9: 22/07/2002 200 M300 11发布于 2022-10-17 10:38:59
以下是一种可能的方法:
library(dplyr)
temp %>%
group_by(group = as.integer(gl(n(), 2,n()))) %>%
mutate(x = sum(payment)) %>%
filter(!(x == 0 & first(Code) == last(Code))) %>%
ungroup() %>%
select(-x, -group) Date payment Code ID
<chr> <dbl> <chr> <chr>
1 25/06/2002 -1000 M567 187
2 25/06/2002 -1000 M567 187
3 26/06/2002 1000 XYZ 12ee
4 02/07/2002 -1000 M567 NA
5 03/07/2002 -1000 ABX NA
6 24/07/2002 -1000 M567 NA
7 15/07/2002 1200 M567 111
8 17/07/2002 1200 M567 111
9 22/07/2002 200 M300 11 发布于 2022-10-17 11:54:04
您可以计算出付款之和为零的代码,将它们放在向量中,并从包含此代码的原始data.frame中删除所有行:
totalpay = aggregate(temp$payment ~ temp$Code, FUN=sum)
zeropay = totalpay[totalpay[,2]==0,1]
temp = temp[!temp$Code %in% zeropay,]更重要的是,您可以保留向量zeropay作为删除代码的文档。
https://stackoverflow.com/questions/74095492
复制相似问题