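If rows share the same ID and Date and have Action = 6, I would like to create a new column called PurchaseID that holds a running number, starting at 1 and continuing to the end of the data.
So the rows with ID = P40, Date = 26072013 and Action = 6 would get PurchaseID 1. Next, the rows with ID = P42, Date = 01072014 and Action = 6 would get PurchaseID 2, and so on. Any help would be much appreciated. Thanks a lot!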
library(dplyr)
library(lubridate)  # needed for dmy()
df <- data.frame(list(ID = c("P40", "P40", "P40", "P40", "P42", "P42"),
                      Date = dmy(c(26072013, 26072013, 2092012, 23082012, 01072014, 01072014))),
                 Action = c("1", "6", "1", "1", "6", "1"))

This is the code I tried, but it clearly doesn't work:
PurchaseID <- c()
for (row in df){
if (length(unique(df$ID))==1 & length(unique(df$date))==1) {
df %>% mutate(PurchaseID = seq(100,by = 1,length.out = nrow(df)))
} else {
PurchaseID <- c(PurchaseID, "NA")
}
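}

UPDATE: Thanks for the comments. Below is the desired output. I am working with about 200K rows, so this is only an excerpt. I would like to create a new column with a rolling value, where the column …
Hope that is clearer now. Thanks a lot!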
library(dplyr)
library(lubridate)
df <- data.frame(list(ID = c("P40", "P40", "P40", "P40", "P42", "P42"),
Date = dmy(c(26072013, 26072013, 2092012, 23082012, 01072014, 01072014)),
Action = c("1", "6", "1", "1", "6", "2"),
PurchaseID = c("NA", "001", "NA", "NA", "002", "NA") ))

Expected output: the PurchaseID column in the data frame above.

Answered 2022-06-03 13:51:50
You can use data.table::rleid:
library(data.table)
# Action is a character column, so compare against "6" rather than the number 6
setDT(df)[Action == "6", PurchaseID := rleid(ID, Date)][]

Output:
    ID       Date Action PurchaseID
1: P40 2013-07-26      1         NA
2: P40 2013-07-26      6          1
3: P40 2012-09-02      1         NA
4: P40 2012-08-23      1         NA
5: P42 2014-07-01      6          2
6: P42 2014-07-01      1         NA

Or, much slower (perhaps the tidyverse wizards know a better way):
bind_rows(
  filter(df, Action != "6"),
  filter(df, Action == "6") %>%
    mutate(PurchaseID = data.table::rleid(ID, Date))
) %>% arrange(ID, Date, Action)

Output:
   ID       Date Action PurchaseID
1 P40 2012-08-23      1         NA
2 P40 2012-09-02      1         NA
3 P40 2013-07-26      1         NA
4 P40 2013-07-26      6          1
5 P42 2014-07-01      1         NA
6 P42 2014-07-01      6          2

https://stackoverflow.com/questions/72490159
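
For readers who would rather stay in dplyr and keep the original row order, here is a minimal sketch that is not part of the answer above. It assumes a purchase is identified by the ID/Date pair on rows where Action is "6", and it numbers those pairs in order of first appearance using base R's match() and unique(). The helper column `key` and the variable `result` are illustrative names only, and the dates are built with as.Date() from strings purely to keep the example self-contained.

```r
library(dplyr)

# Example data from the question (dates written as ISO strings, an assumption
# made here so the sketch does not depend on lubridate)
df <- data.frame(
  ID     = c("P40", "P40", "P40", "P40", "P42", "P42"),
  Date   = as.Date(c("2013-07-26", "2013-07-26", "2012-09-02",
                     "2012-08-23", "2014-07-01", "2014-07-01")),
  Action = c("1", "6", "1", "1", "6", "2")
)

result <- df %>%
  mutate(
    # Build an ID/Date key only for purchase rows; every other row gets NA
    key = if_else(Action == "6", paste(ID, Date), NA_character_),
    # Number the distinct keys in order of first appearance; NA keys stay NA
    PurchaseID = match(key, unique(key[!is.na(key)]))
  ) %>%
  select(-key)

print(result)
```

One design difference to be aware of: rleid() starts a new number whenever the ID/Date run changes, so it relies on matching rows being adjacent, whereas this sketch gives the same PurchaseID to every Action "6" row with the same ID and Date wherever it appears. Which behavior is wanted depends on the data.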