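hold <- c(0.001, 0.5, 0.1)
df$a df$b df$c - lumped based on the frequency of levels below the first threshold
df$x df$y df$x - lumped based on the frequency of levels below the second threshold
df$ df$e df$f - lumped based on the frequency of levels below the third threshold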
Posted on 2022-04-01 15:24:44
With Andreas's suggestion and some further reading, I came up with the approach below, which works well. Thanks.
library(forcats)  # provides fct_lump_prop()

# One lumping threshold (a proportion) per group of factor columns
agg_cats_thresholds <- c(0.01, 0.05, 0.005, 0.001)

# Create the lists of variables
factor_columns1 <- c("a", "b", "c", "d", "e")
factor_columns2 <- c("f")
factor_columns3 <- c("g")
factor_columns4 <- c("h", "i", "j", "k")

# Use fct_lump_prop to reduce the levels of the various factor variables:
# levels rarer than the group's threshold are collapsed into 'other'
churn.ml[factor_columns1] <- lapply(churn.ml[factor_columns1],
                                    fct_lump_prop,
                                    prop = agg_cats_thresholds[1],
                                    other_level = "other")
churn.ml[factor_columns2] <- lapply(churn.ml[factor_columns2],
                                    fct_lump_prop,
                                    prop = agg_cats_thresholds[2],
                                    other_level = "other")
churn.ml[factor_columns3] <- lapply(churn.ml[factor_columns3],
                                    fct_lump_prop,
                                    prop = agg_cats_thresholds[3],
                                    other_level = "other")
https://stackoverflow.com/questions/71693247
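The per-group calls in this answer all follow the same pattern, so they could probably be folded into a single loop that pairs each column group with its threshold. The following is only a minimal sketch under stated assumptions: `toy`, `column_groups` and `group_thresholds` are hypothetical names and made-up data standing in for `churn.ml` and the real column lists, and it assumes the forcats package is installed. The same loop would also cover a fourth group such as `factor_columns4`, which the code above defines but does not use.

```r
library(forcats)

# Hypothetical toy data standing in for churn.ml (not from the original post)
toy <- data.frame(
  a = factor(c(rep("x", 90), rep("y", 6), rep("z", 2), rep("w", 2))),
  f = factor(c(rep("p", 50), rep("q", 44), rep("r", 3), rep("s", 3)))
)

# Each group of columns is paired, by position, with its own threshold
column_groups    <- list(c("a"), c("f"))
group_thresholds <- c(0.05, 0.04)

# Apply fct_lump_prop to every column of every group with that group's threshold
for (i in seq_along(column_groups)) {
  cols <- column_groups[[i]]
  toy[cols] <- lapply(toy[cols], fct_lump_prop,
                      prop = group_thresholds[i],
                      other_level = "other")
}

# Inspect the remaining levels of each column
print(levels(toy$a))
print(levels(toy$f))
```

Keeping the groups and thresholds in two parallel objects means a new group only needs one extra entry in each, rather than another copy of the `lapply` call.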