My dataset is Churn_Modeling.
I want to create a column named c_rating with the following ranges: <500 = "very poor", 500-600 = "poor", 601-660 = "fair", 661-780 = "good", >780 = "excellent".
library(tidyverse)
library(reticulate)
library(readxl)
library(modelr)
library(ggplot2)
library(dplyr)
churn <- read.csv("Churn_Modeling.csv")
churn$CreditScore <- as.numeric(churn$CreditScore)
class(churn$CreditScore)
churn$c_rating <- cut(churn$CreditScore, c(-Inf, 500, 600, 601, 660, 661, 780, Inf),
                      levels=c('<=500', '500-600', '601-660', '661-780', '>780'))
churn$c_rating
My output does not create the c_rating column the way I expected. Any ideas?
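A note on why the cut() call above probably misbehaves: as far as I can tell, cut() takes a `labels` argument rather than `levels`, so the `levels=` vector is likely ignored and the default interval labels are used instead. The eight break points also define seven bins, not five. Below is a minimal sketch of a repaired call, assuming whole-number credit scores. The `scores` vector is made-up sample data, not taken from Churn_Modeling.csv.

```r
# Hypothetical sample scores that hit every boundary (not from the real dataset)
scores <- c(350, 499, 500, 600, 601, 660, 661, 780, 781, 850)

# Five labels need six break points.
# With the default right = TRUE, each bin is (lower, upper], so for whole numbers:
#   (-Inf, 499] -> < 500      (499, 600] -> 500-600    (600, 660] -> 601-660
#   (660, 780]  -> 661-780    (780, Inf] -> > 780
c_rating <- cut(scores,
                breaks = c(-Inf, 499, 600, 660, 780, Inf),
                labels = c("very poor", "poor", "fair", "good", "excellent"))

# Show each score next to its assigned rating
print(data.frame(scores, c_rating))
```

On the real data the same call would presumably be written as churn$c_rating <- cut(churn$CreditScore, breaks = ..., labels = ...). The result is a factor, which keeps the ratings in their natural order for tables and plots.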
Answered 2020-12-12 05:28:19
Use mutate() together with case_when().
library(tidyverse)
churn <- read.csv("Churn_Modeling.csv")
churn <- churn %>%
  mutate(c_rating = case_when(
    CreditScore < 500 ~ "very poor",
    CreditScore >= 500 & CreditScore <= 600 ~ "poor",
    CreditScore >= 601 & CreditScore <= 660 ~ "fair",
    CreditScore >= 661 & CreditScore <= 780 ~ "good",
    CreditScore > 780 ~ "excellent"
  ))
Answered 2020-12-12 05:42:52
Nicolas Ratto's answer is a good one. Another approach is to first write a user-defined function and then apply it with lapply(). Here is an example.
churn <- read.csv("Churn_Modeling.csv")
churn$CreditScore <- as.numeric(churn$CreditScore)
# Map a single credit score to its rating label
C_Rating <- function(score) {
  if (score < 500) {
    rating <- "Very Poor"
  } else if (score >= 500 && score <= 600) {
    rating <- "Poor"
  } else if (score >= 601 && score <= 660) {
    rating <- "Fair"
  } else if (score >= 661 && score <= 780) {
    rating <- "Good"
  } else {
    rating <- "Excellent"
  }
  return(rating)
}
# lapply() returns a list, so unlist() flattens it into a character vector column
churn$c_rating <- unlist(lapply(churn$CreditScore, C_Rating))
https://stackoverflow.com/questions/65258687