I need to recreate a visualization, but with the data filtered so that only businesses with a term shorter than 360 months are kept.

The data I am using is the SBA dataset from the following link: https://amstat.tandfonline.com/doi/full/10.1080/10691898.2018.1434342
library('magrittr')
library(dplyr)
library(tidyr)
library(ggplot2)
sba2 <- sba_data %>%
  mutate(
    # "CHGOFF" means the loan was charged off, i.e. it defaulted
    default_binary = ifelse(MIS_Status == "CHGOFF", "Default", "Paid in Full"),
    # approximate the loan term in days (Term is in months)
    daysterm = Term * 30,
    # inside mutate() the column can be referenced directly, without sba_data$
    xx = as.Date(DisbursementDate, format = "%Y-%m-%d") + daysterm,
    recession_binary = ifelse(xx >= as.Date("2007-12-01") & xx <= as.Date("2009-06-30"),
                              "Active during Recession", "Not Active during Recession"),
    smaller_business_binary = ifelse(NoEmp < 30, "Very Small Biz", "Not Very Small Biz"),
    business_length = ifelse(Term < 360, "Short Business", "Long Business")
  )
table(sba2$business_length)
sba_3 <- sba2 %>%
  group_by(recession_binary, default_binary) %>%
  summarise(frequencies = n()) %>%
  drop_na() %>%
  mutate(percents = round(frequencies / sum(frequencies), 2))
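ggplot(data = sba_3) +
  geom_col(mapping = aes(x = recession_binary, y = percents, fill = default_binary)) +
  coord_flip() +
  scale_fill_manual(breaks = c("Default", "Paid in Full"),
                    values = c(rgb(232/255, 74/255, 39/255), rgb(19/255, 41/255, 75/255))) +
  scale_y_continuous(labels = scales::percent)

This is the code I have used so far to recreate the visualization. However, I am not sure how to filter the data so that only businesses with a term shorter than 360 months are kept. I created the variable business_length when building sba2, but I am not sure what the next step is. Any help would be greatly appreciated, thank you!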
Posted on 2020-10-17 23:17:04
Something like this?
sba_3 <- sba2 %>%
  filter(Term < 360) %>%
  group_by(recession_binary, default_binary) %>%
  summarise(frequencies = n()) %>%
  drop_na() %>%
  mutate(percents = round(frequencies / sum(frequencies), 2))

https://stackoverflow.com/questions/64403934
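To see what the filter step does before running it on the full SBA data, here is a minimal sketch on a small made-up data frame. The name `toy` and all of its values are hypothetical stand-ins for `sba2`, not real SBA records. It also shows that filtering on the `business_length` label created in the question should keep the same rows as filtering on `Term` directly.

```r
library(dplyr)

# Hypothetical toy data standing in for sba2 (values invented for illustration)
toy <- data.frame(
  Term = c(60, 84, 360, 120, 400, 36),
  recession_binary = c("Active during Recession", "Active during Recession",
                       "Active during Recession", "Not Active during Recession",
                       "Not Active during Recession", "Not Active during Recession"),
  default_binary = c("Default", "Paid in Full", "Default",
                     "Paid in Full", "Paid in Full", "Paid in Full")
) %>%
  mutate(business_length = ifelse(Term < 360, "Short Business", "Long Business"))

# Option 1: filter on the raw Term column, as in the answer
by_term <- toy %>% filter(Term < 360)

# Option 2: filter on the derived label from the question
by_label <- toy %>% filter(business_length == "Short Business")

# Summarise the filtered rows; percents are computed within each recession group
summary_short <- by_term %>%
  group_by(recession_binary, default_binary) %>%
  summarise(frequencies = n(), .groups = "drop_last") %>%
  mutate(percents = round(frequencies / sum(frequencies), 2))

print(summary_short)
```

Note that a term of exactly 360 months is excluded by `Term < 360`; use `<=` if those loans should be kept. The `.groups = "drop_last"` argument only makes the default grouping behaviour of `summarise()` explicit and assumes dplyr 1.0.0 or later.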