
How do I filter data in R for a data visualization?

Stack Overflow user
Asked 2020-10-17 23:07:18
Answers: 1 · Views: 30 · Followers: 0 · Votes: 0

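I need to recreate a visualization, but with the data filtered so that only businesses with a term of less than 360 months are kept.

The data I am using is the SBA data from the following link: https://amstat.tandfonline.com/doi/full/10.1080/10691898.2018.1434342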

library('magrittr')
library(dplyr)
library(tidyr)
library(ggplot2)
sba2 <- sba_data %>% 
  mutate(
    # "CHGOFF" means the loan was charged off, i.e. it defaulted
    default_binary = ifelse(MIS_Status == "CHGOFF", "Default", "Paid in Full"),
    # Term is in months; approximate it in days
    daysterm = Term * 30,
    # Refer to the column directly inside mutate() rather than via sba_data$
    xx = as.Date(DisbursementDate, format = "%Y-%m-%d") + daysterm,
    recession_binary = ifelse(xx >= as.Date("2007-12-01") & xx <= as.Date("2009-06-30"),
                              "Active during Recession", "Not Active during Recession"),
    smaller_business_binary = ifelse(NoEmp < 30, "Very Small Biz", "Not Very Small Biz"),
    business_length = ifelse(Term < 360, "Short Business", "Long Business")
  )

table(sba2$business_length)

sba_3 <-  sba2 %>%
  group_by(recession_binary, default_binary) %>%
  summarise(frequencies=n()) %>%
  drop_na() %>%
  mutate(percents = round(frequencies/sum(frequencies),2))

ggplot(data = sba_3 ) + 
  geom_col(mapping = aes(x = recession_binary, y = percents, fill = default_binary)) +
  coord_flip() +
  scale_fill_manual(breaks = c("Default", "Paid in Full"), 
                       values=c(rgb(232/255,74/255,39/255), rgb(19/255,41/255,75/255))) +
  scale_y_continuous(labels = scales::percent)

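This is the code I have used so far to recreate the visualization. However, I am not sure how to filter the data so that only businesses with a term of less than 360 months are kept. I created the variable business_length when building sba2, but I am not sure what the next step is. Any help would be appreciated, thanks!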

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-10-17 23:17:04

Something like this?

sba_3 <-  sba2 %>%
  filter(Term < 360) %>% 
  group_by(recession_binary, default_binary) %>%
  summarise(frequencies=n()) %>%
  drop_na() %>%
  mutate(percents = round(frequencies/sum(frequencies),2))
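
To make the effect of the filter step easier to see, here is a minimal sketch on a small made-up data frame. The `toy` data and its values are hypothetical and only stand in for `sba2`; the column names are borrowed from the question. It suggests that filtering on `Term < 360` and filtering on the derived `business_length` label should select the same rows, so either could be placed before `group_by()`.

```r
library(dplyr)

# Hypothetical toy data standing in for sba2 (values invented for illustration)
toy <- data.frame(
  Term = c(12, 84, 360, 240, 400, 60),
  recession_binary = c("Active during Recession", "Not Active during Recession",
                       "Active during Recession", "Active during Recession",
                       "Not Active during Recession", "Not Active during Recession"),
  default_binary = c("Default", "Paid in Full", "Paid in Full",
                     "Paid in Full", "Default", "Paid in Full")
) %>%
  mutate(business_length = ifelse(Term < 360, "Short Business", "Long Business"))

# Option 1: filter on the numeric column directly, as in the answer
short_a <- toy %>% filter(Term < 360)

# Option 2: filter on the label created in the question
short_b <- toy %>% filter(business_length == "Short Business")

# Summarise the filtered rows the same way the question does
short_summary <- short_a %>%
  group_by(recession_binary, default_binary) %>%
  summarise(frequencies = n(), .groups = "drop_last") %>%
  mutate(percents = round(frequencies / sum(frequencies), 2)) %>%
  ungroup()

print(short_summary)
```

Note that `filter()` also drops rows where the condition evaluates to `NA`, so loans with a missing `Term` would be excluded under either option.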
Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/64403934
