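I tried running the code below to get two side-by-side histograms comparing wages between people who have never been married and people who are married. The code does work, but I was told this is the wrong way to get the answer. I'm not sure how to filter the original Wage dataset by marital status so that it includes only Married and Never Married, without creating a new dataset.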
#Create a histogram to summarize the wage column ONLY for Married people
Married <- subset(Wage,Wage$maritl=="2. Married")
#Using facets create two side by side wage histograms for married vs never married people
Never_Married <- subset(Wage,Wage$maritl=="1. Never Married")
hist(Never_Married$wage)
Married_NeverMarried <- full_join(Married,Never_Married)
ggplot(Married_NeverMarried,aes(x=wage))+
geom_histogram(binwidth=20,color="black",fill="white")+
facet_grid(.~maritl)
The code below seems to work using the original Wage dataset, but it gives me every marital status instead of only the two I want.
ggplot(Wage,aes(x=wage))+
geom_histogram()+
facet_grid(.~ maritl)
Thank you!!
Posted on 2020-05-22 03:35:48
We can use %in% to subset the rows, then plot with the original code from the OP's post
library(ISLR)
data(Wage)
library(ggplot2)
library(dplyr)
Wage %>%
select(wage, maritl) %>%
filter(maritl %in% c("2. Married", "1. Never Married")) %>%
ggplot(aes(x = wage)) +
geom_histogram(binwidth=20,color="black",fill="white")+
facet_grid(.~maritl)
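To see what the `%in%` filter does to the `maritl` factor, here is a minimal base R sketch. `toy_wage` and its values are made up for illustration and only stand in for `ISLR::Wage`; the level labels follow the ones used in the question.

```r
# Toy stand-in for ISLR::Wage (hypothetical values, only the two columns used here)
toy_wage <- data.frame(
  wage   = c(75, 130, 98, 154, 81, 118),
  maritl = factor(c("1. Never Married", "2. Married", "3. Widowed",
                    "2. Married", "4. Divorced", "1. Never Married"))
)

keep <- c("1. Never Married", "2. Married")

# Logical row index: TRUE where maritl is one of the two statuses
two_groups <- toy_wage[toy_wage$maritl %in% keep, ]

# The other rows are gone, but the factor still carries all of its levels
table(two_groups$maritl)

# droplevels() removes the now-empty levels if a clean factor is wanted
two_groups <- droplevels(two_groups)
levels(two_groups$maritl)
```

In the plot itself this extra step should not be needed, because `facet_grid()` drops unused factor levels by default (`drop = TRUE`), so only the two remaining panels are drawn.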

Posted on 2020-05-22 03:44:49
One approach is to subset Wage inside the call to ggplot
library(ISLR)
data(Wage)
library(ggplot2)
ggplot(Wage[grepl("1|2",Wage$maritl),] ,aes(x=wage)) +
geom_histogram() +
facet_grid(.~ maritl)

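Another "tidyverse" option is to use dplyr's filter together with str_detect from stringr
library(dplyr)
library(stringr) # provides str_detect
Wage %>%
dplyr::filter(str_detect(maritl,"1|2")) %>%
ggplot(aes(x=wage)) +
geom_histogram() +
facet_grid(.~ maritl)
This produces exactly the same graph.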
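The pattern `"1|2"` matches any label that contains a 1 or a 2 anywhere in the string. That is enough here because the statuses are numbered 1 to 5, but an anchored pattern is a little stricter. A small base R sketch of the difference follows; the `"12. Other"` label is hypothetical and does not exist in the Wage data.

```r
# Marital status labels as they appear in the Wage data
labels <- c("1. Never Married", "2. Married", "3. Widowed",
            "4. Divorced", "5. Separated")

# Unanchored: TRUE wherever a "1" or "2" appears anywhere in the label
loose  <- grepl("1|2", labels)

# Anchored: the label must start with "1." or "2."
strict <- grepl("^[12]\\.", labels)

# Both patterns select the same two labels for this data
labels[loose]
labels[strict]

# They differ on a hypothetical label such as "12. Other"
loose_other  <- grepl("1|2", "12. Other")       # TRUE
strict_other <- grepl("^[12]\\.", "12. Other")  # FALSE
```

The anchored pattern can be used in place of `"1|2"` in either `grepl()` or `str_detect()`.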
https://stackoverflow.com/questions/61942448