I am doing exploratory data analysis on the relationship between goodreads_score and three boolean predictors: fiction, best_seller, and english.
set.seed(1)
N <- 100
p <- rep(0.5, N)
id <- seq_len(N)  # one id per row; c(1, N) would only give the two values 1 and 100
fiction <- factor(rbinom(length(p), 1, p))
best_seller <- factor(rbinom(length(p), 1, p))
english <- factor(rbinom(length(p), 1, p))
goodreads_score <- runif(100, 0, 5)
df <- data.frame(id, fiction, best_seller, english, goodreads_score)

I know how to draw a boxplot for a single predictor:
library(ggplot2)
ggplot(df, aes(x = fiction, y = goodreads_score)) +
  geom_boxplot(colour = "#3366FF", outlier.shape = NA) +  # hide outliers; the jittered points already show them
  geom_jitter(position = position_jitter(width = .1, height = 0.1))

Is it possible to put all three variables in a single plot, as three side-by-side groups?
Posted on 2021-05-17 22:59:06
This may do what you need:
# Reshape to long format: one row per (book, category) pair
df2 <- tidyr::gather(df, key = category, value = value,
                     fiction:english, factor_key = TRUE)
ggplot(df2, aes(x = value, y = goodreads_score)) +
  geom_boxplot(colour = "#3366FF", outlier.shape = NA) +
  geom_jitter(position = position_jitter(width = .1, height = 0.1)) +
  facet_wrap(~category, scales = "free")  # one panel per category

On the other hand, if you want all categories in one panel, grouped by indicator value, you can do the following:
df2 <- tidyr::gather(df, key = category, value = value,
                     fiction:english, factor_key = TRUE)
# Build a combined label such as "fiction:0" and fix its order on the x axis
df2$cat_value <- paste0(df2$category, ":", df2$value)
df2$cat_value <- factor(df2$cat_value,
                        levels = c("fiction:0", "best_seller:0", "english:0",
                                   "fiction:1", "best_seller:1", "english:1"))
ggplot(df2, aes(x = cat_value, y = goodreads_score)) +
  geom_boxplot(colour = "#3366FF", outlier.shape = NA) +
  geom_jitter(position = position_jitter(width = .1, height = 0.1)) +
  # dashed line separating the 0 group from the 1 group
  geom_vline(xintercept = 3.5, color = "red", linetype = "dashed", size = 1.4)
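
A third variant, not part of the original answer, is sketched below: it keeps one x-axis position per variable and dodges the 0/1 boxes side by side using the fill aesthetic. It assumes a recent tidyr (for pivot_longer, which supersedes gather) and ggplot2's position_jitterdodge; the names df_long and p are only illustrative.

```r
library(ggplot2)
library(tidyr)

# Same simulated data as in the question
set.seed(1)
N <- 100
df <- data.frame(
  id = seq_len(N),
  fiction = factor(rbinom(N, 1, 0.5)),
  best_seller = factor(rbinom(N, 1, 0.5)),
  english = factor(rbinom(N, 1, 0.5)),
  goodreads_score = runif(N, 0, 5)
)

# Long format: one row per (book, variable) pair
df_long <- pivot_longer(df, cols = fiction:english,
                        names_to = "category", values_to = "value")

# Keep the variables in their original order on the x axis
df_long$category <- factor(df_long$category,
                           levels = c("fiction", "best_seller", "english"))

# One x position per variable, with the 0 and 1 boxes dodged next to each other
p <- ggplot(df_long, aes(x = category, y = goodreads_score, fill = value)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(shape = 21,
             position = position_jitterdodge(jitter.width = 0.1,
                                             jitter.height = 0.1))
print(p)
```

Compared with the faceted version, this puts all six boxes on a shared y axis, which may make comparisons across variables easier; whether that suits the analysis is a judgment call.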

https://stackoverflow.com/questions/67555608