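I am trying to place observation-count labels at the ends of boxplot whiskers, but it does not seem to work when there are outliers.
I tried comparing the max/min value against what I believe is the computed whisker limit: quartile 1 (or quartile 3) minus (or plus) 1.5 * the interquartile range. But the labels end up neither at the max/min value nor at the whisker ends.
An example using mtcars with a reversed y-axis: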
library(ggplot2)
library(dplyr)  ## library() attaches one package per call

mtcars %>%
  select(qsec, cyl, am) %>%
  ggplot(aes(factor(cyl), qsec, fill = factor(am))) +
  stat_boxplot(geom = "errorbar") + ## Draw horizontal lines across ends of whiskers
  geom_boxplot(outlier.shape = 1, outlier.size = 3,
               position = position_dodge(width = 0.75)) +
  scale_y_reverse() +
  geom_text(data = mtcars %>%
              select(qsec, cyl, am) %>%
              group_by(cyl, am) %>%
              summarize(min_qsec = min(qsec), Count = n(), med = median(qsec),
                        q1 = quantile(qsec, 0.25),
                        q3 = quantile(qsec, 0.75), iqr = IQR(qsec),
                        qsec = mean(qsec),
                        lab_pos = max(min_qsec, q1 - 1.5 * iqr)),
            aes(y = lab_pos, label = Count), position = position_dodge(width = 0.75))

This produces:

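The labels for am(1) at cyl(4) and am(0) at cyl(8) do not line up with the whisker ends.
Is my calculation of lab_pos incorrect, or is there a better way to position labels at the whisker ends regardless of outliers? If possible, I would like to do this with ggplot2 and dplyr.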
Posted on 2017-12-10 17:52:42
If I understand correctly, this is what you want:
label_data <- mtcars %>%
  select(qsec, cyl, am) %>%
  group_by(cyl, am) %>%
  summarize(min_qsec = min(qsec),
            Count = n(),
            med = median(qsec),
            q1 = quantile(qsec, 0.25),
            q3 = quantile(qsec, 0.75),
            iqr = IQR(qsec),
            ## Smallest observation at or above the lower fence, i.e. the whisker end
            lab_pos = min(ifelse(qsec >= q1 - 1.5 * iqr, qsec, NA), na.rm = TRUE),
            ## Overwrite qsec last, after lab_pos has used the raw values
            qsec = mean(qsec))
mtcars %>%
  select(qsec, cyl, am) %>%
  ggplot(aes(factor(cyl), qsec, fill = factor(am))) +
  stat_boxplot(geom = "errorbar") + ## Draw horizontal lines across ends of whiskers
  geom_boxplot(outlier.shape = 1, outlier.size = 3,
               position = position_dodge(width = 0.75)) +
  scale_y_reverse() +
  geom_text(data = label_data, aes(y = lab_pos, label = Count),
            position = position_dodge(width = 0.75), vjust = 0, fontface = "bold")

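The whiskers extend to the farthest data point that lies within the fence, not to the fence itself. That is why max(min_qsec, q1 - 1.5 * iqr) misses: whenever a group has a low outlier, it returns the fence value rather than an actual observation.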
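To make the difference between the fence and the whisker end concrete, here is a minimal base R sketch. The helper name whisker_ends is hypothetical (it is not part of ggplot2 or dplyr), and it assumes the usual 1.5 * IQR rule with default quantile() quartiles. It is applied to the cyl = 4, am = 1 group, one of the two groups that were mislabeled in the question.

```r
# Hypothetical helper (not from ggplot2/dplyr): fences and whisker ends for
# one group, assuming the 1.5 * IQR rule and quantile()'s default quartiles.
whisker_ends <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  fence_low  <- q[1] - coef * iqr
  fence_high <- q[2] + coef * iqr
  # Observations on or inside the fences are not outliers
  inside <- x[x >= fence_low & x <= fence_high]
  c(fence_low    = fence_low,
    whisker_low  = min(inside),   # farthest low observation inside the fence
    whisker_high = max(inside),   # farthest high observation inside the fence
    fence_high   = fence_high)
}

# 4-cylinder cars with manual transmission (am == 1)
x <- mtcars$qsec[mtcars$cyl == 4 & mtcars$am == 1]
res <- whisker_ends(x)
print(res)
```

In this group the smallest qsec lies below the lower fence, so fence_low and whisker_low differ, and the label belongs at whisker_low. Inside summarize(), the same helper could be used as lab_pos = whisker_ends(qsec)[["whisker_low"]], provided it is called before qsec is overwritten.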
https://stackoverflow.com/questions/47740448