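I am plotting the correlation between two continuous variables using ggscatter from the ggpubr package. I use Kendall's rank correlation coefficient, and the p-value is added to the plot automatically. I would like to use scale_y_log10() because one of the measures has a very wide range. However, adding scale_y_log10() to the code changes the p-value.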
Sample data:
sampledata <- structure(list(
  ID = c(1, 2, 3, 4, 5),
  Measure1 = c(10, 10, 50, 0, 100),
  Measure2 = c(5, 3, 40, 30, 20),
  timepoint = c(1, 1, 1, 1, 1),
  time = structure(c(18628, 19205, 19236, 19205, 19205), class = "Date"),
  event = c(1, 1, NA, NA, NA),
  eventdate = structure(c(18779, 19024, NA, NA, NA), class = "Date")),
  row.names = c(NA, -5L), class = "data.frame")
Plot without scale_y_log10():
library(ggpubr)
ggscatter(data = sampledata, x = "Measure2", y = "Measure1",
          add = "reg.line", conf.int = TRUE,
          cor.coef = TRUE, cor.method = "kendall",
          xlab = "measure2", ylab = "measure1", color = "#0073C2FF")
As you can see, R = 0.11 and p = 0.8.
Adding scale_y_log10():
ggscatter(data = sampledata, x = "Measure2", y = "Measure1",
          add = "reg.line", conf.int = TRUE,
          cor.coef = TRUE, cor.method = "kendall",
          xlab = "measure2", ylab = "measure1", color = "#0073C2FF") +
  scale_y_log10()
Now R = 0.55 and p = 0.28.
This is just some sample data, not my actual data.
Can anyone help me figure out what is going on?
Posted on 2022-10-22 20:11:41
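Your p-value changes because one of your y values (in the variable Measure1) is 0. Under a log transformation this 0 becomes negative infinity. It cannot be shown on the plot, so it is dropped from the plotting data, and the correlation is computed on the remaining four points. If you run ggscatter without this data point, you get the same values as with the log scale: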
ggscatter(data = subset(sampledata, Measure1 > 0),
x = "Measure2", y = "Measure1",
add = "reg.line", conf.int = TRUE,
cor.coef = TRUE, cor.method = "kendall",
xlab = "measure2", ylab = "measure1", color="#0073C2FF" )
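The same check can be made without any plotting. The following is a minimal sketch in base R, assuming only the two measured columns from the question; the object names (ct_all, ct_pos, ct_log) are illustrative and not part of the original answer. It also shows that Kendall's tau, being rank-based, is not changed by the log transformation itself, only by the loss of the zero row.

```r
# Sample data from the question (only the two measured columns are needed).
sampledata <- data.frame(
  Measure1 = c(10, 10, 50, 0, 100),
  Measure2 = c(5, 3, 40, 30, 20)
)

# log10(0) is -Inf, which cannot be placed on a log axis.
log10(sampledata$Measure1)

# Kendall's tau on all five rows (what the untransformed plot reports).
# suppressWarnings() hides the warning about exact p-values with ties.
ct_all <- suppressWarnings(
  cor.test(sampledata$Measure2, sampledata$Measure1, method = "kendall")
)

# Kendall's tau after removing the row that the log scale drops.
positive <- subset(sampledata, Measure1 > 0)
ct_pos <- suppressWarnings(
  cor.test(positive$Measure2, positive$Measure1, method = "kendall")
)

# Tau depends only on ranks, so logging the remaining rows changes nothing.
ct_log <- suppressWarnings(
  cor.test(positive$Measure2, log10(positive$Measure1), method = "kendall")
)

round(c(all = unname(ct_all$estimate),
        positive = unname(ct_pos$estimate),
        logged = unname(ct_log$estimate)), 2)
```

The first value should correspond to the R shown on the untransformed plot (0.11), and the other two to the R shown on the log-scaled plot (0.55).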

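You can also see that the confidence band extends below y = 0 in the untransformed plot, so the band in the log-scaled plot is different from the one in the untransformed plot: the geom_smooth layer is essentially fitting a linear regression to the log-transformed data, which is probably not what you want.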
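To illustrate the difference between the two fits, here is a rough sketch in base R. It assumes that the log-scaled plot amounts to a straight-line fit of log10(Measure1) on the four positive rows; the names fit_raw, fit_log, ci_raw and ci_log are illustrative only.

```r
sampledata <- data.frame(
  Measure1 = c(10, 10, 50, 0, 100),
  Measure2 = c(5, 3, 40, 30, 20)
)
positive <- subset(sampledata, Measure1 > 0)

# Straight line fitted on the original scale, using all five rows.
fit_raw <- lm(Measure1 ~ Measure2, data = sampledata)

# Straight line fitted to log10(Measure1), using only the positive rows
# (assumed here to approximate what the smoothing layer does on a log axis).
fit_log <- lm(log10(Measure1) ~ Measure2, data = positive)

newdata <- data.frame(Measure2 = c(3, 20, 40))

# 95% confidence band on the original scale; the lower limit can be negative.
ci_raw <- predict(fit_raw, newdata, interval = "confidence")

# Band from the log-scale fit, back-transformed; it is positive by construction.
ci_log <- 10^predict(fit_log, newdata, interval = "confidence")

ci_raw
ci_log
```

The two bands describe different models, so there is no reason to expect them to agree once both are drawn on the same axis.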
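As with many ggplot extensions that make simple plots easier to create, if you want to do something slightly unusual (such as excluding 0 or negative values when adding a log scale), you cannot do it within that framework, so you need to fall back to vanilla ggplot to get the result you want.
For example, you can build the points, line and ribbon yourself while excluding 0 or negative values, like this: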
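# Fit the regression on the untransformed data (all five rows)
mod <- lm(Measure1 ~ Measure2, data = sampledata)
# Grid over the observed range of Measure2, out and back, to trace a closed band
xvals <- seq(3, 40, length.out = 100)
xvals <- c(xvals, rev(xvals))
preds <- predict(mod, newdata = data.frame(Measure2 = xvals), se.fit = TRUE)
# Approximate 95% confidence limits
lower <- preds$fit - 1.96 * preds$se.fit
upper <- preds$fit + 1.96 * preds$se.fit
# Clamp the lower limit at 1 so it can be drawn on a log axis
lower[lower < 1] <- 1
pred_df <- data.frame(Measure2 = xvals,
                      Measure1 = preds$fit)
polygon <- data.frame(Measure2 = xvals,
                      Measure1 = c(lower[1:100], upper[101:200]))
# Kendall correlation on the full data, including the zero
ct <- cor.test(sampledata$Measure2, sampledata$Measure1, method = "kendall")
Now we can safely plot the data, styled like ggscatter: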
p <- ggplot(subset(sampledata, Measure1 > 0),
aes(Measure2, Measure1)) +
geom_polygon(data = polygon, fill = "#0073c2", alpha = 0.5) +
geom_point(color = "#0073c2", size = 2) +
geom_line(data = pred_df, color = "#0073c2", size = 1) +
annotate("text", hjust = 0, x = min(sampledata$Measure2), y = 50, size = 5,
label = paste0("R = ", sprintf("%1.2f", ct$estimate), ", p = ",
sprintf("%1.2f", ct$p.value))) +
theme_classic(base_size = 16)
p

This looks like the ggscatter output, except that now we can safely put the y axis on a log scale:
p + scale_y_log10(limits = c(1, 1000))

https://stackoverflow.com/questions/74166315