
How to efficiently create 5x3 bar plots using R

Stack Overflow user
Asked 2016-04-24 22:27:58
Answers: 1, Views: 53, Followers: 0, Score: 1

I have the following data:

Project Topic    C10    C14     C03     C11     C16     C08
P1      T1      0.24    0.00    0.00    0.04    0.04    0.00
P1      T2      0.00    0.30    0.00    0.00    0.00    0.00
P1      T3      0.04    0.04    0.00    0.24    0.00    0.00
P1      T4      0.00    0.00    0.00    0.04    0.33    0.04
P1      T5      0.00    0.09    0.21    0.00    0.00    0.00
P1      T6      0.00    0.09    0.00    0.00    0.00    0.34

P2      T1      0.20    0.00    0.00    0.04    0.00    0.04
P2      T2      0.00    0.22    0.04    0.00    0.00    0.00
P2      T3      0.04    0.00    0.00    0.24    0.00    0.00
P2      T4      0.00    0.00    0.04    0.00    0.33    0.00
P2      T5      0.04    0.00    0.21    0.00    0.00    0.00
P2      T6      0.00    0.04    0.00    0.00    0.00    0.34

P3      T1      0.20    0.00    0.00    0.00    0.08    0.00
P3      T2      0.00    0.17    0.00    0.00    0.00    0.00
P3      T3      0.00    0.00    0.00    0.08    0.00    0.00
P3      T4      0.00    0.04    0.00    0.04    0.24    0.00
P3      T5      0.00    0.00    0.21    0.00    0.00    0.04
P3      T6      0.00    0.09    0.00    0.00    0.00    0.22
    ......

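What I want to do is turn the data above into the following plot:

In this sketch, the bar heights correspond to the C# values, and there should be six colors. Each bar chart belongs to one P# dataset.

I tried the following code, copying each P# dataset into its own .csv file and using par(mfrow=c(5,3)) to draw them in the same plotting frame: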

library(e1071)
topics <- read.csv("P1.csv", head=TRUE)
dput(head(topics))
pdf("cosinesimilarityplots.pdf", family="Times")
par(mfrow=c(5,3))
colours <- c("red", "orange", "yellow", "green", "blue", "black")
barplot(as.matrix(topics), main="Project Name", ylab="", cex.lab = 1.5, cex.main = 1.4, beside=TRUE, col=colours,ylim=c(0, 0.5))
title(ylab=expression(paste("Cose(", theta, ")")),xlab="Seeded-LDA topics", line=2, cex.lab=1.2)
legend("topleft", c("C10: Resource Management (RM)","C14:Cross Site Scripting (XSS)","C03:Authentication Abuse (AA)","C11:Buffer Overflow (BoF)","C16:Access  Privileges (AP)","C08:SQL  Injection (SI)"), cex=0.85, bty="n", fill=colours)
dev.off()

The result of dput(head(topics)) is as follows:

structure(list(T1 = c(0.24, 0, 0, 0.04, 0.04, 0), T2 = c(0.24, 
0.3, 0, 0, 0, 0), T3 = c(0.04, 0.04, 0, 0.24, 0, 0), T4 = c(0, 
0, 0, 0.04, 0.33, 0.04), T5 = c(0, 0.09, 0.21, 0, 0, 0), T6 = c(0, 
0.09, 0, 0, 0, 0.34)), .Names = c("T1", "T2", "T3", "T4", "T5", 
"T6"), row.names = c(NA, 6L), class = "data.frame")

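I then realized that the quality of the bar plots becomes very low, and that plotting each P# dataset from a separate .csv file takes far too long, especially when there are more than 15 P#s.

How can I plot this efficiently without splitting the main dataset file into smaller files? Preferably using R.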
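
For reference, one way to stay with base graphics and avoid the per-project .csv files is to read the whole table once, split() it by Project, and loop over the pieces. This is only a minimal sketch under a few assumptions: the sample rows are the P1 and P2 rows copied from the table above, the output file name is hypothetical, and a real run would call read.table() on the full data file instead of the inline text.

```r
# Sample rows copied from the table above (P1 and P2 only).
# Assumption: the real data would be read once from a single file.
txt <- "
Project Topic C10 C14 C03 C11 C16 C08
P1 T1 0.24 0.00 0.00 0.04 0.04 0.00
P1 T2 0.00 0.30 0.00 0.00 0.00 0.00
P1 T3 0.04 0.04 0.00 0.24 0.00 0.00
P1 T4 0.00 0.00 0.00 0.04 0.33 0.04
P1 T5 0.00 0.09 0.21 0.00 0.00 0.00
P1 T6 0.00 0.09 0.00 0.00 0.00 0.34
P2 T1 0.20 0.00 0.00 0.04 0.00 0.04
P2 T2 0.00 0.22 0.04 0.00 0.00 0.00
P2 T3 0.04 0.00 0.00 0.24 0.00 0.00
P2 T4 0.00 0.00 0.04 0.00 0.33 0.00
P2 T5 0.04 0.00 0.21 0.00 0.00 0.00
P2 T6 0.00 0.04 0.00 0.00 0.00 0.34
"
topics <- read.table(text = txt, header = TRUE)

# One data frame per project, held in memory (no intermediate files)
by_project <- split(topics, topics$Project)

# The value columns (C10, C14, ...) are everything except the two keys
cats <- setdiff(names(topics), c("Project", "Topic"))
colours <- c("red", "orange", "yellow", "green", "blue", "black")

# Hypothetical output location; a larger page keeps 15 panels readable
out_file <- file.path(tempdir(), "all_projects.pdf")
pdf(out_file, width = 9, height = 12)
par(mfrow = c(5, 3), mar = c(3, 3, 2, 1))

for (p in names(by_project)) {
  d <- by_project[[p]]
  # Rows = categories (bar colours), columns = topics (bar groups)
  m <- t(as.matrix(d[, cats]))
  colnames(m) <- d$Topic
  barplot(m, beside = TRUE, col = colours, ylim = c(0, 0.5), main = p)
}

dev.off()
```

Setting width and height in pdf() is one possible way to address the low quality, since the default 7 x 7 inch page leaves little room for fifteen panels.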


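1 Answer

Stack Overflow user

Answered 2017-02-21 04:20:52

With some help from dplyr and reshape2, you can use ggplot2 and gridExtra to create a plot similar to the one you sketched. As is often the case, just because you can do something in R doesn't mean it is intuitive. Basically, you have to create a separate plot object for each project, strip the legend from each one, and then reassemble everything with grid.arrange().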

library(tidyverse) # ggplot2, dplyr, etc
library(reshape2)  # Outdated but still works
library(gridExtra) # Allows us to put plots into grids

# Generate some dummy data
data <- tibble(
  Project =   rep(paste0("P", 1:6), length = 30),
  C10 = abs(rnorm(30)),
  C14 = runif(30),
  C03 = sample(1:30) / 50,
  C11 = rnorm(30) ^ 2,
  C16 = abs(rnorm(30) / 2),
  C08 = abs(rnorm(30) * 2)
)

data <- data %>%
  arrange(Project) %>%
  mutate(Topic = rep(paste0("T", 1:5), length = 30))

# Melt the data from wide to long format
data <- melt(data, id.vars = c("Project", "Topic"))

#########################################################
# Now you can actually create the chart
#########################################################

# Use a function to create a version of the plot for each Project
plot_proj <- function(projnum) {
  filter(data, Project == projnum) %>%
    rename(Legend = variable) %>%
    ggplot(., aes(x = Topic, y = value, fill = Legend)) +
    geom_bar(stat = "identity", position = "dodge") +
    labs(x = "", y = "", title = projnum) +
    theme_bw() +
    theme(plot.title = element_text(hjust = 0.5),
          panel.border = element_blank())
}

# Create a separate plot for each Project
plots <- map(unique(data$Project), plot_proj)

# This function was borrowed from an older StackOverflow answer
# Source: http://stackoverflow.com/questions/13649473/add-a-common-legend-for-combined-ggplots
g_legend <- function(a.gplot) {
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x)
    x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}

mylegend <- g_legend(plots[[1]])

# Combine the plots and add one shared legend on the left
grid.arrange(
  arrangeGrob(
    plots[[1]] + theme(legend.position = "none"),
    plots[[2]] + theme(legend.position = "none"),
    plots[[3]] + theme(legend.position = "none"),
    plots[[4]] + theme(legend.position = "none"),
    plots[[5]] + theme(legend.position = "none"),
    plots[[6]] + theme(legend.position = "none"),
    left = mylegend
  )
)
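
As a possible alternative to building one plot object per project, ggplot2's facet_wrap() can draw one panel per Project from a single long-format data frame and share one legend automatically. The following is a minimal sketch, not part of the original answer: the sample rows are the P1 and P2 rows from the question, the reshaping is done in base R to avoid extra dependencies, and the output file name is hypothetical.

```r
library(ggplot2)

# Sample rows copied from the question's table (P1 and P2 only)
txt <- "
Project Topic C10 C14 C03 C11 C16 C08
P1 T1 0.24 0.00 0.00 0.04 0.04 0.00
P1 T2 0.00 0.30 0.00 0.00 0.00 0.00
P1 T3 0.04 0.04 0.00 0.24 0.00 0.00
P1 T4 0.00 0.00 0.00 0.04 0.33 0.04
P1 T5 0.00 0.09 0.21 0.00 0.00 0.00
P1 T6 0.00 0.09 0.00 0.00 0.00 0.34
P2 T1 0.20 0.00 0.00 0.04 0.00 0.04
P2 T2 0.00 0.22 0.04 0.00 0.00 0.00
P2 T3 0.04 0.00 0.00 0.24 0.00 0.00
P2 T4 0.00 0.00 0.04 0.00 0.33 0.00
P2 T5 0.04 0.00 0.21 0.00 0.00 0.00
P2 T6 0.00 0.04 0.00 0.00 0.00 0.34
"
topics <- read.table(text = txt, header = TRUE)

cats <- setdiff(names(topics), c("Project", "Topic"))

# Wide to long: one row per Project / Topic / Category combination.
# unlist() stacks the value columns in the same order as rep(cats, each = ...).
long <- data.frame(
  Project  = rep(topics$Project, times = length(cats)),
  Topic    = rep(topics$Topic, times = length(cats)),
  Category = factor(rep(cats, each = nrow(topics)), levels = cats),
  value    = unlist(topics[cats], use.names = FALSE)
)

# One panel per project, three panels per row, a single shared legend
p <- ggplot(long, aes(x = Topic, y = value, fill = Category)) +
  geom_col(position = "dodge") +
  facet_wrap(~ Project, ncol = 3) +
  labs(x = "Seeded-LDA topics", y = NULL) +
  theme_bw() +
  theme(legend.position = "bottom")

# Hypothetical output location and size
ggsave(file.path(tempdir(), "projects_facets.pdf"), p, width = 9, height = 6)
```

With fifteen projects, ncol = 3 would give the 5x3 layout from the question, and the page height passed to ggsave() would likely need to grow accordingly.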
Score: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/36824392
