I have the following data:
Project Topic C10 C14 C03 C11 C16 C08
P1 T1 0.24 0.00 0.00 0.04 0.04 0.00
P1 T2 0.00 0.30 0.00 0.00 0.00 0.00
P1 T3 0.04 0.04 0.00 0.24 0.00 0.00
P1 T4 0.00 0.00 0.00 0.04 0.33 0.04
P1 T5 0.00 0.09 0.21 0.00 0.00 0.00
P1 T6 0.00 0.09 0.00 0.00 0.00 0.34
P2 T1 0.20 0.00 0.00 0.04 0.00 0.04
P2 T2 0.00 0.22 0.04 0.00 0.00 0.00
P2 T3 0.04 0.00 0.00 0.24 0.00 0.00
P2 T4 0.00 0.00 0.04 0.00 0.33 0.00
P2 T5 0.04 0.00 0.21 0.00 0.00 0.00
P2 T6 0.00 0.04 0.00 0.00 0.00 0.34
P3 T1 0.20 0.00 0.00 0.00 0.08 0.00
P3 T2 0.00 0.17 0.00 0.00 0.00 0.00
P3 T3 0.00 0.00 0.00 0.08 0.00 0.00
P3 T4 0.00 0.04 0.00 0.04 0.24 0.00
P3 T5 0.00 0.00 0.21 0.00 0.00 0.04
P3 T6 0.00 0.09 0.00 0.00 0.00 0.22
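......
What I want to do is turn the data above into a plot like the one below:

In this sketch, the bar heights are the C# values, and there should be six colours. Each bar chart corresponds to one P# dataset.
I tried the following code, copying each P# dataset into its own .csv file and drawing them in the same plotting frame with par(mfrow=c(5,3)):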
library(e1071)
topics <- read.csv("P1.csv", header=TRUE)
dput(head(topics))
pdf("cosinesimilarityplots.pdf", family="Times")
par(mfrow=c(5,3))
colours <- c("red", "orange", "yellow", "green", "blue", "black")
barplot(as.matrix(topics), main="Project Name", ylab="", cex.lab = 1.5, cex.main = 1.4, beside=TRUE, col=colours,ylim=c(0, 0.5))
title(ylab=expression(paste("Cose(", theta, ")")),xlab="Seeded-LDA topics", line=2, cex.lab=1.2)
legend("topleft", c("C10: Resource Management (RM)","C14:Cross Site Scripting (XSS)","C03:Authentication Abuse (AA)","C11:Buffer Overflow (BoF)","C16:Access Privileges (AP)","C08:SQL Injection (SI)"), cex=0.85, bty="n", fill=colours)
dev.off()
The result of dput(head(topics)) is as follows:
structure(list(T1 = c(0.24, 0, 0, 0.04, 0.04, 0), T2 = c(0.24,
0.3, 0, 0, 0, 0), T3 = c(0.04, 0.04, 0, 0.24, 0, 0), T4 = c(0,
0, 0, 0.04, 0.33, 0.04), T5 = c(0, 0.09, 0.21, 0, 0, 0), T6 = c(0,
0.09, 0, 0, 0, 0.34)), .Names = c("T1", "T2", "T3", "T4", "T5",
"T6"), row.names = c(NA, 6L), class = "data.frame")然后,我意识到条形图的质量变得非常低,并且在单独的.csv文件中绘制每个P#s数据将永远花费很长时间,特别是当P#s的数量大于15的时候。
如何在不将主数据集文件拆分成更小的文件的情况下有效地绘制它?最好使用R
发布于 2017-02-21 04:20:52
借助dplyr和reshape2的一些帮助,您可以使用ggplot2和gridextra创建类似于您所绘制的图。就像通常的情况一样,仅仅因为你有能力在R中做一些事情并不意味着它是直观的。基本上,您必须为每个项目创建一个单独的绘图对象,剥离图例,然后使用grid.arrange()重新组合所有内容。
library(tidyverse) # ggplot2, dplyr, etc
library(reshape2) # Outdated but still works
library(gridExtra) # Allows us to put plots into grids
# Generate some dummy data
data <- tibble(
  Project = rep(paste0("P", 1:6), length = 30),
  C10 = abs(rnorm(30)),
  C14 = runif(30),
  C03 = sample(1:30) / 50,
  C11 = rnorm(30) ^ 2,
  C16 = abs(rnorm(30) / 2),
  C08 = abs(rnorm(30) * 2)
)
data <- data %>%
  arrange(Project) %>%
  mutate(Topic = rep(paste0("T", 1:5), length = 30))
# Melt the data from wide to long format
data <- melt(data, id.vars = c("Project", "Topic"))
#########################################################
# Now you can actually create the chart
#########################################################
# Use a function to create a version of the plot for each Project
plot_proj <- function(projnum) {
  filter(data, Project == projnum) %>%
    rename(Legend = variable) %>%
    ggplot(., aes(x = Topic, y = value, fill = Legend)) +
    geom_bar(stat = "identity", position = "dodge") +
    labs(x = "", y = "", title = projnum) +
    theme_bw() +
    theme(plot.title = element_text(hjust = 0.5),
          panel.border = element_blank())
}
# Create a separate plot for each Project
plots <- map(unique(data$Project), plot_proj)
# This function was borrowed from an older StackOverflow answer
# Source: http://stackoverflow.com/questions/13649473/add-a-common-legend-for-combined-ggplots
g_legend <- function(a.gplot) {
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x)
    x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)
}
mylegend <- g_legend(plots[[1]])
# Combine the plots and add one shared legend
grid.arrange(
  arrangeGrob(
    plots[[1]] + theme(legend.position = "none"),
    plots[[2]] + theme(legend.position = "none"),
    plots[[3]] + theme(legend.position = "none"),
    plots[[4]] + theme(legend.position = "none"),
    plots[[5]] + theme(legend.position = "none"),
    plots[[6]] + theme(legend.position = "none"),
    left = mylegend
  )
)
https://stackoverflow.com/questions/36824392
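For readers who would rather stay with base graphics, as in the question's own barplot() attempt, the following is a minimal sketch of a different route from the ggplot2 answer above: read the combined table once, split() it by Project in memory, and draw one grouped barplot per project. It is not the answerer's method. The inline sample rows are copied from the question's data, and the names txt, value_cols, to_matrix, mats and out_file are assumptions introduced only for illustration.

```r
# A few rows copied from the question. With the real data this would be a
# single read.table()/read.csv() call on the combined file (its name is not
# given in the post).
txt <- "Project Topic C10 C14 C03 C11 C16 C08
P1 T1 0.24 0.00 0.00 0.04 0.04 0.00
P1 T2 0.00 0.30 0.00 0.00 0.00 0.00
P1 T3 0.04 0.04 0.00 0.24 0.00 0.00
P2 T1 0.20 0.00 0.00 0.04 0.00 0.04
P2 T2 0.00 0.22 0.04 0.00 0.00 0.00
P2 T3 0.04 0.00 0.00 0.24 0.00 0.00"
topics <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

value_cols <- setdiff(names(topics), c("Project", "Topic"))
colours <- c("red", "orange", "yellow", "green", "blue", "black")

# One data frame per project, held in memory instead of in separate files
by_project <- split(topics, topics$Project)

# barplot(beside = TRUE) draws one group of bars per matrix column, so each
# project becomes a matrix with C## categories as rows and topics as columns
to_matrix <- function(df) {
  m <- t(as.matrix(df[, value_cols]))
  colnames(m) <- df$Topic
  m
}
mats <- lapply(by_project, to_matrix)

out_file <- tempfile(fileext = ".pdf")  # hypothetical output path
pdf(out_file, width = 7, height = 7)
# n2mfrow() picks a rows-by-columns layout that fits the number of projects
par(mfrow = n2mfrow(length(mats)), mar = c(4, 4, 3, 1))
for (p in names(mats)) {
  barplot(mats[[p]], beside = TRUE, col = colours, ylim = c(0, 0.5),
          main = p, xlab = "Seeded-LDA topics")
}
# A single legend, drawn on the last panel only
legend("topright", legend = value_cols, fill = colours, bty = "n", cex = 0.8)
dev.off()
```

Because pdf() is a vector device, the output should stay sharp at any zoom level. With many projects, the width and height arguments may need to grow so that each panel remains readable.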