I have this data frame:
dput(df)
structure(list(Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"),
Date = structure(1:6, .Label = c("7/13/2017 15:01", "7/13/2017 15:02",
"7/13/2017 15:03", "7/13/2017 15:04", "7/13/2017 15:05",
"7/13/2017 15:06"), class = "factor"), Host_CPU = c(1.812950134,
2.288070679, 1.563278198, 1.925239563, 5.350669861, 2.612503052
), UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19,
38.22), jvm1 = c(10.91, 11.13, 11.34, 11.56, 11.77, 11.99
), jvm2 = c(11.47, 11.7, 11.91, 12.13, 12.35, 12.57), jvm3 = c(75.65,
76.88, 56.93, 58.99, 65.29, 67.97), jvm4 = c(39.43, 40.86,
42.27, 43.71, 45.09, 45.33), jvm5 = c(27.42, 29.63, 31.02,
32.37, 33.72, 37.71)), .Names = c("Server", "Date", "Host_CPU",
"UsedMemPercent", "jvm1", "jvm2", "jvm3", "jvm4", "jvm5"), class = "data.frame", row.names = c(NA,
-6L))
I want to subset this data frame based only on the column names stored in this vector:
select <- c("jvm3", "jvm4", "jvm5")
So my final df should look like this:
structure(list(Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"),
Date = structure(1:6, .Label = c("7/13/2017 15:01", "7/13/2017 15:02",
"7/13/2017 15:03", "7/13/2017 15:04", "7/13/2017 15:05",
"7/13/2017 15:06"), class = "factor"), Host_CPU = c(1.812950134,
2.288070679, 1.563278198, 1.925239563, 5.350669861, 2.612503052
), UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19,
38.22), jvm3 = c(75.65, 76.88, 56.93, 58.99, 65.29, 67.97
), jvm4 = c(39.43, 40.86, 42.27, 43.71, 45.09, 45.33), jvm5 = c(27.42,
29.63, 31.02, 32.37, 33.72, 37.71)), .Names = c("Server",
"Date", "Host_CPU", "UsedMemPercent", "jvm3", "jvm4", "jvm5"), class = "data.frame", row.names = c(NA,
-6L))
Any ideas?
Posted on 2017-07-15 01:59:58
Save the data frame to the variable df:
df <-
structure(
list(
Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"),
Date = structure(
1:6,
.Label = c(
"7/13/2017 15:01",
"7/13/2017 15:02",
"7/13/2017 15:03",
"7/13/2017 15:04",
"7/13/2017 15:05",
"7/13/2017 15:06"
),
class = "factor"
),
Host_CPU = c(
1.812950134,
2.288070679,
1.563278198,
1.925239563,
5.350669861,
2.612503052
),
UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19,
38.22),
jvm1 = c(10.91, 11.13, 11.34, 11.56, 11.77, 11.99),
jvm2 = c(11.47, 11.7, 11.91, 12.13, 12.35, 12.57),
jvm3 = c(75.65,
76.88, 56.93, 58.99, 65.29, 67.97),
jvm4 = c(39.43, 40.86,
42.27, 43.71, 45.09, 45.33),
jvm5 = c(27.42, 29.63, 31.02,
32.37, 33.72, 37.71)
),
.Names = c(
"Server",
"Date",
"Host_CPU",
"UsedMemPercent",
"jvm1",
"jvm2",
"jvm3",
"jvm4",
"jvm5"
),
class = "data.frame",
row.names = c(NA,-6L)
)
df[, select] should give you what you want, provided select lists every column you want to keep.
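Note that select as defined in the question names only the three jvm columns, so df[, select] returns just those three. A minimal sketch of keeping the four leading columns as well, assuming a trimmed-down copy of df (first two rows only, Date as plain character) and a hypothetical helper vector keep that is not part of the original answer:

```r
# Trimmed-down copy of the question's df (first two rows only).
df <- data.frame(
  Server = "servera",
  Date = c("7/13/2017 15:01", "7/13/2017 15:02"),
  Host_CPU = c(1.812950134, 2.288070679),
  UsedMemPercent = c(38.19, 38.19),
  jvm1 = c(10.91, 11.13),
  jvm2 = c(11.47, 11.7),
  jvm3 = c(75.65, 76.88),
  jvm4 = c(39.43, 40.86),
  jvm5 = c(27.42, 29.63)
)
select <- c("jvm3", "jvm4", "jvm5")

# Only the three jvm columns named in `select`.
jvm_only <- df[, select]

# `keep` is a hypothetical name: the four leading columns plus `select`.
keep <- c(names(df)[1:4], select)
result <- df[, keep]
names(result)
```

Building keep from names(df) avoids retyping the leading column names, but it does assume those columns stay in the first four positions.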
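Posted on 2017-07-15 02:51:11
Please revisit indexing. If you use the indexing mechanism [ in R, there are mainly three types of index you can use:
- logical vectors: TRUE means select the column
- numeric vectors: select columns by position
- character vectors: select columns by name
If you use the indexing mechanism on a data frame, you can treat these objects in two ways:
- as a list
- as a matrix
Take the iris data frame as an example and compare the different ways of selecting columns from it. If you treat it as a list, you have the following two options:
If you need a single column as a vector, use [[: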
iris[["Species"]]
# [1] setosa setosa setosa ... : is a vector
If you want one or more columns but need a data frame back, use [:
iris["Species"]
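iris[c("Sepal.Width", "Species")]
If you treat it as a matrix, you do exactly what you would do with a matrix. If you don't specify any row index, these commands are equivalent to the ones used above: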
iris[ , "Species"] # is the same as iris[["Species"]]
iris[ , "Species", drop = FALSE] # is the same as iris["Species"]
iris[ , c("Sepal.Width", "Species")] # is the same as iris[c("Sepal.Width", "Species")]
So, in your case, you just need:
select <- c("Server","Date","Host_CPU","UsedMemPercent",
"jvm3","jvm4","jvm5")
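df[select]
A note on subset: subset works, but it should only be used interactively. Its help page carries a warning stating:
This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.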
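One way this warning can bite, sketched on the built-in iris data: subset() evaluates its select argument inside the data frame first, so a variable that happens to share its name with a column is shadowed by that column. The variable names below are illustrative only.

```r
# A character vector of column names, used with standard indexing.
cols <- c("Sepal.Width", "Species")
by_index <- iris[cols]

# A variable that shares its name with a column of iris.
Species <- "Sepal.Length"

# subset() resolves `Species` to the column of that name, not the variable.
shadowed <- subset(iris, select = Species)

# Standard indexing uses the variable's value instead.
explicit <- iris[Species]

names(shadowed)
names(explicit)
```

Inside a function, where variable names are chosen without the data in view, this difference is easy to miss, which is why [ is the safer choice for programmatic use.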
Posted on 2017-07-15 02:13:40
Here is one way:
df[,c(1:4,7:9)]
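Positional indices like the one above select the wrong columns without any error if columns are later added or reordered. A small sketch, assuming a trimmed-down copy of df (first two rows only), showing that numeric, character and logical indices can select the same columns, and that positions can be derived from names with match():

```r
# Trimmed-down copy of the question's df (first two rows only).
df <- data.frame(
  Server = "servera",
  Date = c("7/13/2017 15:01", "7/13/2017 15:02"),
  Host_CPU = c(1.812950134, 2.288070679),
  UsedMemPercent = c(38.19, 38.19),
  jvm1 = c(10.91, 11.13),
  jvm2 = c(11.47, 11.7),
  jvm3 = c(75.65, 76.88),
  jvm4 = c(39.43, 40.86),
  jvm5 = c(27.42, 29.63)
)

# Numeric index: columns by position.
by_position <- df[, c(1:4, 7:9)]

# Character index: columns by name.
by_name <- df[, c("Server", "Date", "Host_CPU", "UsedMemPercent",
                  "jvm3", "jvm4", "jvm5")]

# Logical index: TRUE for every column except jvm1 and jvm2.
by_logical <- df[, !(names(df) %in% c("jvm1", "jvm2"))]

# Positions looked up from names rather than hard-coded.
pos <- match(c("jvm3", "jvm4", "jvm5"), names(df))
```

All three results hold the same seven columns; only the name-based and logical forms keep working if the column order changes.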
You can also use dplyr to select columns:
library(dplyr)
select(df, Server, Date, Host_CPU, UsedMemPercent, jvm3, jvm4, jvm5)
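Since the question keeps the column names in a character vector, here is a hedged sketch of the dplyr equivalent using all_of(), assuming dplyr 1.0.0 or later is installed. The data frame is a trimmed-down copy of df (first two rows only) and keep is a hypothetical variable name:

```r
library(dplyr)

# Trimmed-down copy of the question's df (first two rows only).
df <- data.frame(
  Server = "servera",
  Date = c("7/13/2017 15:01", "7/13/2017 15:02"),
  Host_CPU = c(1.812950134, 2.288070679),
  UsedMemPercent = c(38.19, 38.19),
  jvm1 = c(10.91, 11.13),
  jvm2 = c(11.47, 11.7),
  jvm3 = c(75.65, 76.88),
  jvm4 = c(39.43, 40.86),
  jvm5 = c(27.42, 29.63)
)

# `keep` is a hypothetical name for the vector of wanted columns.
keep <- c("Server", "Date", "Host_CPU", "UsedMemPercent",
          "jvm3", "jvm4", "jvm5")

# all_of() tells select() to treat `keep` as a vector of column names
# and to raise an error if any of them is missing from df.
result <- select(df, all_of(keep))
names(result)
```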
https://stackoverflow.com/questions/45109061