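I have a matrix with two rows, and some of its columns contain duplicate values. I want to remove the duplicates within each column and then collect all the unique values, keeping the column names.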
data<-structure(c(10L, 10L, 11L, 11L, 5L, 5L, 3L, 5L), .Dim = c(2L,
4L), .Dimnames = list(c("d1", "m1"), c("year2036", "year2037",
"year2038", "year2039")))
   year2036 year2037 year2038 year2039
d1       10       11        5        3
m1       10       11        5        5
The output should look like this:
year2036 year2037 year2038 year2039 year2039
      10       11        5        3        5
out<-structure(c(10, 11, 5, 3, 5), .Names = c("year2036", "year2037",
"year2038", "year2039", "year2039"))
I tried unique(r[c(1:8)]), but it only returned the unique numbers and dropped the column names.
Answered on 2021-08-16 11:38:00
You can use unique inside apply and then call stack on the result.
stack(apply(data, 2, unique))
#  values      ind
#1     10 year2036
#2     11 year2037
#3      5 year2038
#4      3 year2039
#5      5 year2039
Or, in the format you asked for:
x <- stack(apply(data, 2, unique))
setNames(x$values, x$ind)
#year2036 year2037 year2038 year2039 year2039
#      10       11        5        3        5
Answered on 2021-08-16 11:33:41
library(dplyr)   # %>%, as_tibble(), group_by(), distinct()
library(tidyr)   # pivot_longer()
data %>%
  as_tibble() %>%
  pivot_longer(everything()) %>%
  group_by(name) %>%
  distinct(value)
# A tibble: 5 x 2
# Groups:   name [4]
  name     value
  <chr>    <int>
1 year2036    10
2 year2037    11
3 year2038     5
4 year2039     3
5 year2039     5
Answered on 2021-08-16 11:39:36
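Having duplicate column names is not good practice. Here is a solution that produces the same structure as the expected output, but with modified column names.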
library(dplyr)
library(tidyr)
data %>%
  as.data.frame() %>%
  pivot_longer(cols = everything()) %>%
  distinct() %>%
  mutate(row = data.table::rowid(name)) %>%
  pivot_wider(names_from = c(name, row), values_from = value)
#  year2036_1 year2037_1 year2038_1 year2039_1 year2039_2
#       <int>      <int>      <int>      <int>      <int>
#1         10         11          5          3          5
https://stackoverflow.com/questions/68801482