Suppose I have a data table that looks like this:
year location
2026 NYC
2026 NYC
2026 NYC
2026 LA
2027 LA
2028 NYC
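2028 NYC
It can be created with:
dt <- structure(list(location = c("NYC", "NYC", "NYC", "LA", "LA", "NYC", "NYC"),
                     year = c(2026, 2026, 2026, 2026, 2027, 2028, 2028)),
                class = c("data.table", "data.frame"),
                row.names = c(NA, -7L))
I want to count how many unique cities appear in a given year, say 2026. For this example the result is 2, because only NYC and LA occur in that year. What should the last line of the following pipeline be?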
dt %>%
filter(year == 2026) %>%
# What goes here?
Posted on 2019-04-30 23:09:04
We can use n_distinct() to get the number of unique values.
library(dplyr)
dt %>%
filter(year == 2026) %>%
summarise(count = n_distinct(location))
#  count
#1     2
Or do the filtering inside summarise() itself:
dt %>% summarise(count = n_distinct(location[year == 2026]))
Or, if we want the result as a vector rather than a data frame, we can add pull(count):
dt %>%
filter(year == 2026) %>%
summarise(count = n_distinct(location)) %>%
pull(count)
#[1] 2
In base R, the equivalent is:
length(unique(dt$location[dt$year == 2026]))
#[1] 2
Posted on 2019-05-01 01:58:18
We can use data.table:
library(data.table)
setDT(dt)[year == 2026, .(count = uniqueN(location))]
# count
#1:     2
Or with base R:
length(table(subset(dt, year == 2026, select = location)))
#[1] 2
https://stackoverflow.com/questions/55929642
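For anyone who wants to check the result without installing any packages, here is a minimal base R sketch of the same idea. It assumes the column holding the city is called `location`, as in the `structure()` call from the question, and it rebuilds the data as a plain data.frame. The helper name `count_cities` and the `per_year` variable are made up for illustration and are not part of either answer.

```r
# Example data rebuilt as a plain data.frame (an assumption, so that the
# sketch needs no packages; the thread itself works with a data.table).
dt <- data.frame(
  location = c("NYC", "NYC", "NYC", "LA", "LA", "NYC", "NYC"),
  year     = c(2026, 2026, 2026, 2026, 2027, 2028, 2028)
)

# Hypothetical helper: number of distinct locations recorded in one year.
count_cities <- function(data, yr) {
  length(unique(data$location[data$year == yr]))
}

count_cities(dt, 2026)
#[1] 2

# The same count for every year at once, as a named vector keyed by year.
per_year <- tapply(dt$location, dt$year, function(x) length(unique(x)))
per_year
#2026 2027 2028
#   2    1    1
```

If the count is needed for several years, the `tapply()` form avoids repeating the filter for each one. With dplyr, the rough equivalent would be grouping by `year` before calling `summarise()`.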