This is the library I use to create dummy variables:
install.packages("fastDummies")
library(fastDummies)
This is the dataset:
winners <- data.frame(
city = c("SaoPaulito", "NewAmsterdam", "BeatifulCow"),
year = c(1990, 2000, 1990),
crime = 1:3)
Let's create dummy columns for these cities:
dummy_cols(winners, select_columns = c("city"))
The result is:
city year crime city_SaoPaulito city_NewAmsterdam city_BeatifulCow
1 SaoPaulito 1990 1 1 0 0
2 NewAmsterdam 2000 2 0 1 0
3 BeatifulCow 1990 3 0 0 1
So the question is: how can I get back to the previous dataset? Any ideas?
Thanks in advance!
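For readers without fastDummies installed, the dummy columns above can be approximated in base R. This is only a sketch of what a dummy column is, not how dummy_cols works internally; the names city_levels, ind and with_dummies are made up for illustration.

```r
winners <- data.frame(
  city = c("SaoPaulito", "NewAmsterdam", "BeatifulCow"),
  year = c(1990, 2000, 1990),
  crime = 1:3,
  stringsAsFactors = FALSE
)

# Distinct cities, in order of first appearance
city_levels <- unique(winners$city)

# One 0/1 indicator column per city: 1 where the row's city matches
ind <- sapply(city_levels, function(lv) as.integer(winners$city == lv))
colnames(ind) <- paste0("city_", city_levels)

# Original columns followed by the indicator columns
with_dummies <- cbind(winners, ind)
with_dummies
```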
Answered 2019-07-20 22:51:30
We can use dcast from data.table to create the dummy columns:
library(data.table)
dcast(setDT(winners), crime ~ city, length)
If we need to get back to the input, it would be:
subset(df1, select = 1:3)
# city year crime
#1 SaoPaulito 1990 1
#2 NewAmsterdam 2000 2
#3 BeatifulCow 1990 3
Or using melt:
melt(setDT(df1), measure = patterns("_"))[value == 1, .(city, year, crime)]
# city year crime
#1: SaoPaulito 1990 1
#2: NewAmsterdam 2000 2
#3: BeatifulCow 1990 3
Data
df1 <- structure(list(city = c("SaoPaulito", "NewAmsterdam", "BeatifulCow"
), year = c(1990L, 2000L, 1990L), crime = 1:3, city_SaoPaulito = c(1L,
0L, 0L), city_NewAmsterdam = c(0L, 1L, 0L), city_BeatifulCow = c(0L,
0L, 1L)), class = "data.frame", row.names = c("1", "2", "3"))
Answered 2019-07-20 23:01:49
If each row has exactly one city set to 1, you can simply drop the dummy columns:
df[, 1:3]
# city year crime
#1 SaoPaulito 1990 1
#2 NewAmsterdam 2000 2
#3 BeatifulCow 1990 3
If a row can have more than one city, one way using dplyr and tidyr::gather is:
library(dplyr)
df %>%
tidyr::gather(key, value, starts_with("city_")) %>%
filter(value == 1) %>%
select(-value, -key)
https://stackoverflow.com/questions/57125793
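The answers above rely on the original city column still being present next to the dummies. If that column has been dropped, the city can be rebuilt from the dummy column names. Below is a minimal base R sketch, assuming every dummy column starts with the prefix city_ and each row has exactly one dummy set to 1; the names dummies, dummy_names, hit and restored are made up for illustration.

```r
# Dummy-coded data without the original `city` column
# (same values as df1 in the first answer)
dummies <- data.frame(
  year = c(1990L, 2000L, 1990L),
  crime = 1:3,
  city_SaoPaulito = c(1L, 0L, 0L),
  city_NewAmsterdam = c(0L, 1L, 0L),
  city_BeatifulCow = c(0L, 0L, 1L)
)

# Find the dummy columns by their shared prefix
dummy_names <- grep("^city_", names(dummies), value = TRUE)

# For each row, the position of the column holding the 1
# (assumption: exactly one dummy is 1 per row)
hit <- max.col(as.matrix(dummies[dummy_names]), ties.method = "first")

# Strip the prefix to recover the city, then keep the non-dummy columns
restored <- data.frame(
  city = sub("^city_", "", dummy_names[hit]),
  dummies[setdiff(names(dummies), dummy_names)],
  stringsAsFactors = FALSE
)
restored
```

Rows with several dummies set to 1 would need a different approach, since max.col returns only one column position per row.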