I have a dataset that I am editing in R with the dplyr package. My code is:
hiphop%>%
mutate( sex =
case_when(
sex == 1 ~ "female",
sex == 0 ~ "male"
)
)%>%
group_by(sex)%>%
summarise_at(vars(intl,vocal,classical,folk,rock,country,pop,alternative,hiphop,unclassifiable),funs(mean))%>%
pivot_longer(c(intl,vocal,classical,folk,rock,country,pop,alternative,hiphop,unclassifiable),names_to = "genre")%>%
spread(sex,value)%>%
mutate(
genredifference = abs(female-male)
)%>%
arrange(genredifference)%>%
top_n(3)

This gives me the following output:
Selecting by genredifference
# A tibble: 3 x 4
  genre   female  male genredifference
  <chr>    <dbl> <dbl>           <dbl>
1 country  0.786 0.392           0.394
2 vocal    0.880 1.57            0.688
3 rock     1.93  3.06            1.13

I would like to get the same output, but with spread() replaced by pivot_wider() (which I believe is the function to use). However, I can't figure out how to do it.
Thanks!
P.S. Here is my dataset, in case you are interested:
hiphop <- read_csv("https://www.dropbox.com/s/5d8fwxrj3jtua1z/hiphop.csv?dl=1")

Posted on 2020-01-22 06:34:15
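Based on the Dropbox input data, some of the steps are already done, so the recoding of sex is omitted here. Some steps can be made more compact with the select helpers: when the columns to select form a contiguous range, use : to select them, and in pivot_longer use - to name the columns that should not be pivoted. With pivot_wider, make sure to name the arguments (names_from, values_from). The function has other arguments as well, and unnamed arguments are matched by position in the order they appear, so they may not end up where you intended.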
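To illustrate the named arguments in isolation, here is a minimal sketch on a small made-up table. The data and the names toy, wide_new and wide_old are hypothetical and are not part of the hiphop dataset. It puts the pivot_wider() call next to the spread() call it replaces:

```r
library(tibble)
library(tidyr)

# Hypothetical toy data in long format (not the hiphop dataset)
toy <- tribble(
  ~sex,     ~name,  ~value,
  "Female", "rock", 1,
  "Male",   "rock", 3,
  "Female", "pop",  2,
  "Male",   "pop",  5
)

# Named arguments: new column names come from sex, cell values from value
wide_new <- pivot_wider(toy, names_from = sex, values_from = value)

# The older spread() call, with its key and value arguments named as well
wide_old <- spread(toy, key = sex, value = value)
```

Both results have the columns name, Female and Male with the same values. The row order can differ, since spread() sorts the rows while pivot_wider() keeps them in order of first appearance.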
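library(dplyr)
library(tidyr)

hiphop %>%
  group_by(sex) %>%
  # mean of every genre column, selected as a range with :
  summarise_at(vars(intl:unclassifiable), mean) %>%
  # long format: every column except sex goes into name/value
  pivot_longer(cols = -sex) %>%
  # wide format: one column per value of sex
  pivot_wider(names_from = sex, values_from = value) %>%
  mutate(genredifference = abs(Female - Male)) %>%
  arrange(genredifference) %>%
  # keep the 3 rows with the largest genredifference
  top_n(3)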
# A tibble: 3 x 4
#  name    Female  Male genredifference
#  <chr>    <dbl> <dbl>           <dbl>
#1 country  0.786 0.392           0.394
#2 vocal    0.880 1.57            0.688
#3 rock     1.93  3.06            1.13

https://stackoverflow.com/questions/59850140