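I am using the UK Understanding Society dataset and trying to link/merge parent and child information.
The parent and child information are in separate data files, so I linked the child file to the parent information using the parent's unique identifier and the "mother/father identifier" in the youth file. The new data frame, which contains both parent and child information, has duplicate rows for each child,
i.e.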
Personal_ID <- c(101,102)
Youth_Personal_ID <- c(200,200)
Youth_reading <- c("once a week", "once a week")
Parent_education <- c("bachelors","HS diploma" )
example <- data.frame(Youth_Personal_ID,Personal_ID,Parent_education,Youth_reading)

  Youth_Personal_ID Personal_ID Parent_education Youth_reading
1               200         101        bachelors   once a week
2               200         102       HS diploma   once a week

Is there a way to restructure it like this, using the parent identifier?

  Youth_Personal_ID Youth_reading Mother_education Father_education
1               200   once a week        bachelors       HS diploma

Answered 2021-10-04 13:58:38
If every child always has exactly two IDs, and the personal IDs always come in the order mother, then father, you can do this:
library(dplyr)
library(tidyr)
example %>%
  group_by(Youth_Personal_ID) %>%
  # relabel each child's two parent rows (assumes mother first, father second)
  mutate(Personal_ID = c('Mother_education', 'Father_education')) %>%
  pivot_wider(names_from = Personal_ID, values_from = Parent_education)

#  Youth_Personal_ID Youth_reading Mother_education Father_education
#              <dbl> <chr>         <chr>            <chr>
#1               200 once a week   bachelors        HS diploma

Answered 2021-10-04 14:00:38
Since you have at most two parent IDs, this should work:
library(dplyr)
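example %>%
  group_by(Youth_Personal_ID) %>%
  # copy the education from the child's last parent row (the father, if listed second)
  mutate(father_ed = last(Parent_education)) %>%
  # keep the first row per child; its Parent_education is the first parent's
  slice(1L)

Answered 2021-10-04 16:32:16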
We can also do it this way:
library(dplyr)
library(tidyr)
library(stringr)
example %>%
  mutate(Personal_ID = rep(str_c(c('Mother_', 'Father_'), 'education'),
                           length.out = n())) %>%
  pivot_wider(names_from = Personal_ID, values_from = Parent_education)
# A tibble: 1 × 4
  Youth_Personal_ID Youth_reading Mother_education Father_education
              <dbl> <chr>         <chr>            <chr>
1               200 once a week   bachelors        HS diploma

https://stackoverflow.com/questions/69437202
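
The answers above all depend on the rows for each child being ordered mother first, father second. Since the question mentions a "mother/father identifier" in the youth file, a variant that does not depend on row order might join the parent file twice, once per identifier. The following is only a minimal sketch: the `parents` and `youth` data frames and the `Mother_ID` / `Father_ID` column names are hypothetical stand-ins, not the real variable names in the dataset.

```r
library(dplyr)

# Hypothetical parent file: one row per adult (column names are illustrative)
parents <- data.frame(
  Personal_ID = c(101, 102),
  Parent_education = c("bachelors", "HS diploma")
)

# Hypothetical youth file: one row per child, with assumed
# Mother_ID / Father_ID columns pointing at rows of the parent file
youth <- data.frame(
  Youth_Personal_ID = 200,
  Mother_ID = 101,
  Father_ID = 102,
  Youth_reading = "once a week"
)

# Join the parent file twice, renaming its columns on the way in
# so that each join contributes exactly one education column
linked <- youth %>%
  left_join(
    parents %>% select(Mother_ID = Personal_ID, Mother_education = Parent_education),
    by = "Mother_ID"
  ) %>%
  left_join(
    parents %>% select(Father_ID = Personal_ID, Father_education = Parent_education),
    by = "Father_ID"
  ) %>%
  select(Youth_Personal_ID, Youth_reading, Mother_education, Father_education)

linked
```

Because these are left joins keyed on the child, each child stays on one row, and a child whose mother or father does not appear in the parent file gets NA in the corresponding education column instead of being dropped.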