Joining the corresponding values for two sample IDs into a new single column

Stack Overflow user
Asked on 2018-05-29 01:43:53
Answers: 2, Views: 49, Followers: 0, Votes: 3

I have a data frame, sampleManifest, that looks like this:

SampleName          Status          Role          Sex
AU056001_00HI1299A  unaffected      sibling       female
AU056002_00HI1301A  unaffected      proband       male  
AU0780201_00HI1775A unaffected      father        male  
AU0780202_00HI1777A unaffected      mother        female
AU0780301_00HI1778A affected        proband       male  
.
.
.

and a separate data frame of pairwise sample comparisons, kinshipEstimates:

FID    ID1                      ID2             Kinship Relationship    
AU0560 AU056001_00HI1299A  AU056002_00HI1301A   0.0283  full-sibling   
AU0780 AU0780201_00HI1775A AU0780202_00HI1777A -0.00160 unrelated   
AU0780 AU0780201_00HI1775A AU0780301_00HI1778A  0.284   parent-child
AU0780 AU0780202_00HI1777A AU0780301_00HI1778A  0.246   parent-child
.
.
.

I want to build a new data frame that includes the sampleManifest$Role of both samples in each row of kinshipEstimates, so that it looks like this:

FID    ID1                      ID2             Roles           Kinship Relationship    
AU0560 AU056001_00HI1299A  AU056002_00HI1301A   sibling-proband 0.0283  full-sibling   
AU0780 AU0780201_00HI1775A AU0780202_00HI1777A  father-mother  -0.00160 unrelated   
AU0780 AU0780201_00HI1775A AU0780301_00HI1778A  father-proband  0.284   parent-child
AU0780 AU0780202_00HI1777A AU0780301_00HI1778A  mother-proband  0.246   parent-child
.
.
.

I have been trying to use left_join, but I can't work out how to combine the Role of each sample in a pair into a single value.

2 Answers

Stack Overflow user

Answered on 2018-05-29 02:47:14

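One solution is a double left_join using the tidyverse packages. First, join kinshipEstimates with sampleManifest on ID1 = SampleName. Then join the result with sampleManifest again, this time on ID2 = SampleName. Finally, use tidyr::unite to combine Role.x and Role.y into a single column.
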
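library(tidyverse)

# First join: look up the manifest row for ID1, then drop the columns we don't need.
left_join(kinshipEstimates, sampleManifest, by=c("ID1" = "SampleName")) %>%
  select(-Status, -Sex) %>%
  # Second join: look up the manifest row for ID2; the two Role columns become Role.x and Role.y.
  left_join(sampleManifest, by=c("ID2" = "SampleName")) %>%
  # Paste the two roles together as "role1-role2".
  unite(Roles, Role.x, Role.y, sep="-") %>%
  select(-Sex, -Status)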


#      FID                 ID1                 ID2 Kinship Relationship           Roles
# 1 AU0560  AU056001_00HI1299A  AU056002_00HI1301A  0.0283 full-sibling sibling-proband
# 2 AU0780 AU0780201_00HI1775A AU0780202_00HI1777A -0.0016    unrelated   father-mother
# 3 AU0780 AU0780201_00HI1775A AU0780301_00HI1778A  0.2840 parent-child  father-proband
# 4 AU0780 AU0780202_00HI1777A AU0780301_00HI1778A  0.2460 parent-child  mother-proband

Data:

sampleManifest <- read.table(text = 
"SampleName          Status          Role          Sex
AU056001_00HI1299A  unaffected      sibling       female
AU056002_00HI1301A  unaffected      proband       male  
AU0780201_00HI1775A unaffected      father        male  
AU0780202_00HI1777A unaffected      mother        female
AU0780301_00HI1778A affected        proband       male",
stringsAsFactors = FALSE, header = TRUE)

kinshipEstimates <- read.table(text = 
"FID    ID1                      ID2             Kinship Relationship    
AU0560 AU056001_00HI1299A  AU056002_00HI1301A   0.0283  full-sibling   
AU0780 AU0780201_00HI1775A AU0780202_00HI1777A -0.00160 unrelated   
AU0780 AU0780201_00HI1775A AU0780301_00HI1778A  0.284   parent-child
AU0780 AU0780202_00HI1777A AU0780301_00HI1778A  0.246   parent-child",
stringsAsFactors = FALSE, header = TRUE)
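
For comparison, the same two lookups can also be sketched in base R without any join, by turning the manifest into a named vector and indexing it by sample ID. This is not part of the original answer. The names roleOf and withRoles are made up for illustration, the data frames are trimmed copies of the question's data, and the sketch assumes every SampleName appears only once in the manifest.

```r
# Trimmed copies of the question's data (only the columns used here).
sampleManifest <- data.frame(
  SampleName = c("AU056001_00HI1299A", "AU056002_00HI1301A",
                 "AU0780201_00HI1775A", "AU0780202_00HI1777A",
                 "AU0780301_00HI1778A"),
  Role = c("sibling", "proband", "father", "mother", "proband"),
  stringsAsFactors = FALSE
)

kinshipEstimates <- data.frame(
  FID = c("AU0560", "AU0780", "AU0780", "AU0780"),
  ID1 = c("AU056001_00HI1299A", "AU0780201_00HI1775A",
          "AU0780201_00HI1775A", "AU0780202_00HI1777A"),
  ID2 = c("AU056002_00HI1301A", "AU0780202_00HI1777A",
          "AU0780301_00HI1778A", "AU0780301_00HI1778A"),
  Kinship = c(0.0283, -0.00160, 0.284, 0.246),
  Relationship = c("full-sibling", "unrelated", "parent-child", "parent-child"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a named vector mapping sample name -> role.
# Assumes SampleName is unique in the manifest.
roleOf <- setNames(sampleManifest$Role, sampleManifest$SampleName)

# Look up the role of each ID by name and paste the pair together.
withRoles <- kinshipEstimates
withRoles$Roles <- paste(roleOf[kinshipEstimates$ID1],
                         roleOf[kinshipEstimates$ID2],
                         sep = "-")

print(withRoles[, c("FID", "ID1", "ID2", "Roles", "Kinship", "Relationship")])
```

One caveat with this variant: if an ID is missing from the manifest, the lookup returns NA and paste() turns it into the literal string "NA", so it is worth checking for unmatched IDs first.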
Votes: 3

Stack Overflow user

Answered on 2018-05-29 09:17:29

Here is an approach that uses gather, a single inner_join, and group_by.

Adding a row number lets us keep track of the ID1/ID2 pairs when grouping:

kinshipEstimates %>%
  mutate(row_num = row_number()) %>%
  gather(which_id, id, -row_num, -FID, -Kinship, -Relationship) %>%
  inner_join(sampleManifest, by=c("id" = "SampleName")) %>%
  group_by(FID, row_num) %>%
  summarise(Roles = paste(Role, collapse="-"),
            Kinship = first(Kinship),
            Relationship = first(Relationship))

  FID    row_num Roles            Kinship Relationship
  <chr>    <int> <chr>              <dbl> <chr>       
1 AU0560       1 sibling-proband  0.0283  full-sibling
2 AU0780       2 father-mother   -0.00160 unrelated   
3 AU0780       3 father-proband   0.284   parent-child
4 AU0780       4 mother-proband   0.246   parent-child
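
The same reshape, join, and summarise idea can be sketched with pivot_longer, which superseded gather in tidyr 1.0.0, while also keeping ID1 and ID2 in the result. This is a variation rather than the answerer's code. It assumes dplyr and tidyr (1.0.0 or later) are installed, the names kinNumbered, roles, and result are made up for illustration, and the data frames are trimmed copies of the question's data.

```r
library(dplyr)
library(tidyr)

# Trimmed copies of the question's data (only the columns used here).
sampleManifest <- data.frame(
  SampleName = c("AU056001_00HI1299A", "AU056002_00HI1301A",
                 "AU0780201_00HI1775A", "AU0780202_00HI1777A",
                 "AU0780301_00HI1778A"),
  Role = c("sibling", "proband", "father", "mother", "proband"),
  stringsAsFactors = FALSE
)

kinshipEstimates <- data.frame(
  FID = c("AU0560", "AU0780", "AU0780", "AU0780"),
  ID1 = c("AU056001_00HI1299A", "AU0780201_00HI1775A",
          "AU0780201_00HI1775A", "AU0780202_00HI1777A"),
  ID2 = c("AU056002_00HI1301A", "AU0780202_00HI1777A",
          "AU0780301_00HI1778A", "AU0780301_00HI1778A"),
  Kinship = c(0.0283, -0.00160, 0.284, 0.246),
  Relationship = c("full-sibling", "unrelated", "parent-child", "parent-child"),
  stringsAsFactors = FALSE
)

# Number the pairs so each ID1/ID2 row can be identified after reshaping.
kinNumbered <- kinshipEstimates %>%
  mutate(row_num = row_number())

# One row per (pair, sample), a single join to the manifest,
# then collapse the two roles of each pair into one string.
roles <- kinNumbered %>%
  select(row_num, ID1, ID2) %>%
  pivot_longer(c(ID1, ID2), names_to = "which_id", values_to = "id") %>%
  inner_join(sampleManifest, by = c("id" = "SampleName")) %>%
  arrange(row_num, which_id) %>%   # make the ID1-before-ID2 order explicit
  group_by(row_num) %>%
  summarise(Roles = paste(Role, collapse = "-"))

# Attach the combined roles back to the original pairs.
result <- kinNumbered %>%
  left_join(roles, by = "row_num") %>%
  select(FID, ID1, ID2, Roles, Kinship, Relationship)

print(result)
```

Joining the summarised roles back by row_num avoids carrying Kinship and Relationship through the summarise step with first().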
Votes: 1

Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/50571304
