
Avoiding the guessing of "varying" in reshape()

Stack Overflow user
Asked 2016-02-01 15:35:15
1 answer, 695 views, 0 followers, 1 vote

I am trying to reshape() some time-varying data in R, and I am working with the following dataset:

dframe <- structure(list(participant_id = structure(c(48L, 43L, 51L, 28L, 35L, 65L), .Label = c("PRA", "RA", "ASD", "LAD", "ASDGZV ", "RAGSD", "GREA", "SDFDSA", "DSFG", "FHJ", "RQGA", "AESFD", "RGAV", "FGHDF", "HSGD", "FDGH", "ASDF", "AGSD", "SADF", "SADF", "SF", "XV", "ASDCV", "ASDF", "ASDG", "SDF", "XCVZ", "ZXCV", "ASGV", "SAFDV", "ASDF", "SDFV", "SAFD", "SAFD", "AGS", "FDSGVX", "WAFDS", "DSAZC", "SADCZX", "SADFCX", "DSAFC", "FDSGV", "ADSCXZ", "SDFACZ", "SADFCZ", "AFSDZX", "EAWFDSZ", "FDVCZX", "SADZC", "FSADCZ", "AESFDZC", "WAFDSZC", "SDFC", "FSADC", "DSZXC", "SDAFC", "AFSDZC", "WFADS", "FSDVC", "GSDHBXC", "EFWADSCXZ", "EWAFDSC", "AFDSCZ", "AWEFDC", "AGSFV"), class = "factor"), baseline_pupilsize = c(6, 6, 7, 6, 6, 6), baseline_coe = c(11.19, 13.6, 3.96, 7.64, 6.12, 6.92), baseline_rcb = c(16.74, 25, 25, 18.37, 25, 25), final_pop = c(NA, NA, 7.1, 8, 6, NA), final_coe = c(NA, NA, 5.9263624, 4.89, 11.98, NA), final_rcb = c(NA, NA, 25L, NA, NA, NA)), .Names = c("participant_id", "baseline_pop", "baseline_coe", "baseline_rcb", "final_pop", "final_coe", "final_rcb"), row.names = c(NA, 6L), class = "data.frame")

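These are time-varying data from a longitudinal study, and a subset of a larger dataset that I imported from a source file. I want to extract the pop, coe, and rcb values for the baseline and final study visits (my full dataset has several more visits, which I have omitted for this question).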

I can do the following:

reshape(dframe,idvar='participant_id',v.names = c('pop','coe','rcb'),varying = 2:length(dframe),direction='long')

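However, this ends up with the values in pop being labelled as coe. The reshape2 documentation tells me that I should reference the varying values explicitly to avoid "guessing". So I tried: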

reshape(dframe,idvar='participant_id',v.names = c('pop','coe','rcb'),varying = c('baseline_pop','baseline_coe','baseline_rcb','final_pop','final_coe','final_rcb'),direction='long')

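This produces exactly the same output, even though the varying argument is now named explicitly. What am I doing wrong? Presumably pop ends up with the values of coe because of alphabetical ordering, but I don't understand why that would happen now that I have stated the varying argument explicitly.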

Edit: the expected output is as follows:

participant_id  time    pop coe         rcb
FDVCZX          1       6   11.19       16.74
ADSCXZ          1       6   13.6        25
AESFDZC         1       7   3.96        25
ZXCV            1       6   7.64        18.37
AGS             1       6   6.12        25
AGSFV           1       6   6.92        25
FDVCZX          2       NA  NA          NA
ADSCXZ          2       NA  NA          NA
AESFDZC         2       7.1 5.926362    25
ZXCV            2       8   4.89        NA
AGS             2       6   11.98       NA
AGSFV           2       NA  NA          NA

However, as you will see when you run the code, the pop values end up in the coe column, and vice versa.


1 Answer

Stack Overflow user

Accepted answer

Answered 2016-02-01 15:51:29

We can use melt from data.table, which can take multiple measure columns.

library(data.table)
melt(setDT(dframe), measure=patterns('pop', 'coe', 'rcb'), 
     value.name = c('pop', 'coe', 'rcb'), variable.name='time')
#    participant_id time pop       coe   rcb
# 1:         FDVCZX    1 6.0 11.190000 16.74
# 2:         ADSCXZ    1 6.0 13.600000 25.00
# 3:        AESFDZC    1 7.0  3.960000 25.00
# 4:           ZXCV    1 6.0  7.640000 18.37
# 5:            AGS    1 6.0  6.120000 25.00
# 6:          AGSFV    1 6.0  6.920000 25.00
# 7:         FDVCZX    2  NA        NA    NA
# 8:         ADSCXZ    2  NA        NA    NA
# 9:        AESFDZC    2 7.1  5.926362 25.00
#10:           ZXCV    2 8.0  4.890000    NA
#11:            AGS    2 6.0 11.980000    NA
#12:          AGSFV    2  NA        NA    NA
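
For anyone who wants to stay in base R, here is a minimal sketch rather than a definitive fix. It assumes, based on my reading of the reshape source, that a flat `varying` vector combined with `v.names` is grouped via `split()`, which returns the groups sorted by name (coe before pop) while `v.names` keeps the order you typed. That would explain the swap described in the question. Passing `varying` as a list, with one vector per long-format variable in the same order as `v.names`, avoids the regrouping. The data frame `wide` is a hypothetical three-row stand-in for `dframe`, with values copied from the question.

```r
# Hypothetical stand-in for the question's dframe (first three participants)
wide <- data.frame(
  participant_id = c("FDVCZX", "ADSCXZ", "AESFDZC"),
  baseline_pop = c(6, 6, 7),
  baseline_coe = c(11.19, 13.6, 3.96),
  baseline_rcb = c(16.74, 25, 25),
  final_pop = c(NA, NA, 7.1),
  final_coe = c(NA, NA, 5.9263624),
  final_rcb = c(NA, NA, 25)
)

# Illustration of the suspected cause: split() returns groups sorted by name
flat <- c("baseline_pop", "baseline_coe", "baseline_rcb",
          "final_pop", "final_coe", "final_rcb")
grouped <- split(flat, rep(c("pop", "coe", "rcb"), 2))
names(grouped)  # "coe" "pop" "rcb", not the order given in v.names

# Sketch of a fix: one vector per variable, listed in the same order as v.names
long <- reshape(
  wide,
  idvar = "participant_id",
  varying = list(c("baseline_pop", "final_pop"),
                 c("baseline_coe", "final_coe"),
                 c("baseline_rcb", "final_rcb")),
  v.names = c("pop", "coe", "rcb"),
  timevar = "time",
  times = 1:2,
  direction = "long"
)
rownames(long) <- NULL  # drop the "id.time" row names for readability
long
```

With the list form, the pairing between wide columns and long columns is stated explicitly, so nothing depends on alphabetical order. Listing `v.names` alphabetically alongside a flat `varying` vector may also work, but I have not relied on that here.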
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/35134360
