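I am trying to rearrange a table in R.
I have 52 lions (for example). Blood samples taken from each lion at 4 time points were tested for 92 different markers. Currently I have a column of lion IDs that is 208 rows long, next to a "Sample Number" column that gives the time point at which the sample was taken (time 1, 2, 3 or 4). These are followed by the blood sample values for the 92 markers, for 94 columns in total (ID, sample number, and one column per marker).
See data: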
'ID' 'Sample Number' 'Marker type'
-Lion 1 time 1 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 1 time 2 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 1 time 3 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 1 time 4 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 2 time 1 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 2 time 2 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 2 time 3 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 2 time 4 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 3 time 1 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 3 time 2 Marker 1 Marker 2 Marker 3...Marker 92
-Lion 3 time 3 Marker 1 Marker 2 Marker 3...Marker 92
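-Lion 3 time 4 Marker 1 Marker 2 Marker 3...Marker 92
I need to reshape it so that there is one row per lion ID (52 rows, instead of 4 rows per lion), and then, for each of the 92 markers, 4 columns, one per sample number, giving 369 columns in total.
Expected output data: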
'ID' 'Sample Number' 'Marker type'
lion 1 time 1 marker 1 time 2 marker 1 time 3 marker 1 time 4 marker 1
lion 2 time 1 marker 2 time 2 marker 2 time 3 marker 2 time 4 marker 2
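lion 3 time 1 marker 3 time 2 marker 3 time 3 marker 3 time 4 marker 3
I don't want to create a new variable such as "time 1 marker 1" by hand. Instead I want the marker 1 column split into 4 columns by time, with 1 row per lion, and the same for marker 2, and so on.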
Posted on 2020-01-15 01:41:49
I think we can use pivot_wider here:
tidyr::pivot_wider(df, names_from = Samp_Num, values_from = Mark1:Mark3)
#OR
#tidyr::pivot_wider(df, names_from = Samp_Num, values_from = starts_with("Mark"))
# A tibble: 3 x 13
# ID Mark1_time1 Mark1_time2 Mark1_time3 Mark1_time4 Mark2_time1 Mark2_time2
# <fct> <fct> <fct> <fct> <fct> <fct> <fct>
#1 Lion1 Marker1 Marker1 Marker1 Marker1 Marker2 Marker2
#2 Lion2 Marker1 Marker1 Marker1 Marker1 Marker2 Marker2
#3 Lion3 Marker1 Marker1 Marker1 Marker1 Marker2 Marker2
# … with 6 more variables: Mark2_time3 <fct>, Mark2_time4 <fct>,
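# Mark3_time1 <fct>, Mark3_time2 <fct>, Mark3_time3 <fct>, Mark3_time4 <fct>
In my sample data there are 3 Mark columns, which become 13 columns in the final output (3 markers * 4 time points + 1 ID column). Likewise, for the actual data you should get 369 columns (92 * 4 + 1).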
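The column arithmetic above can be checked directly. The following is only a sketch under stated assumptions: the data frame `lions` and its random numeric values are made up for illustration (the real blood values are not shown in the question), and the column names `ID`, `Samp_Num` and `Mark1` to `Mark3` simply follow the sample data used in this answer.

```r
library(tidyr)

# Hypothetical numeric stand-in for the real data:
# 3 lions x 4 time points, with 3 marker columns instead of 92.
set.seed(1)
lions <- data.frame(
  ID       = rep(paste0("Lion", 1:3), each = 4),
  Samp_Num = rep(paste0("time", 1:4), times = 3),
  Mark1    = rnorm(12),
  Mark2    = rnorm(12),
  Mark3    = rnorm(12)
)

# One row per lion; each marker column is spread over the 4 time points.
wide <- pivot_wider(lions, names_from = Samp_Num, values_from = Mark1:Mark3)

# Expected shape: one row per lion, markers * time points + 1 ID column.
n_markers <- 3
n_times   <- 4
stopifnot(nrow(wide) == 3, ncol(wide) == n_markers * n_times + 1)
```

With the real data, the same check would use `n_markers <- 92` and 52 rows, assuming every lion has exactly one row for each of the 4 time points.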
data
I created a small sample dataset:
df <- structure(list(ID = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L), .Label = c("Lion1", "Lion2", "Lion3"), class = "factor"),
Samp_Num = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L,
2L, 3L, 4L), .Label = c("time1", "time2", "time3", "time4"
), class = "factor"), Mark1 = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Marker1", class = "factor"),
Mark2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = "Marker2", class = "factor"), Mark3 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Marker3", class = "factor")),
class = "data.frame", row.names = c(NA, -12L))
https://stackoverflow.com/questions/59743897