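This is hard to code and even harder to explain. I apologize if my explanation is confusing; I will try to explain it using the data.
I have a dataset with 3 columns: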
ID Vaccine Time
1 A Winter
1 B Spring
2 A Spring
2 B Winter
2 B Fall
3 C Fall
3 A Fall
3 B Fall
4 A Winter
4 A Spring
5 A Winter
As you can see, patients took any or all of the 3 vaccines.
- A total of 5 patients took Vaccine A
  - 4 patients (patient 1, patient 2, patient 3, patient 5) took the vaccine only once
    - Winter
      - Patient 1
      - Patient 5
    - Spring
      - Patient 2
    - Fall
      - Patient 3
  - 1 patient (patient 4) took the vaccine twice, during Winter and Spring
Vaccine  Winter.Only  Spring.Only  Fall.Only  Winter.Spring  Winter.Fall  Spring.Fall
A        2            1            1          1
- A total of 3 patients took Vaccine B
  - 2 patients took the vaccine only once (patient 1, patient 3)
    - Spring
      - Patient 1
    - Fall
      - Patient 3
  - 1 patient (patient 2) took the vaccine twice, during Winter and Fall
Vaccine  Winter.Only  Spring.Only  Fall.Only  Winter.Spring  Winter.Fall  Spring.Fall
B                     1            1                         1
- A total of 1 patient took Vaccine C
  - 1 patient took the vaccine only once (patient 3)
    - Fall
      - Patient 3
Vaccine  Winter.Only  Spring.Only  Fall.Only  Winter.Spring  Winter.Fall  Spring.Fall
C                                  1
The final dataset should look like this:
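Vaccine  Winter.Only  Spring.Only  Fall.Only  Winter.Spring  Winter.Fall  Spring.Fall
A        2            1            1          1
B                     1            1                         1
C                                  1
I am mainly trying to create a dataset with one row per vaccine that shows how many patients took that vaccine only once and when (Winter, Spring, or Fall), and how many patients took two or three doses of the same vaccine (Winter.Spring, Winter.Fall, Spring.Fall, or Winter.Spring.Fall).
Any ideas or suggestions on how to do this would be much appreciated.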
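To make the example reproducible, the sample data can be entered in R roughly as follows. This is only a sketch: the data frame name `df` is an assumption (chosen to match the code in the answer), and `patients_per_vaccine` is a hypothetical helper name used to check the per-vaccine totals stated above (5 for A, 3 for B, 1 for C).

```r
library(dplyr)

# Sample data from the question. The name `df` is an assumption.
df <- data.frame(
  ID      = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  Vaccine = c("A", "B", "A", "B", "B", "C", "A", "B", "A", "A", "A"),
  Time    = c("Winter", "Spring", "Spring", "Winter", "Fall", "Fall",
              "Fall", "Fall", "Winter", "Spring", "Winter")
)

# Number of distinct patients per vaccine, as a sanity check
# against the totals listed in the question.
patients_per_vaccine <- df %>%
  group_by(Vaccine) %>%
  summarise(patients = n_distinct(ID), .groups = "drop")
```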
Posted on 2022-02-26 07:41:22
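library(dplyr); library(tidyr)
df %>%
  # Sort seasons alphabetically so combined labels are built in a consistent order
  arrange(Time) %>%
  # One group per patient and vaccine
  group_by(ID, Vaccine) %>%
  # Join all seasons in which this patient took this vaccine into one label
  summarize(Times = paste(Time, collapse = "_and_"), .groups = "drop") %>%
  # Count patients per vaccine and label
  count(Vaccine, Times) %>%
  # Spread the labels into one column each
  pivot_wider(names_from = Times, values_from = n)
Result: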
# A tibble: 3 x 6
  Vaccine  Fall Spring Spring_and_Winter Winter Fall_and_Winter
  <chr>   <int>  <int>             <int>  <int>           <int>
1 A           1      1                 1      2              NA
2 B           1      1                NA     NA               1
3 C           1     NA                NA     NA              NA
https://stackoverflow.com/questions/71274663
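The result above labels columns alphabetically (for example `Spring_and_Winter`), while the question asks for names such as `Winter.Only` and `Winter.Spring`. One possible variation, offered as a sketch rather than as the answer author's method, is to treat `Time` as a factor ordered Winter, Spring, Fall and to append `.Only` when a patient took a vaccine in a single season. The names `seasons`, `wanted`, and `result` are hypothetical, and `df` is rebuilt here so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Sample data from the question. The name `df` is an assumption.
df <- data.frame(
  ID      = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  Vaccine = c("A", "B", "A", "B", "B", "C", "A", "B", "A", "A", "A"),
  Time    = c("Winter", "Spring", "Spring", "Winter", "Fall", "Fall",
              "Fall", "Fall", "Winter", "Spring", "Winter")
)

# Assumed season order and column order, taken from the question's desired table.
seasons <- c("Winter", "Spring", "Fall")
wanted  <- c("Winter.Only", "Spring.Only", "Fall.Only",
             "Winter.Spring", "Winter.Fall", "Spring.Fall",
             "Winter.Spring.Fall")

result <- df %>%
  # Guard against the same vaccine being recorded twice in one season
  distinct(ID, Vaccine, Time) %>%
  # Sort by season order instead of alphabetically
  mutate(Time = factor(Time, levels = seasons)) %>%
  arrange(Time) %>%
  group_by(ID, Vaccine) %>%
  # Single season: "<Season>.Only"; several seasons: joined with "."
  summarise(
    Times = if (n() == 1) paste0(Time, ".Only") else paste(Time, collapse = "."),
    .groups = "drop"
  ) %>%
  count(Vaccine, Times) %>%
  pivot_wider(names_from = Times, values_from = n) %>%
  # Keep the requested column order; combinations absent from the data are skipped
  select(Vaccine, any_of(wanted))
```

Combinations that no patient has (here `Spring.Fall`) simply do not appear as columns, and empty cells are `NA`. If zeros are preferred, `pivot_wider()` accepts a `values_fill` argument.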