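I have a dataset with duplicated rows. For example, in the first two rows of the data frame, everything from cell and drug through pathway is identical; only activity differs. The first row has RESISTANT in the activity column and the second row has SENSITIVE. I want to keep the second row, the one with SENSITIVE in activity.
How can I do that? I want to apply this to every set of duplicated rows in the data frame, keeping the second of the duplicated rows.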
**cell** **drug** **pathway** **activity**
AU656 5-FLORO OTHER RESISTANT
AU656 5-FLORO OTHER SENSITIVE
AU656 ALISERTIB MITOSIS INTERMEDIATE
AU656 ALISERTIB MITOSIS RESISTANT
AU656 AFITINIB EGFR SENSITIVE
AU656 AZD6482 PI3K INTERMEDIATE
AU656 DORAMAPIMOD JNK INTERMEDIATE
AU656 DORAMAPIMOD JNK SENSITIVE

Posted on 2022-10-19 18:52:49
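We group by cell, drug, and pathway, then slice the second row where it exists: the index is the minimum of 2 and the group size (n()), so a group of size 1 returns its first row.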
library(dplyr)

df1 %>%
  group_by(cell, drug, pathway) %>%  # one group per cell/drug/pathway combination
  slice(min(2, n())) %>%             # row 2 if the group has one, otherwise row 1
  ungroup()

-output
# A tibble: 5 × 4
cell drug pathway activity
<chr> <chr> <chr> <chr>
1 AU656 5-FLORO OTHER SENSITIVE
2 AU656 AFITINIB EGFR SENSITIVE
3 AU656 ALISERTIB MITOSIS RESISTANT
4 AU656 AZD6482 PI3K INTERMEDIATE
5 AU656 DORAMAPIMOD JNK SENSITIVE

data
df1 <- structure(list(cell = c("AU656", "AU656", "AU656", "AU656", "AU656",
"AU656", "AU656", "AU656"), drug = c("5-FLORO", "5-FLORO", "ALISERTIB",
"ALISERTIB", "AFITINIB", "AZD6482", "DORAMAPIMOD", "DORAMAPIMOD"
), pathway = c("OTHER", "OTHER", "MITOSIS", "MITOSIS", "EGFR",
"PI3K", "JNK", "JNK"), activity = c("RESISTANT", "SENSITIVE",
"INTERMEDIATE", "RESISTANT", "SENSITIVE", "INTERMEDIATE", "INTERMEDIATE",
"SENSITIVE")), class = "data.frame", row.names = c(NA, -8L))

https://stackoverflow.com/questions/74130388
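For comparison, here is a base R sketch that needs no packages. It is not the answer's method: it keeps the last row of each cell/drug/pathway combination rather than the second. The two agree only when no group has more than two rows, as in the example data. The names keys and last_per_key are illustrative, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Rebuild the example data (same values as df1 in the data section).
df1 <- data.frame(
  cell = rep("AU656", 8),
  drug = c("5-FLORO", "5-FLORO", "ALISERTIB", "ALISERTIB",
           "AFITINIB", "AZD6482", "DORAMAPIMOD", "DORAMAPIMOD"),
  pathway = c("OTHER", "OTHER", "MITOSIS", "MITOSIS",
              "EGFR", "PI3K", "JNK", "JNK"),
  activity = c("RESISTANT", "SENSITIVE", "INTERMEDIATE", "RESISTANT",
               "SENSITIVE", "INTERMEDIATE", "INTERMEDIATE", "SENSITIVE"),
  stringsAsFactors = FALSE
)

# Columns that define a duplicate (hypothetical helper name).
keys <- c("cell", "drug", "pathway")

# duplicated(..., fromLast = TRUE) flags every row that has a later row
# with the same key values, so negating it keeps the last row per key.
last_per_key <- df1[!duplicated(df1[keys], fromLast = TRUE), ]
rownames(last_per_key) <- NULL

print(last_per_key)
```

Unlike the grouped dplyr result, this keeps the rows in their original order instead of sorting them by the grouping columns. If groups can hold three or more rows and the second row specifically is wanted, the slice(min(2, n())) approach is the one to use.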