您好,我有一个文本文件summary.txt,如下所示
snp_fp_overlapp_count: 0
snp_fn_overlapp_count: 0
snp_fn_ca_overlapp_count: 0
snp_fn_ca_0: 0
indel_fp_overlapp_count: 0
indel_fn_overlapp_count: 0
indel_fn_ca_overlapp_count: 0
indel_fn_ca_0: 0
-------------------------------------
snp_hard_count: 0
indel_hard_count: 0
unknown_count: 0
-------------------------------------
total_baseline_snp: 12405
total_baseline_indel: 1137
-------------------------------------
Precision_SNP: 0.790329
Sensitivity_SNP: 0.972350
F_Measure_SNP: 0.871941
-------------------------------------
Precision_INDEL: 0.119915
Sensitivity_INDEL: 0.941073
F_Measure_INDEL: 0.212724
-------------------------------------
Annotation Percent Match:
-------------------------------------
-------------------------------------我需要将指标名称转换为列,并将其值转换为R中的第二行,如下所示
Precision_INDEL Sensitivity_INDEL F_Measure_INDEL
0.119915 0.941073 0.212724不知道该怎么做,我一直在纠结于tidyverse,这似乎不是解决方案
mydata = read_table("summary.txt")
mydata %>%
rownames_to_column() %>%
gather(variable, value, -rowname) %>%
separate(variable ,sep = ":",into = c("metric","value")) %>%
spread(rowname, value)任何指针/解决方案都会很有用
谢谢
发布于 2021-06-25 23:32:37
这是你需要的吗?我不确定你所说的‘以表格形式’是什么意思。
dat = data.table::fread('snp_fp_overlapp_count: 0
snp_fn_overlapp_count: 0
snp_fn_ca_overlapp_count: 0
snp_fn_ca_0: 0
indel_fp_overlapp_count: 0
indel_fn_overlapp_count: 0
indel_fn_ca_overlapp_count: 0
indel_fn_ca_0: 0
snp_hard_count: 0
indel_hard_count: 0
unknown_count: 0
total_baseline_snp: 12405
total_baseline_indel: 1137
Precision_SNP: 0.790329
Sensitivity_SNP: 0.972350
F_Measure_SNP: 0.871941
Precision_INDEL: 0.119915
Sensitivity_INDEL: 0.941073
F_Measure_INDEL: 0.212724
Annotation Percent Match:
', sep=" ")a = dat %>% tidyr::pivot_wider(names_from="V1", values_from="V2")# A tibble: 1 x 19
`snp_fp_overlapp_co… `snp_fn_overlapp_co… `snp_fn_ca_overlapp_… `snp_fn_ca_0:`
<dbl> <dbl> <dbl> <dbl>
1 0 0 0 0
# … with 15 more variables: indel_fp_overlapp_count: <dbl>,
# indel_fn_overlapp_count: <dbl>, indel_fn_ca_overlapp_count: <dbl>,
# indel_fn_ca_0: <dbl>, snp_hard_count: <dbl>, indel_hard_count: <dbl>,
# unknown_count: <dbl>, total_baseline_snp: <dbl>,
# total_baseline_indel: <dbl>, Precision_SNP: <dbl>, Sensitivity_SNP: <dbl>,
# F_Measure_SNP: <dbl>, Precision_INDEL: <dbl>, Sensitivity_INDEL: <dbl>,
# F_Measure_INDEL: <dbl>> colnames(a)
[1] "snp_fp_overlapp_count:" "snp_fn_overlapp_count:"
[3] "snp_fn_ca_overlapp_count:" "snp_fn_ca_0:"
[5] "indel_fp_overlapp_count:" "indel_fn_overlapp_count:"
[7] "indel_fn_ca_overlapp_count:" "indel_fn_ca_0:"
[9] "snp_hard_count:" "indel_hard_count:"
[11] "unknown_count:" "total_baseline_snp:"
[13] "total_baseline_indel:" "Precision_SNP:"
[15] "Sensitivity_SNP:" "F_Measure_SNP:"
[17] "Precision_INDEL:" "Sensitivity_INDEL:"
[19] "F_Measure_INDEL:"
> as.numeric(a[1,])
[1] 0.000000 0.000000 0.000000 0.000000 0.000000
[6] 0.000000 0.000000 0.000000 0.000000 0.000000
[11] 0.000000 12405.000000 1137.000000 0.790329 0.972350
[16] 0.871941 0.119915 0.941073 0.212724https://stackoverflow.com/questions/68133118
复制相似问题