Difficulty downloading TCGA data

Stack Overflow user
Asked 2020-01-18 08:20:30
Answers: 1, Views: 397, Followers: 0, Votes: 1

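I am trying to download TCGA data, but I get the following error:

Error in summarizeMaf(maf = maf, anno = clinicalData, chatty = verbose) : Tumor_Sample_Barcode column not found in provided clinical data. Rename column containing sample names to Tumor_Sample_Barcode if necessary.

Here is my code: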

library("TCGAbiolinks")
library("tidyverse")
library(maftools)
query <- GDCquery(   project = "TCGA-LIHC",  
                     data.category = "Clinical", 
                     file.type = "xml", 
                     legacy = FALSE)
GDCdownload(query,directory = ".")

clinical <- GDCprepare_clinic(query, clinical.info = "patient",directory = ".")
#getting the survival time of event data
survival_data <- as_tibble(clinical[,c("days_to_last_followup","days_to_death","vital_status","bcr_patient_barcode","patient_id")]) 
survival_data <- filter(survival_data,!is.na(days_to_last_followup)|!is.na(days_to_death))  #not both NA
survival_data <- filter(survival_data,!is.na(days_to_last_followup)|days_to_last_followup>0 &is.na(days_to_death)|days_to_death > 0 ) #ensuring positive values
survival_data <- survival_data[!duplicated(survival_data$patient_id),]  #ensuring no duplicates


dim(survival_data) #should be 371


maf <- GDCquery_Maf("LIHC", pipelines = "muse")
#maf <- GDCquery_Maf("LIHC", pipelines = "somaticsniper")

#clin <- GDCquery_clinic("TCGA-LIHC","clinical")
#print(clin )



laml = read.maf(
  maf,
  clinicalData = clinical,
  removeDuplicatedVariants = TRUE,
  useAll = TRUE,
  gisticAllLesionsFile = NULL,
  gisticAmpGenesFile = NULL,
  gisticDelGenesFile = NULL,
  gisticScoresFile = NULL,
  cnLevel = "all",
  cnTable = NULL,
  isTCGA = TRUE,
  vc_nonSyn = NULL,
  verbose = TRUE
)

1 Answer

Stack Overflow user

Answered 2020-01-28 22:21:51

You should have (a) loaded library(maftools) and (b) included what was printed out before that error message:

-Validating
-Silent variants: 18306 
-Summarizing
--Possible FLAGS among top ten genes:
  TTN
  MUC16
  OBSCN
  FLG
-Processing clinical data
Available fields in provided annotations..
 [1] "bcr_patient_barcode"                              "additional_studies"                              
 [3] "tissue_source_site"                               "patient_id" 
# snipped remaining 78 column names      

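Notice that the first column is not named "Tumor_Sample_Barcode", so you need to follow the helpful error message's instructions and rename the appropriate column, which appears to be the first one.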
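As a minimal sketch of that rename step (not necessarily the exact code from the original answer), the snippet below uses a small made-up data frame standing in for the `clinical` object returned by `GDCprepare_clinic()`. The barcodes and values are hypothetical; only the column names `bcr_patient_barcode` and `patient_id` come from the output shown above.

```r
# Toy stand-in for the clinical data frame; all values are made up.
clinical <- data.frame(
  bcr_patient_barcode = c("TCGA-XX-0001", "TCGA-XX-0002"),
  patient_id = c("0001", "0002"),
  vital_status = c("Alive", "Dead"),
  stringsAsFactors = FALSE
)

# Rename the patient barcode column to the name that read.maf() expects.
# Matching by name rather than by position avoids renaming the wrong
# column if the column order ever changes.
names(clinical)[names(clinical) == "bcr_patient_barcode"] <- "Tumor_Sample_Barcode"

print(names(clinical))
```

With the real data, the renamed `clinical` object would then be passed to `read.maf()` as `clinicalData`, exactly as in the question.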

After doing so I get:

-Validating
-Silent variants: 18306 
-Summarizing
--Possible FLAGS among top ten genes:
  TTN
  MUC16
  OBSCN
  FLG
-Processing clinical data
-Finished in 1.911s elapsed (2.470s cpu) 
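Even when the call finishes without an error, it may be worth checking that the clinical barcodes actually line up with the samples in the MAF. The sketch below is an assumption-laden illustration in base R: the barcodes are made up, and it relies on my understanding that with `isTCGA = TRUE` maftools keeps only the first 12 characters of `Tumor_Sample_Barcode`, which is the patient-level part of a TCGA barcode.

```r
# Hypothetical full-length sample barcodes, as they might appear in a MAF.
maf_barcodes <- c("TCGA-XX-0001-01A-11D-A000-10",
                  "TCGA-XX-0003-01A-11D-A000-10")

# Hypothetical patient-level barcodes from the clinical table.
clinical_barcodes <- c("TCGA-XX-0001", "TCGA-XX-0002")

# Keep the first 12 characters (project, tissue source site, participant).
maf_patients <- substr(maf_barcodes, 1, 12)

# Patients present in both tables, and MAF patients lacking clinical data.
matched <- intersect(maf_patients, clinical_barcodes)
unmatched <- setdiff(maf_patients, clinical_barcodes)

print(matched)
print(unmatched)
```

A non-empty `unmatched` vector would indicate samples whose clinical annotations are missing after the merge.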
Votes: 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/59798758