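I know how to retrieve the gene list for a specific pathway from the KEGG website, but I could not find any package that does the same thing in R. The only annotation package I found is KEGG.db, and it only gives the list of pathways available in KEGG.
On the website I use the KEGG API (http://www.kegg.jp/kegg/docs/keggapi.html): I enter the pathway ID and look it up on KEGG, like this for the cell cycle genes:
http://rest.kegg.jp/get/hsa04110
Does anyone know of a package or other solution in R that could help me with this?
Thanks in advance,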
Posted on 2015-02-25 17:13:45
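After re-reading your question, I believe this is an R package that can help you. It is on Bioconductor and lets you interact with KEGG from R through its REST interface.
KEGGREST
KEGGREST: Client-side REST access to KEGG. A package that provides a client interface to the KEGG REST server. Based on KEGGSOAP by J. Zhang, R. Gentleman, and Marc Carlson, and KEGG (python package) by Aurelien Mazurie.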
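As a rough sketch of how this could apply to the cell cycle example from the question: keggGet() returns a list of parsed entries, and for a pathway entry the GENE element is, as far as I can tell, a character vector that alternates Entrez IDs with "SYMBOL; description" strings. The helper names split_kegg_gene_field() and get_pathway_genes() are made up for illustration and are not part of KEGGREST.

```r
# Hypothetical helper: turn the alternating ID / "SYMBOL; description"
# vector (assumed layout of the GENE element) into a data frame.
split_kegg_gene_field <- function(gene_field) {
  gene_field <- as.character(gene_field)   # NULL becomes character(0)
  stopifnot(length(gene_field) %% 2 == 0)  # expect ID/info pairs
  ids  <- gene_field[c(TRUE, FALSE)]       # odd positions: Entrez IDs
  info <- gene_field[c(FALSE, TRUE)]       # even positions: "SYMBOL; description"
  data.frame(
    entrez_id   = ids,
    symbol      = sub(";.*$", "", info),
    description = sub("^[^;]*;\\s*", "", info),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: fetch one pathway entry from KEGG and parse its genes.
# Needs the KEGGREST package and network access when it is actually called.
get_pathway_genes <- function(pathway_id) {
  entry <- KEGGREST::keggGet(pathway_id)[[1]]
  split_kegg_gene_field(entry$GENE)
}
```

Calling get_pathway_genes("hsa04110") should then give one row per cell cycle gene. An info string without a semicolon ends up unchanged in both the symbol and description columns, so check those rows by hand.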
Posted on 2022-05-26 08:30:52
Recently I found two ways to get the KEGG pathways and their genes (one of them uses the previously suggested package, KEGGREST).
FIRST WAY
library(limma)
library(AnnotationDbi)
library(org.Hs.eg.db)
# We get entrez ids and their pathways.
gene_pathways <- getGeneKEGGLinks(species="hsa")
# This is to get the gene symbols using entrez ids
gene_pathways$Symbol <- mapIds(org.Hs.eg.db, gene_pathways$GeneID,
                               column="SYMBOL", keytype="ENTREZID")
# pathway names
pathway_names <- getKEGGPathwayNames(species="hsa")
KEGG_pathways <- merge(gene_pathways, pathway_names, by="PathwayID")
Output:
head(KEGG_pathways)
PathwayID GeneID Symbol Description
1 path:hsa00010 10327 AKR1A1 Glycolysis / Gluconeogenesis - Homo sapiens (human)
2 path:hsa00010 124 ADH1A Glycolysis / Gluconeogenesis - Homo sapiens (human)
3 path:hsa00010 125 ADH1B Glycolysis / Gluconeogenesis - Homo sapiens (human)
4 path:hsa00010 126 ADH1C Glycolysis / Gluconeogenesis - Homo sapiens (human)
5 path:hsa00010 127 ADH4 Glycolysis / Gluconeogenesis - Homo sapiens (human)
SECOND WAY
library(KEGGREST)
library(org.Hs.eg.db)
library(tidyverse)
# get pathways and their entrez gene ids
hsa_path_entrez <- keggLink("pathway", "hsa") %>%
  tibble(pathway = ., eg = sub("hsa:", "", names(.)))
# get gene symbols and ensembl ids using entrez gene ids
hsa_kegg_anno <- hsa_path_entrez %>%
  mutate(
    symbol = mapIds(org.Hs.eg.db, eg, "SYMBOL", "ENTREZID"),
    ensembl = mapIds(org.Hs.eg.db, eg, "ENSEMBL", "ENTREZID")
  )
# Pathway names
hsa_pathways <- keggList("pathway", "hsa") %>%
  tibble(pathway = names(.), description = .)
# join on the shared pathway column (stated explicitly to avoid the join message)
KEGG_pathways <- left_join(hsa_kegg_anno, hsa_pathways, by = "pathway")
Output:
head(KEGG_pathways)
A tibble: 6 x 5
pathway eg symbol ensembl description
<chr> <chr> <chr> <chr> <chr>
1 path:hsa00010 10327 AKR1A1 ENSG00000117448 Glycolysis / Gluconeogenesis - Homo sapiens (human)
2 path:hsa00010 124 ADH1A ENSG00000187758 Glycolysis / Gluconeogenesis - Homo sapiens (human)
3 path:hsa00010 125 ADH1B ENSG00000196616 Glycolysis / Gluconeogenesis - Homo sapiens (human)
4 path:hsa00010 126 ADH1C ENSG00000248144 Glycolysis / Gluconeogenesis - Homo sapiens (human)
5 path:hsa00010 127 ADH4 ENSG00000198099 Glycolysis / Gluconeogenesis - Homo sapiens (human)
If for some reason you need to query another species, just replace "hsa" with that organism's code. The call keggList("organism") returns the list of available species.
org <- keggList("organism")
head(org)
T.number organism species phylogeny
[1,] "T01001" "hsa" "Homo sapiens (human)" "Eukaryotes;Animals;Vertebrates;Mammals"
[2,] "T01005" "ptr" "Pan troglodytes (chimpanzee)" "Eukaryotes;Animals;Vertebrates;Mammals"
[3,] "T02283" "pps" "Pan paniscus (bonobo)" "Eukaryotes;Animals;Vertebrates;Mammals"
[4,] "T02442" "ggo" "Gorilla gorilla gorilla (western lowland gorilla)" "Eukaryotes;Animals;Vertebrates;Mammals"
[5,] "T01416" "pon" "Pongo abelii (Sumatran orangutan)" "Eukaryotes;Animals;Vertebrates;Mammals"
Note: although I used org.Hs.eg.db to get the gene symbols, they can also be obtained from biomaRt.
library(biomaRt)
mart <- useDataset("hsapiens_gene_ensembl", useMart("ensembl"))
attributes <- listAttributes(mart)
genes <- getBM(attributes = c("hgnc_symbol", "entrezgene_id"),
               mart = mart)
Additional useful information about KEGGREST can be found in its vignette.
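The biomaRt query above only fetches a symbol table; it does not attach the symbols to the KEGG table. One way to do that join is sketched below, assuming the genes data frame has the hgnc_symbol and entrezgene_id columns requested in getBM() and that the KEGG table stores Entrez IDs as character in a column such as eg (second way) or GeneID (first way). The function name add_biomart_symbols() and its id_column argument are hypothetical.

```r
# Hypothetical helper: attach biomaRt HGNC symbols to a KEGG gene/pathway table.
add_biomart_symbols <- function(kegg_table, genes, id_column = "eg") {
  # biomaRt usually returns Entrez IDs as integers; the KEGG tables hold strings
  genes$entrezgene_id <- as.character(genes$entrezgene_id)
  # drop rows with no Entrez ID or an empty symbol
  keep <- !is.na(genes$entrezgene_id) &
    !is.na(genes$hgnc_symbol) &
    genes$hgnc_symbol != ""
  genes <- unique(genes[keep, c("entrezgene_id", "hgnc_symbol")])
  # left join: keep every KEGG row, with NA where biomaRt has no symbol
  merge(kegg_table, genes,
        by.x = id_column, by.y = "entrezgene_id", all.x = TRUE)
}
```

For example, add_biomart_symbols(KEGG_pathways, genes) would work on the second table, and id_column = "GeneID" on the first. If one Entrez ID maps to several HGNC symbols, the corresponding KEGG rows are duplicated, once per symbol.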
https://stackoverflow.com/questions/28724674