Search - Tencent Cloud Developer Community - Tencent Cloud

From column DeepHub IMBA
Ensemble methods for unsupervised learning: clustering of similarity matrices
fontsize=20) from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score def purity = np.max(contingency_matrix, axis=1)/np.sum(contingency_matrix, axis=1) print( f"Mean Class Purity } - {purity_vector[i]:.2f}" ) cluster_purity_vector = np.zeros( (n_clusters) ) cluster_purity_vector = np.max(contingency_matrix, axis=0)/np.sum(contingency_matrix, axis=0) print( f"Mean Cluster Purity - {np.mean(cluster_purity_vector):.2f}" ) for i in range(n_clusters): print( f" {i} - {cluster_purity_vector
65740 edited on 2023-11-13
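From column 作图丫
An R package for methylation-based tumor purity estimation: InfiniumPurify
The proportion of cancer cells in a tumor sample is called tumor purity. It is an intrinsic property of the sample and can have a potentially important effect on many analyses, including differential methylation, subclonal deconvolution, and subtype clustering. Estimating tumor purity: getPurity data(beta.emp) #beta.emp is a data frame of methylation beta values for 62 tumor and normal samples (1) Estimate the purity of a single tumor sample (no normal samples) purity <- ,K=3, maxiter=5, tol=0.001) #tumor.data: beta values of the tumor samples #purity: estimated purity #k: number of clusters #maximum number of iterations allowed, default 100 #tolerance for EM convergence ,purity = purity) 4. = purity) Output: a matrix of purified tumor beta values for all CpG sites (rows) and tumor samples (columns). Editor's summary: many methods for estimating tumor purity are based on gene expression or copy number variation, whereas the InfiniumPurify shared today
77121 edited on 2022-03-29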
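From column 作图丫
Methods for estimating tumor purity (2): TPES, based on single-nucleotide variants
R package usage 01 Computing sample purity with TPES_purity. Example: TCGA_A8_A0A7 (1) Data used: TCGA_A8_A0A7_seg: the SEG file for sample TCGA_A8_A0A7, as a data frame. (2) Compute purity: TPES_purity(ID= "TCGA-A8-A0A7", SEGfile = TCGA_A8_A0A7_seg, SNVsReadCountsFile = sample: sample ID purity: sample purity estimated by TPES purity.min: minimum sample purity estimated by TPES purity.max: maximum sample purity estimated by TPES n.segs: number of copy-neutral segments used by TPES a graphical report of the allelic fraction values of the putative clonal SNVs used and the density function computed by TPES_purity. TPES: tumor purity estimation from SNVs.
1.7K11 edited on 2022-03-29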
From column 生信技能树
28 TCGA tutorials: early pan-cancer studies
copy-number data (estimations were available for only 11 cancer types)7; LUMP (leukocytes unmethylation for purity The result files are all at: Tumor purity estimates for TCGA samples. Tumor purity estimates according to four methods and the consensus method for all TCGA samples with available The full name of the normalized measure across tumor purity methods is: consensus measurement of purity estimations (CPE) The normalization here is simple, CPE is the median purity The authors then found that median purity levels and median mutational burden correlate very well.
3.9K31 published on 2018-09-21
From column 作图丫
Methods for estimating tumor purity (4): LUMP and PAMES, based on methylation
The LUMP method: LUMP (leukocytes unmethylation for purity) was introduced in the paper Systematic pan-cancer analysis of tumour purity. A figure in that paper shows the correlation between each pair of estimation methods. The PAMES method: PAMES (Purity Assessment from clonal MEthylation Sites) computes purity using a few dozen highly clonal, tumor-type-specific purity <- compute_purity(tumor_toy_data, list(hyper=info_sites.hg38$hyper Systematic pan-cancer analysis of tumour purity. Tumor purity quantification by clonal DNA methylation signatures.
1.2K41 edited on 2022-03-29
From column 生物信息云
TCGA data mining (4): differential expression analysis (4)
of the batches we can continue to work on the original gene matrix for DEA ### ###############Tumor purity filtering########### ###vector containing all TCGA barcodes that have 60% tumor purity or more Purity.BRCA <-TCGAtumor_purity(colnames(dataPrep1), 0, 0, 0, 0, 0.6)$pure_barcodes ################DEA with Molecular BRCA: #diff contains TCGA samples that have an available molecular subtype ###Also Applying Tumor purity criteria diff<-setdiff(Purity.BRCA, TCGA_MolecularSubtype(colnames(dataPrep[,dataSmTP_short]))$filtered
4.9K51 published on 2019-09-18
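From column 云计算D1net
Pure Storage brings flash arrays to OpenStack clouds
Pure's Cinder driver has a REST API and supports iSCSI, and "supports the OpenStack Juno and Ice House releases plus Purity O.E. 3.4.2 and later." The toolkit can run the Purity REST API using Python 2.6 and later." The Pure Storage OpenStack Cinder driver and the Python automation toolkit have now been officially released and are fully compatible with Purity 3.4 and all of its minor releases. Documentation for the Purity OE REST API is included with Pure's flash array products.
94450 published on 2018-03-20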
From column 生物信息云
TCGAbiolinks package help documentation (English version)
TCGAquery_subtype Retrieve molecular subtypes for a given tumor TCGAtumor_purity Filters TCGA barcodes according to purity parameters TCGAvisualize_BarPlot Barplot of subtypes and clinical info in groups methylation or expression TCGA_MolecularSubtype Retrieve molecular subtypes for given TCGA barcodes Tumor.purity TCGA samples with their Tumor Purity measures UseRaw_afterFilter Use raw count from the DataPrep object
2K21 published on 2019-08-07
From column 优雅R
「Shell」Batch-renaming files with sed and mv
fn=%2FPCAWG%2Fconsensus_cnv%2Fconsensus.20170217.purity.ploidy.txt.gz download? somatic.cna.icgc.public.tar.gz Fconsensus.20170119.somatic.cna.tcga.public.tar.gz Fconsensus.20170217.purity.ploidy.txt.gz somatic.cna.icgc.public.tar.gz Fconsensus.20170119.somatic.cna.tcga.public.tar.gz Fconsensus.20170217.purity.ploidy.txt.gz somatic.cna.tcga.public.tar.gz Fpcawg_donor_clinical_August2016_v9.xlsx Fconsensus.20170217.purity.ploidy.txt.gz
1.8K20 published on 2020-07-02
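From column 人工智能头条
How to evaluate unsupervised clustering algorithms
The purity method: purity is an extremely simple clustering evaluation measure. It only requires computing the proportion of correctly clustered documents among all documents (the formula is not preserved in this excerpt), where Ω={ω1,ω2,...,ωk} is the set of clusters and ωk denotes the k-th cluster. The advantage of purity is that it is easy to compute and lies between 0 and 1: a poor clustering scores close to 0 and a perfect clustering scores 1. Its drawback is equally clear: it cannot properly evaluate degenerate clusterings. If an algorithm puts every document into its own cluster, every document counts as correctly classified and purity is 1, which is obviously not the desired result.
2.3K20 published on 2018-07-20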
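The purity measure described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual definition: each cluster is credited with the size of its most frequent true class, and the credits are summed and divided by the number of documents. The function name `purity` and the toy labels are hypothetical and do not come from the article.

```python
from collections import Counter, defaultdict


def purity(true_labels, cluster_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one item is required")
    # Count how often each true class appears inside each cluster.
    classes_per_cluster = defaultdict(Counter)
    for truth, cluster in zip(true_labels, cluster_labels):
        classes_per_cluster[cluster][truth] += 1
    # Credit each cluster with the size of its most frequent class.
    majority_total = sum(max(c.values()) for c in classes_per_cluster.values())
    return majority_total / len(true_labels)


# Hypothetical toy data: six documents, three true classes.
truth = ["a", "a", "a", "b", "b", "c"]
clusters = [0, 0, 1, 1, 1, 2]

mixed = purity(truth, clusters)
# Degenerate case: every document in its own cluster.
singletons = purity(truth, list(range(len(truth))))
print(mixed, singletons)
```

The second call shows the weakness noted in the excerpt: one cluster per document trivially reaches a purity of 1, so purity is usually read alongside a measure that penalizes the number of clusters, such as normalized mutual information.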
From column 小明的数据分析笔记本
Plotting along with Nat. Biotechnol.: an R ComplexHeatmap heatmap summarizing the TCGA-BRCA dataset
12), labels = c( 'Primary tumor', 'Metastatic tumor', 'Adjacent normal'))) ### Purity H.purity <- ComplexHeatmap::Heatmap( rev(sample.info$purity_HTseq_FPKM), column_names_gp = grid::gpar(fontsize = 12), cluster_rows = FALSE, name = 'Tumor purity score', col = viridis::plasma( + H.plate + H.tss + H.tissue + H.pam50.tcga + H.pam50.genefu + H.ls + H.purity
63610 edited on 2023-01-06
From column 生物信息云
Downloading TCGA data with the TCGAbiolinks package for differential expression analysis: a breast cancer case
filtering########### ###vector containing all TCGA barcodes that have 60% tumor purity or more # TCGAtumor_purity (colnames(dataPrep1), 0, 0, 0, 0, 0.6) # filtered is the data that was filtered out; pure_barcodes is the data we want Purity.BRCA<-purityDATA BRCA: ##diff contains TCGA samples that have an available molecular subtype ###Also Applying Tumor purity run setdiff between the purity-filtered data and the molecular subtype data of the tumor tissue data (outliers removed) # that is, return the barcodes in the Purity.BRCA tumor samples that have a molecular subtype, 83 in total. diff<-setdiff(Purity.BRCA,TCGA_MolecularSubtype(colnames(dataPrep[,dataSmTP_short]))$filtered) write.csv
18K106 published on 2019-08-07
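From column 生信技能树
Reading 100 pan-cancer papers: intratumor heterogeneity and genomic instability with the PhyloWGS algorithm
Although PhyloWGS was originally designed for WGS data, its authors state that it can also be used on WES data. The number of clones inferred by PhyloWGS is associated with several factors, including purity (B) inferred Column 10: Top scoring phylogenies mean number of subclonal mutations Column 11: TCGA curated tumor sample purity Column 12: ABSOLUTE inferred tumor sample purity.
1.4K21 published on 2019-05-20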
From column 乐沙弥的世界
PL/SQL --> Package overloading and initialization
PRAGMA RESTRICT_REFERENCES(min_sal, WNPS); END; -- The following code creates a package body that uses purity levels CREATE OR REPLACE PACKAGE BODY purity END; -- After the package body is created, the following errors are reported, because both public functions declare purity level WNPS while their code assigns values to variables scott@ORCL> show errors package body purity ; Errors for PACKAGE BODY PURITY: LINE/COL ERROR -------- --------------------------------------- Subprogram 'MIN_SAL' violates its associated pragma -- The following uses package initialization to assign the variables CREATE OR REPLACE PACKAGE BODY purity ; scott@ORCL> exec :maxsal:=purity.maxsal; scott@ORCL> print minsal maxsal; MINSAL ---------- 800
94020 published on 2018-08-07
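From column 科研菌
Bioinformatics code: data preprocessing (TCGAbiolinks package)
PreprocessingOutput.png width: width of the generated image height: height of the generated image datatype: string describing the data type of the RangedSummarizedExperiment Step 5: TCGAtumor_purity() selects tumor barcodes with tumor purity above 60% # TCGAtumor_purity(barcodes, estimate, absolute, lump, ihc, cpe) uses 5 estimates from 5 methods as thresholds to 0.6 (cpe is a derived consensus measure: the median purity level after normalizing the levels from all methods so that they have equal means and standard deviations) # select samples with tumor purity of at least 60% purityDATA <- TCGAtumor_purity (colnames(dataPrep1), 0, 0, 0, 0, 0.6) # filtered is the data that was filtered out; pure_barcodes is the tumor data we want Purity.LIHC<-purityDATA Step 6: merge the tumor expression matrix with the normal tissue expression matrix and annotate genes # get the 340 tumor samples with purity above 60% + 50 normal tissue samples, 390 samples in total puried_data <-dataPrep2[,c(Purity.LIHC
7.8K76 published on 2021-01-12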
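From column 生信修炼手册
circos visualization manual: links
priority; when links overlap, the higher the priority, the earlier a link is displayed; In links, the appearance setting that most needs tuning is the curvature of the curves. Three parameters control it: bezier_radius crest bezier_radius_purity bezier_radius_purity controls the effective bezier_radius. Besides the curves above, links can also be drawn as ribbons.
1.3K30 published on 2020-05-11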
From column hsdoifh biuwedsy
Classification and regression techniques: decision tree and knn
class distribution are preferred Entropy: measure of node impurity Low entropy = low uncertainty & high purity High entropy = high uncertainty & low purity Measures homogeneity of a node Maximum -> records are equally an impurity measure The set {A,B,C,A,A,A,A,A} has low entropy: low uncertainty and high purity The set {A,B,C,D,B,E,A,F} has high entropy: high uncertainty and low purity -appreciate
65220 published on 2021-05-19
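The two example sets in this excerpt can be checked numerically. The sketch below assumes Shannon entropy with base-2 logarithms, a common choice of node impurity for decision trees; the excerpt does not state the base, and the function name `entropy` is hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of the class distribution in a node."""
    total = len(labels)
    if total == 0:
        raise ValueError("a node needs at least one record")
    counts = Counter(labels)
    # Sum p * log2(1 / p) over the classes present in the node.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


low = entropy(list("ABCAAAAA"))   # dominated by class A
high = entropy(list("ABCDBEAF"))  # classes spread almost evenly
pure = entropy(list("AAAA"))      # a single class only
print(f"{low:.3f} {high:.3f} {pure:.3f}")
```

Under these assumptions the first set comes out at roughly 1.06 bits and the second at 2.5 bits, which matches the excerpt's reading of low entropy as high purity. A node with a single class has entropy 0.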
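From column 芒果先生聊生信
Reproducing bioinformatics papers 36: immune infiltration analysis
In the correlation module, select the tumor type, enter the name of the gene to analyze, and choose a correlation adjustment option (None or Tumor purity). With None (no adjustment), obtain the cor value and p-value. With tumor purity adjustment, obtain the cor value and p-value. Compile the correlation results in a Word table (there is too much data to fit on a PPT slide).
4K21 published on 2020-07-20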
From column 罗西的思考
Alink talk (22): source code analysis of clustering evaluation
Params map(LongMatrix value) { // use extractParamsFromConfusionMatrix to compute purity from the confusion matrix matrix)); } } 5.2 extractParamsFromConfusionMatrix extractParamsFromConfusionMatrix computes purity from the confusion matrix here predictLabel[i] / actualLabel[j])); tp += combination(matrix[i][j]); } purity += max; } purity /= total; mutualInfor /= Math.log(2); long fp = tpFpSum - tp; long , purity) .set(ClusterMetrics.RI, ri) .set(ClusterMetrics.ARI, (tp - expectedIndex) /
1.1K30 published on 2020-09-28
From column 生信技能树
Using ESTIMATE to estimate tumor purity from stromal and immune cell proportions in transcriptome expression data
"Inferring tumour purity and stromal and immune cell admixture from expression data." STromal and Immune cells in MAlignant Tumor tissues using Expression data) is a tool for predicting tumor purity represents the infiltration of immune cells in tumor tissue), and 3) estimate score (that infers tumor purity
7.1K20 published on 2018-07-27
