I am performing a cluster analysis in R.
I have a dataset that looks like this:
geneid S1 S2 S3 S4 M3 M4 M6
ENSRNOG00000000012 0.8032270364 1.5058909297 1.0496307677 1.4168397419 0.2750070475 0.9708536543 1.1570437101
ENSRNOG00000000021 3.0250287945 3.7782085764 3.4449320489 2.7004397181 3.2464080872 3.1795110503 2.9429835982
ENSRNOG00000000024 2.0669502439 2.5210507369 2.2555007331 1.7949356628 1.4382928516 1.9373443922 1.5210507369
ENSRNOG00000000033 2.7004397181 2.4724877715 2.1391420191 2.1309308698 1.8032270364 1.8757800631 1.7527485914
ENSRNOG00000000034 1.4541758932 1.3617683594 0.9963887464 0.7136958148 0.8718436485 0.6690267655 0.516015147
ENSRNOG00000000040 4.9420452599 5.0565835284 5.3527938294 4.8639384504 4.0891591319 4.2742616613 3.1731274335
ENSRNOG00000000041 2.6194130106 3.2637856139 3.4489009511 3.2032011563 3.7015490569 3.5410191531 3.0976107966
ENSRNOG00000000042 4.1263947376 4.6284819944 3.9731520379 3.014355293 3.0018022426 2.8972404256 2.5285713189
ENSRNOG00000000043 5.1051751923 5.7436226761 6.3211163506 6.5046203924 6.6071823374 6.2467880938 5.8371863852
ENSRNOG00000000044 3.2854022189 4.0465783666 4.1513717763 3.9250499647 4.5316933609 4.2727697324 3.7980505148
ENSRNOG00000000047 2.5248159284 1.8933622108 1.5210507369 1.0908534305 1.6229303509 1.9523335664 2.0976107966
ENSRNOG00000000048 3.5722833667 3.8569856898 3.8841094514 3.7202784652 4.2311251579 3.8399595875 3.6028844087
ENSRNOG00000000054 2.0823619696 2.6241008946 2.5058909297 1.3729520979 0.748461233 0.9927684308 0.8073549221
ENSRNOG00000000062 3.846994687 4.0609120496 4.1647058402 3.6644828404 3.6496154591 3.2957230245 3.1602748314
ENSRNOG00000000064 4.971543554 4.9993235782 5.1185258489 4.194559886 3.8639384504 4.2883585622 4.0531113365
ENSRNOG00000000066 3.2809563138 4.0413306068 4.0759604132 3.5422580498 3.7495342677 2.9411063109 2.6040713237
ENSRNOG00000000068 3.2986583156 3.5204222485 3.7436226761 3.3132458518 3.6427015718 3.4019034716 3.166715445
ENSRNOG00000000070 1.5235619561 2.266036894 2.2433644257 1.6229303509 2.1009776477 2.2630344058 1.9107326619
ENSRNOG00000000073 2.6780719051 2.9269482479 1.8559896973 1.3950627995 2.0426443374 2.266036894 1.9297909977
ENSRNOG00000000075 2.8559896973 2.9392265777 2.7235585615 2.2448870591 1.5109619193 1.8718436485 1.7092906357
ENSRNOG00000000081 4.8609627979 5.1501534552 5.7869883453 5.7993463875 5.6383635059 4.5478199566 4.2764966656
ENSRNOG00000000082 4.0018022426 4.1787146412 4.2067213574 3.5285713189 3.8063240574 4.0626398283 3.2913088598
ENSRNOG00000000091 0.7697717392 1.0036022367 0.867896464 0.5459683691 1.4541758932 1.8032270364 1.7311832416
ENSRNOG00000000095 3.5410191531 3.5348086612 3.9527994779 3.408711861 3.6028844087 3.0992952043 2.8011586561
ENSRNOG00000000096 1.4568061492 1.5655971759 1.6135316529 1.7527485914 1.4594316186 1.8559896973 1.673556424
ENSRNOG00000000098 2.414135533 3.5122268865 3.5147534984 3.3015876466 4.0755326312 3.8747969659 3.187451054
ENSRNOG00000000104 2.7125957804 2.5969351424 2.5459683691 1.3219280949 1.5849625007 1.6088092427 1.3161457423
ENSRNOG00000000105 1.6016965165 1.3015876466 1.1890338244 1.516015147 0.7570232465 0.6870606883 0.6040713237
ENSRNOG00000000108 3.2854022189 3.6976626335 3.8865501473 2.6369145804 2.6040713237 2.3923174228 1.8953026213
ENSRNOG00000000111 1.6229303509 2.09592442 2.0772429989 1.7782085764 1.673556424 0.9927684308 1.2570106182
ENSRNOG00000000112 2.2078928516 2.1826922975 2.4249220882 2.0250287945 2.1110313124 2.0635029423 1.8953026213
ENSRNOG00000000121 1.9202933002 2.5273206079 2.5741015081 2.2265085298 2.582556003 2.5753123307 2.1984941536
ENSRNOG00000000122 4.1255684518 4.4299506574 4.5071603491 4.2637856139 4.34269696 3.5849625007 3.9040023163
ENSRNOG00000000123 1.7070829918 1.9616233283 2.1127001327 1.4222330007 1.9221978484 1.9708536543 1.5801454844
ENSRNOG00000000127 2.3881895372 3.0347439493 2.9981955032 3.2295879227 4.0435194937 3.7729413378 3.2957230245
ENSRNOG00000000129 2.3074285252 2.979110755 3.1992797213 2.2203299549 3.6322682155 3.8982083525 3.5801454844
ENSRNOG00000000130 4.1622906135 4.7150696794 4.8733210629 3.9772799235 4.5849625007 4.9236246114 4.7739963251
ENSRNOG00000000133 3.2000648615 3.1168637577 3.1787146412 2.9579145986 2.7928553524 2.6780719051 2.2078928516
ENSRNOG00000000138 0.516015147 0.5993177937 1.0356239097 1.5849625007 2.2326607568 1.9745293125 2.0285691522
ENSRNOG00000000142 2.9278964537 2.3291235963 0.9671686075 1.4168397419 0.7048719645 1.9927684308 1.7224660245
ENSRNOG00000000145 3.2164548651 3.5490530293 3.4195388915 2.8797057663 2.3362833879 2.5849625007 2.6937657122
ENSRNOG00000000150 2.6380738372 2.9708536543 3.014355293 2.6870606883 2.6158870739 2.3161457423 2.4329594073
ENSRNOG00000000151 2.7125957804 3.5484366247 3.8354188405 4.5447326559 5.6959938131 5.3077927961 5.1941658685
ENSRNOG00000000155 3.0565835284 3.9354597478 3.6803243568 3.5134907456 3.8032270364 3.8865501473 3.2494453411
ENSRNOG00000000156 3.34269696 3.2772408983 1.7761039881 1.1505596766 0.5360529002 0.2750070475 0.3334237337
ENSRNOG00000000157 1.9164766444 2.1424134379 2.054848477 1.9145645235 2.2448870591 2.3305584 1.6599245584
ENSRNOG00000000161 1.7202784652 2.0772429989 1.9945797242 1.4541758932 1.7655347464 2.1602748314 1.8757800631
ENSRNOG00000000164 3.6616356023 4.2596491206 4.0635029423 3.2494453411 3.2418401836 3.1618876824 2.2295879227
ENSRNOG00000000165 1.3504972471 1.6158870739 0.9373443922 0.4541758932 0.7311832416 4.6392321632 4.5403993056
ENSRNOG00000000166 3.3441183345 3.3603642765 3.2494453411 1.9597701552 2.2357270598 3.1456774552 2.8698714062
Here is what I am doing:
library(gplots)  # provides heatmap.2, used below
d=read.table("FPKM.1.SelectedSamples.txt", header=T, sep="\t", row.names=1)
dm=data.matrix(d)
log10.matrix <- log10(dm+1)
Z.log10.A.matrix <- t(scale(t(log10.matrix[idx,])))
tmp <- Z.log10.A.matrix[which(is.finite(Z.log10.A.matrix[,1])),]
length(which(!is.finite(tmp)))
fin.Z.log10.A.matrix <- tmp
set.seed(1)
km9.fin.Z.log.A.matrix <- kmeans(fin.Z.log10.A.matrix, 2, iter.max=40, nstart=10)
rowOrder <- names(sort(km9.fin.Z.log.A.matrix$cluster))
colorVector <- c("grey","purple")
clusterColors <- colorVector[ sort(km9.fin.Z.log.A.matrix$cluster)]
heatmap.2(fin.Z.log10.A.matrix[rowOrder,],trace="none",labRow=F,labCol=colnames(km9.fin.Z.log.A.matrix),col=hmcol,RowSideColors=clusterColors,Rowv=F,Colv=F,dendrogram="column",na.rm=T,main="Gene Expression")
These commands give me a nice heatmap with two clusters.
Now, how can I extract the members of these clusters?
Thanks in advance.
Posted on 2017-03-08 19:47:18
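After running the k-means algorithm with the following command:
km9.fin.Z.log.A.matrix <- kmeans(fin.Z.log10.A.matrix, 2, iter.max=40, nstart=10)
you can get the cluster assignments from km9.fin.Z.log.A.matrix$cluster. It is an integer vector with one entry per row of the input matrix (here, one per gene), and each entry is the number of the cluster that row was assigned to. When the matrix has row names, the vector is named with them.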
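To turn those assignments into member lists, something along these lines should work. This is only a sketch on made-up data: `toy`, `gene1` to `gene6`, `members`, `cluster1` and `assignments` are hypothetical names used for illustration, and `toy` merely stands in for `fin.Z.log10.A.matrix`.

```r
# Toy stand-in for fin.Z.log10.A.matrix: 6 hypothetical genes x 4 samples,
# built as two well-separated groups of 3 rows each.
set.seed(1)
toy <- rbind(
  matrix(rnorm(12, mean = -2, sd = 0.1), nrow = 3),
  matrix(rnorm(12, mean =  2, sd = 0.1), nrow = 3)
)
rownames(toy) <- paste0("gene", 1:6)
colnames(toy) <- paste0("S", 1:4)

km <- kmeans(toy, centers = 2, iter.max = 40, nstart = 10)

# km$cluster is a named integer vector: names are the row names of the
# input matrix, values are cluster numbers.
# List with one character vector of gene IDs per cluster:
members <- split(names(km$cluster), km$cluster)

# Gene IDs of a single cluster (here cluster 1):
cluster1 <- names(km$cluster)[km$cluster == 1]

# Two-column table of gene ID and cluster, convenient for write.table():
assignments <- data.frame(geneid  = names(km$cluster),
                          cluster = unname(km$cluster))
```

With the objects from the question, the same idea would be `split(names(km9.fin.Z.log.A.matrix$cluster), km9.fin.Z.log.A.matrix$cluster)`. Note that the cluster numbers themselves are arbitrary labels, so which group ends up as 1 or 2 can differ between runs with different seeds.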
https://stackoverflow.com/questions/42669623