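Clustering data and then displaying the result as a clustered heatmap is a common approach in omics data analysis in bioinformatics. R offers many functions for this, such as heatmap and kmeans; another widely used one is heatmap.2 from the gplots package. I recently came across a set of notes online titled "Learning the heatmap.2 function step by step" and am sharing it here. The original author is unknown, so no source is credited for now; if you are the author, please come forward and claim it. O(∩_∩)O haha~
The data are set up as follows: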
- library(gplots)
- data(mtcars)
- x <- as.matrix(mtcars)
- rc <- rainbow(nrow(x), start=0, end=.3)
- cc <- rainbow(ncol(x), start=0, end=.3)
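x is a matrix holding the data we want to draw as a heatmap.
rc is a palette of 32 gradually changing colors, one per row of x.
cc is likewise a palette, with 11 gradually changing colors, one per column of x.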
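As a quick check of the three objects just described, the sketch below uses only base R (no gplots) to confirm the matrix dimensions and the palette lengths. It adds nothing beyond the setup code above; it only inspects it.

```r
# Base R only: mtcars ships with the datasets package, rainbow() with grDevices.
x  <- as.matrix(mtcars)
rc <- rainbow(nrow(x), start = 0, end = 0.3)
cc <- rainbow(ncol(x), start = 0, end = 0.3)

dim(x)        # rows (cars) and columns (measurements)
length(rc)    # one color per row
length(cc)    # one color per column
head(rc, 3)   # the colors are stored as hex strings
```

Palettes like these are typically passed to heatmap.2 through its RowSideColors and ColSideColors arguments to draw a color strip beside the rows and columns.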
First, draw a plot with the default settings:
- heatmap.2(x)
Next, the dendrograms can be removed from the plot by setting the dendrogram argument:
- heatmap.2(x, dendrogram="none")
Then choose which dendrogram is shown:
- heatmap.2(x, dendrogram="row") # show only the row dendrogram
- heatmap.2(x, dendrogram="col") # show only the column dendrogram
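The following calls also adjust the dendrograms, though at first I could not see how they differ from the arguments above. The difference is that Rowv and Colv control whether the rows and columns are reordered, while dendrogram only controls which trees are drawn. Requesting a tree for a dimension that was not reordered is what produces the warnings noted in the comments.
- heatmap.2(x, keysize=2) ## default - dendrogram plotted and reordering done.
- heatmap.2(x, Rowv=FALSE, dendrogram="both") ## generate warning!
- heatmap.2(x, Rowv=NULL, dendrogram="both") ## generate warning!
- heatmap.2(x, Colv=FALSE, dendrogram="both") ## generate warning!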
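To make the reordering step concrete, here is a minimal base R sketch of roughly what happens to the rows when reordering is left on. It assumes the documented defaults of heatmap.2 (dist for distances, hclust for clustering, branches rotated by row means) and is an approximation for illustration, not the package's own code.

```r
x <- as.matrix(mtcars)

# Cluster the rows: Euclidean distances, complete linkage (hclust's default).
row_hc   <- hclust(dist(x))
row_dend <- as.dendrogram(row_hc)

# Rotate branches by row means; this changes the leaf order, not the tree itself.
row_dend <- reorder(row_dend, rowMeans(x))

# The row order a reordered heatmap would use.
row_order <- order.dendrogram(row_dend)
rownames(x)[row_order][1:5]

# Without reordering, the rows simply stay in their original order,
# and there is no tree that matches that order.
no_reorder <- seq_len(nrow(x))
```

Seen this way, Rowv=FALSE with dendrogram="both" asks for a row tree that was never computed, which is why heatmap.2 warns and omits it.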
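Next, we can adjust how the row and column labels are drawn. In the examples here, srtCol/srtRow set the rotation angle in degrees and adjCol/adjRow set the justification.
Start with the column labels, i.e. the labels along the x-axis: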
- heatmap.2(x, srtCol=NULL)
- heatmap.2(x, srtCol=0, adjCol = c(0.5,1) )
- heatmap.2(x, srtCol=45, adjCol = c(1,1) )
- heatmap.2(x, srtCol=135, adjCol = c(1,0) )
- heatmap.2(x, srtCol=180, adjCol = c(0.5,0) )
- heatmap.2(x, srtCol=225, adjCol = c(0,0) ) ## not very useful
- heatmap.2(x, srtCol=270, adjCol = c(0,0.5) )
- heatmap.2(x, srtCol=315, adjCol = c(0,1) )
- heatmap.2(x, srtCol=360, adjCol = c(0.5,1) )
Then adjust the row labels, i.e. the labels along the y-axis:
- heatmap.2(x, srtRow=45, adjRow=c(0, 1) )
- heatmap.2(x, srtRow=45, adjRow=c(0, 1), srtCol=45, adjCol=c(1,1) )
- heatmap.2(x, srtRow=45, adjRow=c(0, 1), srtCol=270, adjCol=c(0,0.5) )
Setting offsetRow/offsetCol moves the labels away from the heatmap:
- ## Show effect of offsetRow/offsetCol (only works when srtRow/srtCol is
- ## not also present)
- heatmap.2(x, offsetRow=0, offsetCol=0)
- heatmap.2(x, offsetRow=1, offsetCol=1)
- heatmap.2(x, offsetRow=2, offsetCol=2)
- heatmap.2(x, offsetRow=-1, offsetCol=-1)
- heatmap.2(x, srtRow=0, srtCol=90, offsetRow=0, offsetCol=0)
- heatmap.2(x, srtRow=0, srtCol=90, offsetRow=1, offsetCol=1)
- heatmap.2(x, srtRow=0, srtCol=90, offsetRow=2, offsetCol=2)
- heatmap.2(x, srtRow=0, srtCol=90, offsetRow=-1, offsetCol=-1)
- ## Show effect of z-score scaling within columns, blue-red color scale
- hv <- heatmap.2(x, col=bluered, scale="column", tracecol="#303030")
hv holds the object that heatmap.2 returns, a list describing the plotted heatmap.
- > names(hv) # hv contains many components
- > "rowInd" "colInd" "call" "colMeans" "colSDs" "carpet" "rowDendrogram" "colDendrogram" "breaks" "col" "vline" "colorTable"
- ## Show the mapping of z-score values to color bins
- hv$colorTable
[The rest of this example and the start of the loop over clustering methods were lost in extraction. The surviving fragments select the rows of hv$colorTable whose color is "#FFFFFF" into whiteBin and map them back to the original data scale, e.g. whiteBin[2] * hv$colSDs + hv$colMeans. The loop resumes mid-call below.]
- …Type)],
- xlab='CellLines',
- ylab='Probes',
- main=Cluster_Method[i],
- col=greenred(64))
- dev.off()
- }
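This applies all seven clustering methods to the heatmap in a single pass. By comparing the resulting dendrograms, we can pick the best and most stable clustering method, which lays the groundwork for later analysis.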
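Comparing seven dendrograms by eye can be hard, so a numeric summary may help. The sketch below is a swapped-in illustration, not the original author's method: it loops over seven linkage methods accepted by hclust and computes the cophenetic correlation for each, which measures how well a tree preserves the original distances (fidelity, not stability). The vector cluster_methods and the use of mtcars are assumptions, since the original loop's data and method list did not survive.

```r
# Seven linkage names accepted by stats::hclust (assumed; the original vector is not shown).
cluster_methods <- c("ward.D", "single", "complete", "average",
                     "mcquitty", "median", "centroid")

# Distances between the rows of the column-wise z-scored data.
d <- dist(scale(as.matrix(mtcars)))

# Cophenetic correlation: agreement between tree-implied and original distances.
coph <- sapply(cluster_methods, function(m) {
  hc <- hclust(d, method = m)
  cor(as.vector(d), as.vector(cophenetic(hc)))
})

# Methods listed from highest to lowest correlation.
round(sort(coph, decreasing = TRUE), 3)
```

To draw the heatmap with a chosen method m, heatmap.2 accepts a clustering function through its hclustfun argument, for example hclustfun = function(d) hclust(d, method = m).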
Now cluster the following data:
- require(graphics)
- hc <- hclust(dist(USArrests), "ave")
- plot(hc)
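First, dist() is applied to the data frame to obtain a dist object.
A dist object is a special structure: it stores the pairwise distances between rows and is the input that hclust expects when building the dendrogram.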
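The short sketch below inspects the dist object and the resulting tree using base R only. The choice of k = 4 in cutree is purely illustrative and does not come from the original notes.

```r
d <- dist(USArrests)     # pairwise Euclidean distances between the 50 states
class(d)                 # a "dist" object, not a full matrix
length(d)                # only the lower triangle is stored: 50 * 49 / 2 values

# "ave" in the call above is partially matched to "average".
hc <- hclust(d, method = "average")

# Cut the tree into groups (k = 4 is an arbitrary choice for illustration).
groups <- cutree(hc, k = 4)
table(groups)
```

as.matrix(d) converts the dist object into a full symmetric matrix when individual distances need to be looked up by name.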