This tutorial will allow you to practice making heat maps.
First, let's focus on a small subset of variables: the Protein, Carbohydrate, Fat and Fiber content of "nutrimenu" recipes.
I created a compo
table for you using the following command. Take a few moments to break down this command and understand how it works.
compo <- data.frame(
nutrimenu[, c("Proteines", "Glucides", "Lipides", "Fibres")],
row.names = nutrimenu$ID)
Now use the pheatmap
function from thepheatmap
package to make a heatmap of the compo
table.
library(pheatmap)
pheatmap(compo)
The heatmap itself consists of a rectangular image made up of cells colored according to a color scale shown on the right. This scale is very often a scale ranging from blue or white for small values to red for large values. The map is embellished on the side and at the top with two tree-shaped figures called dendrograms. They are the result of a method known as ascending hierarchical classification (CAH) which groups the rows and / or columns according to their similarity.
Remove the dendrograms with the cluster_rows
andcluster_rows
arguments.
pheatmap(compo, cluster_rows = FALSE, cluster_cols = FALSE)
To represent very different variables or individuals, it is important to be able to transform the data to place it in a similar repository. This is the point of the scale
argument, whose value is none
by default, i.e. no transformation is applied.
Modify the scale
argument of the following command and observe the result. Don't forget that you can either look at the function help for more details on this argument (?pheatmap
), or you can use the automatic completion!
pheatmap(compo, cluster_rows = FALSE,
cluster_cols = FALSE, show_rownames = FALSE,
border_color = "black", cellwidth = 15,
scale = "row")
As R is widely used for making graphical representations, there are a lot of ways to work with colors in graphs. We are going to see one way to modify the colors of a heat map, by working on its palette, with the excellent RColorBrewer
package.
Caution, the
RColorBrewer
package is suitable for palettes for which we will be able to distinguish two different colors very well, this is why we must combine its functions with thecolorRampPalette
interpolation function.
I used below the command display.brewer.all
which allows to represent all the color palettes available inRColorBrewer
.
library(RColorBrewer)
display.brewer.all()
There are three types of palettes:
We are going to transform the sequential YlOrRd
palette (which therefore goes from yellow to red) which contains only 9 colors into a palette containing 50 colors thanks to the colorRampPalette
function.
Test the code below which will generate the new palette.
mycolz <- brewer.pal(9, "YlOrRd")
colzRamp <- colorRampPalette(mycolz)(50)
barplot(rep(1, 50), col = colzRamp, border = NA, axes = F)
Complete the following code to use this new palette on the compo
data!
mycolz <- brewer.pal(9, "YlOrRd")
colzRamp <- colorRampPalette(mycolz)(50)
pheatmap(compo)
pheatmap(compo, cluster_rows = FALSE,
cluster_cols = FALSE, show_rownames = FALSE,
border_color = "black", cellwidth = 15,
scale = "column", color = colzRamp)
The heatmap is used to represent quantitative data, but very often we have in addition to this data qualitative data that describes the rows or columns of our data!
We are going to practice on a very simple case: representing the type of recipe and the nutriscore.
For this, I created a side table which has the same row names as the compo
table and which contains a Type
column and a Nutriscore
column.
annot <- data.frame(
Type = nutrimenu$Type,
Nutriscore = nutrimenu$Nutriscore,
row.names = nutrimenu$ID)
The pheatmap
function will then take as argument this table and add the information in the form of one or more lines of colors, with an additional legend. Modify the following command to make a nice graph!
pheatmap(compo, annotation_row = annot)
mycolz <- RColorBrewer::brewer.pal(9, "YlOrRd")
colzRamp <- colorRampPalette(mycolz)(50)
pheatmap(compo, cluster_rows = TRUE,
cluster_cols = FALSE, show_rownames = FALSE,
cellwidth = 10, border_color = "black",
scale = "column", color = colzRamp,
annotation_row = annot)
Change the method performing the hierarchical ascending classification (of rows) by modifying the value of the arguments:
clustering_distance_rows
: choose from "euclidean"
, "correlation"
, "maximum"
, "manhattan"
, "canberra
", "binary"
or "minkowski"
.clustering_method
: to choose from "ward.D"
, "ward.D2"
, "single"
, "complete"
, "average"
, "mcquitty"
, "median"
or "centroid"
.pheatmap(compo,
cluster_rows = TRUE,
cluster_cols = FALSE,
show_rownames = FALSE,
cellwidth = 10,
border_color = "black",
scale = "column")
pheatmap(compo, cluster_rows = TRUE,
cluster_cols = FALSE, show_rownames = FALSE,
cellwidth = 10, border_color = "black",
scale = "column",
clustering_distance_rows = "correlation",
clustering_method = "ward.D2")