Visualization

This tutorial will allow you to practice making heat maps.

Heat map

First, let's focus on a small subset of variables: the Protein, Carbohydrate, Fat and Fiber content of "nutrimenu" recipes.

I created a compo table for you using the following command. Take a few moments to break down this command and understand how it works.

compo <- data.frame(
  nutrimenu[, c("Proteines", "Glucides", "Lipides", "Fibres")],
  row.names = nutrimenu$ID)
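
To break the command down, it can help to run it step by step on a tiny stand-in table. The sketch below assumes a hypothetical three-recipe data frame called toy (not the real nutrimenu data) with the same column names; all values are made up for illustration.

```r
# Hypothetical stand-in for nutrimenu: three made-up recipes
toy <- data.frame(
  ID = c("r1", "r2", "r3"),
  Type = c("starter", "main", "dessert"),
  Proteines = c(5, 20, 3),
  Glucides = c(10, 30, 40),
  Lipides = c(2, 15, 12),
  Fibres = c(3, 4, 1)
)

# Step 1: keep all rows, but only the four nutrient columns
nutrients <- toy[, c("Proteines", "Glucides", "Lipides", "Fibres")]

# Step 2: rebuild a data frame whose row names are the recipe IDs
toy_compo <- data.frame(nutrients, row.names = toy$ID)

dim(toy_compo)       # number of rows, then number of columns
rownames(toy_compo)  # the recipe IDs now label the rows
```

Using the IDs as row names is what later lets the heatmap label each row and match it with other tables.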

Now use the pheatmap function from the pheatmap package to make a heatmap of the compo table.

library(pheatmap)
pheatmap(compo)

Dendrograms

The heatmap itself consists of a rectangular image made up of cells colored according to a color scale shown on the right. This scale very often ranges from blue or white for small values to red for large values. The map is flanked on the side and at the top by two tree-shaped figures called dendrograms. They are the result of a method known as agglomerative hierarchical clustering (in French, classification ascendante hiérarchique, or CAH), which groups the rows and/or columns according to their similarity.
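
To make the idea of a dendrogram more concrete, here is a minimal sketch using base R only. It assumes a small made-up matrix m and uses the Euclidean distance with complete linkage, which to my understanding are also the defaults used by pheatmap.

```r
# Made-up values for four hypothetical recipes (rows) and two variables
m <- matrix(c(1,   2,
              1.5, 2.5,
              10,  12,
              11,  13),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), c("x", "y")))

# Pairwise Euclidean distances between the rows
d <- dist(m, method = "euclidean")

# Agglomerative clustering: repeatedly merge the two closest groups
hc <- hclust(d, method = "complete")

# Draw the tree: similar rows are joined low in the tree
plot(hc)

# Cut the tree into two groups and see which row falls in which group
groups <- cutree(hc, k = 2)
groups
```

Rows a and b are close to each other, as are c and d, so each pair ends up in its own branch of the tree.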

Remove the dendrograms with the cluster_rows and cluster_cols arguments.

pheatmap(compo, cluster_rows = FALSE, cluster_cols = FALSE)

Standardization

To represent variables or individuals whose values differ widely, it is important to transform the data so that everything is on a comparable scale. This is the purpose of the scale argument, whose default value is "none", meaning that no transformation is applied. The other possible values are "row" and "column".
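
As a rough illustration of what this standardization does, the sketch below centers and reduces a small made-up matrix with the base R scale function. It is an assumption on my part that this matches what pheatmap computes internally for scale = "column" and scale = "row"; the matrix m is hypothetical.

```r
# Made-up matrix: three hypothetical recipes, three variables on very
# different scales
m <- matrix(c(1, 100, 0.1,
              2, 300, 0.5,
              6, 200, 0.3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), c("A", "B", "C")))

# Column-wise: each column is centered (mean 0) and reduced (sd 1)
by_col <- scale(m)

# Row-wise: transpose, scale the columns, transpose back
by_row <- t(scale(t(m)))

round(colMeans(by_col), 10)  # column means after column scaling
apply(by_col, 2, sd)         # column standard deviations
round(rowMeans(by_row), 10)  # row means after row scaling
```

With column scaling, variables measured in different units become comparable; with row scaling, it is the profiles of the individuals that become comparable.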

Modify the scale argument of the following command and observe the result. Remember that you can consult the function's help page (?pheatmap) for more details on this argument, or use autocompletion!

pheatmap(compo, cluster_rows = FALSE,
         cluster_cols = FALSE, show_rownames = FALSE,
         border_color = "black", cellwidth = 15,
         scale = "row")

Colors

As R is widely used for making graphical representations, there are a lot of ways to work with colors in graphs. We are going to see one way to modify the colors of a heat map, by working on its palette, with the excellent RColorBrewer package.

Caution: RColorBrewer palettes contain only a handful of colors, chosen so that any two of them are easy to tell apart. To obtain a continuous color scale for a heatmap, we must therefore combine its functions with the colorRampPalette interpolation function.

Below, I used the display.brewer.all command, which displays all the color palettes available in RColorBrewer.

library(RColorBrewer)
display.brewer.all()

There are three types of palettes:

"sequential" palettes (at the top), which are suited to scales of values going from a minimum (associated with a light tone) to a maximum (associated with a dark tone),
"qualitative" palettes (in the middle), whose colors are all easy to tell apart from one another,
"diverging" palettes (at the bottom), which are suited to scales of values going from a minimum (dark) to a maximum (dark) by way of an intermediate value (light).

We are going to use the colorRampPalette function to turn the sequential YlOrRd palette (which goes from yellow to red and contains at most 9 colors) into a palette of 50 colors.

Test the code below which will generate the new palette.

mycolz <- brewer.pal(9, "YlOrRd")
colzRamp <- colorRampPalette(mycolz)(50)
barplot(rep(1, 50), col = colzRamp, border = NA, axes = FALSE)
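
To see what the interpolation actually returns, here is a small sketch that does not need RColorBrewer. The two endpoint colors are typed by hand and are only an assumption chosen for illustration; the names endpoints, make_palette and pal are hypothetical.

```r
# Two endpoint colors typed by hand (a light yellow and a dark red)
endpoints <- c("#FFFFCC", "#800026")

# colorRampPalette() returns a function, not a vector of colors...
make_palette <- colorRampPalette(endpoints)

# ...and calling that function with n returns n interpolated colors
pal <- make_palette(5)

pal          # a character vector of hex color codes
length(pal)  # as many colors as requested
```

This is why the tutorial code ends with (50): the first pair of parentheses builds the palette function, the second asks it for 50 colors.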

Complete the following code to use this new palette on the compo data!

mycolz <- brewer.pal(9, "YlOrRd")
colzRamp <- colorRampPalette(mycolz)(50)

pheatmap(compo)

pheatmap(compo, cluster_rows = FALSE,
         cluster_cols = FALSE, show_rownames = FALSE,
         border_color = "black", cellwidth = 15,
         scale = "column", color = colzRamp)

Add information

The heatmap is used to represent quantitative data, but very often we also have qualitative data that describe the rows or columns of our table!

We are going to practice on a very simple case: representing the type of recipe and the nutriscore.

For this, I created a side table which has the same row names as the compo table and which contains a Type column and a Nutriscore column.

annot <- data.frame(
  Type = nutrimenu$Type, 
  Nutriscore = nutrimenu$Nutriscore,
  row.names = nutrimenu$ID)
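
Because the annotation table is linked to the data through its row names, it can be worth checking that the two tables agree before plotting. The sketch below assumes two small made-up tables, toy_compo and toy_annot, standing in for compo and annot.

```r
# Hypothetical stand-ins for compo and annot (values are made up)
toy_compo <- data.frame(Proteines = c(5, 20, 3),
                        row.names = c("r1", "r2", "r3"))
toy_annot <- data.frame(Type = c("dessert", "starter", "main"),
                        Nutriscore = c("D", "A", "C"),
                        row.names = c("r3", "r1", "r2"))

# Every row of the data should have a matching annotation row
all(rownames(toy_compo) %in% rownames(toy_annot))

# Optionally reorder the annotation so both tables are in the same order
toy_annot <- toy_annot[rownames(toy_compo), , drop = FALSE]
identical(rownames(toy_compo), rownames(toy_annot))
```

If the first check returns FALSE, some rows of the heatmap would have no annotation to display.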

The pheatmap function then takes this table as an argument and adds the information as one or more colored strips, with an additional legend. Modify the following command to make a nice graph!

pheatmap(compo, annotation_row = annot)

mycolz <- RColorBrewer::brewer.pal(9, "YlOrRd")
colzRamp <- colorRampPalette(mycolz)(50)

pheatmap(compo, cluster_rows = TRUE, 
         cluster_cols = FALSE, show_rownames = FALSE,
         cellwidth = 10, border_color = "black",
         scale = "column", color = colzRamp,
         annotation_row = annot)

Agglomerative hierarchical clustering

Change the method used for the agglomerative hierarchical clustering (of the rows) by modifying the value of the following arguments:

clustering_distance_rows: choose from "euclidean", "correlation", "maximum", "manhattan", "canberra", "binary" or "minkowski".
clustering_method: choose from "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median" or "centroid".

pheatmap(compo, 
         cluster_rows = TRUE, 
         cluster_cols = FALSE, 
         show_rownames = FALSE,
         cellwidth = 10, 
         border_color = "black",
         scale = "column")

pheatmap(compo, cluster_rows = TRUE, 
         cluster_cols = FALSE, show_rownames = FALSE,
         cellwidth = 10, border_color = "black",
         scale = "column", 
         clustering_distance_rows = "correlation",
         clustering_method = "ward.D2")
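
To get a feel for what these two arguments control, here is a sketch in base R on a small made-up matrix. It assumes that the "correlation" option corresponds to a distance of one minus the Pearson correlation between rows, which is my understanding of how pheatmap computes it; the matrix m is hypothetical.

```r
# Made-up matrix: four hypothetical recipes, three variables
m <- matrix(c(1, 2, 3,
              2, 4, 6,
              3, 2, 1,
              6, 4, 2),
            ncol = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), NULL))

# Correlation-based distance between rows: 0 for perfectly correlated
# profiles, 2 for perfectly anti-correlated ones
d_cor <- as.dist(1 - cor(t(m)))

# Euclidean distance, for comparison
d_euc <- dist(m, method = "euclidean")

# Same linkage criterion as in the command above
hc <- hclust(d_cor, method = "ward.D2")
groups <- cutree(hc, k = 2)
groups
```

Here rows a and b have the same increasing profile and c and d the same decreasing one, so the correlation distance groups them by shape, whereas the Euclidean distance puts a closer to c than to b.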