This lesson demonstrates how to use ggtree, an extension of the ggplot2 package to visualize phylogenetic trees.
ggtree
packageAfter the tree was loaded into R, we can use ggtree to visualize the tree. ggtree supports tree imported by generated by most of the other R packages, including phylo
and multiPhylo
objects defined in ape, phylo4
and phylo4d
objects defined in phylobase, phyloseq
object defined in phyloseq, obkdata
object defined in OutbreakTools and treedata
object defined in treeio.
Let’s load the library and then import the tree using read.tree
.
library(ggtree)
tree <- read.tree("data/tree.nwk")
tree
##
## Phylogenetic tree with 28 tips and 26 internal nodes.
##
## Tip labels:
## Phy000G05U_EMENI, Phy000GDP6_ASPNG, Phy003AMS0_602072, Phy000FJDH_ASPFL, Phy000FCLK_ASPCL, Phy000FQ5O_ASPFU, ...
## Node labels:
## , 0.99985, 0.99985, 0.72129, 0.991353, 0.99985, ...
##
## Unrooted; includes branch lengths.
Displaying the tree structure is easy, just use ggtree
function. The ggtree()
function is essentially a shortcut for ggplot() + geom_tree() + theme_tree()
.
# ggplot(tree) + geom_tree() + theme_tree()
ggtree(tree)
There’s also the treescale geom, which adds a scale bar, or alternatively, you can change the default ggtree()
theme to theme_tree2()
, which adds a scale on the x-axis. The horizontal dimension in this plot shows the amount of genetic change, and the branches and represent evolutionary lineages changing over time. The longer the branch in the horizonal dimension, the larger the amount of change, and the scale tells you this. The units of branch length are usually nucleotide substitutions per site – that is, the number of changes or substitutions divided by the length of the sequence (alternatively, it could represent the percent change, i.e., the number of changes per 100 bases).
# add a scale
ggtree(tree) + geom_treescale()
# or add the entire scale to the x axis with theme_tree2()
ggtree(tree) + theme_tree2()
The default is to plot a phylogram, where the x-axis shows the genetic change / evolutionary distance. If you want to disable scaling and produce a cladogram instead, set the branch.length="none"
option inside the ggtree()
call. See ?ggtree
for more.
ggtree(tree, branch.length="none")
We can change the line type, size, and color or mapping numerical/categorical variables to scale these visual characteristics (we’ll get to that later).
ggtree(tree, linetype='dashed', color='purple', size=2) + theme_tree('black')
To display taxa label, we can add a layer using + geom_tiplab()
(xlim()
here is to allocate more space, otherwise the labels may be truncated).
ggtree(tree) + geom_tiplab() + xlim(0, 5)
tree
by coloring branches using branch lengthes.+geom_nodelab()
ggtree supports several layouts, including:
for phylogram (by default) and cladogram if user explicitly setting branch.length='none'
or missing branch length in the tree, time-scaled, two-dimensional and unrooted tree layouts are also supported.
## circular layout
ggtree(tree, layout = 'circular')
## unrooted layout using 'daylight' algorithm
ggtree(tree, layout='daylight')