Visualization

pixel art of ggplot2 faceting using geofacet

I just discovered an interesting ggplot2 extension, geofacet, which arranges facet panels to mimic geographic topology.

After playing with it, I realized that it is useful not only for visualizing geo-related data, but can also be fun for presenting data as pixel art.
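A minimal sketch of the idea, assuming geofacet's custom grid format (a data frame with `row`, `col`, `code` and `name` columns) and its `facet_geo()` function. The heart shape, the `pixels` matrix and the random values are made up for illustration. Each filled cell of the matrix becomes one facet panel, so the panels together draw the picture.

```r
# Hypothetical 5 x 5 "pixel" picture: 1 = draw a panel here, 0 = leave empty.
pixels <- matrix(c(0, 1, 0, 1, 0,
                   1, 1, 1, 1, 1,
                   1, 1, 1, 1, 1,
                   0, 1, 1, 1, 0,
                   0, 0, 1, 0, 0),
                 nrow = 5, byrow = TRUE)

# Convert the filled cells into a grid data frame, one row per panel.
idx <- which(pixels == 1, arr.ind = TRUE)
grid <- data.frame(row = idx[, "row"],
                   col = idx[, "col"],
                   stringsAsFactors = FALSE)
grid$code <- paste0("p", seq_len(nrow(grid)))
grid$name <- grid$code

# Toy data: ten random points for every panel.
set.seed(42)
d <- data.frame(code = rep(grid$code, each = 10),
                x = rnorm(10 * nrow(grid)),
                y = rnorm(10 * nrow(grid)),
                stringsAsFactors = FALSE)

# Build the plot only when the (assumed) packages are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("geofacet", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x, y)) +
    geom_point() +
    geofacet::facet_geo(~ code, grid = grid)
}
```

Calling `print(p)` afterwards should draw the panels in the shape encoded by `pixels`; swapping in a different 0/1 matrix changes the picture without touching the plotting code.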

Phylomoji with ggtree and emojifont

With ggtree (Yu et al. 2017), it is very easy to create phylomoji. Emoji is internally supported by ggtree.

Use emoji in taxa labels

library(ggtree)
tree_text <- "(((((cow, (whale, dolphin)), (pig2, boar)), camel), fish), seedling);"
x <- read.tree(text=tree_text)
ggtree(x, linetype="dashed", color='firebrick') +
    xlim(NA, 7) + ylim(NA, 8.5) +
    geom_tiplab(aes(color=label), parse='emoji', size=14, vjust=0.25) +
    labs(title="phylomoji", caption="powered by ggtree + emojifont")

add layer to specific panel of facet_plot output

This is a question from the ggtree Google group:

Dear ggtree team,

how can I apply a geom_xxx to only one facet panel? For example if i want to get geom_hline(yintersect=1:30) or a geom_text() in the dot panel? I cant see the facet_grid(. ~ var) function call, so I don’t know which subsetting to use. I have already read http://stackoverflow.com/questions/29873155/geom-text-and-facets-not-working

  tr <- rtree(30)
  
  d1 <- data.frame(id=tr$tip.label, val=rnorm(30, sd=3))
  p <- ggtree(tr)
  
  p2 <- facet_plot(p, panel="dot", data=d1, geom=geom_point, aes(x=val), color='firebrick')
  d2 <- data.frame(id=tr$tip.label, value = abs(rnorm(30, mean=100, sd=50)))
  
  p3 <- facet_plot(p2, panel='bar', data=d2, geom=geom_segment, aes(x=0, xend=value, y=y, yend=y), size=3, color='steelblue') + theme_tree2()

Thanks! Andreas

If this can be done, we can create even more comprehensive tree plots.
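One possible way to do this, sketched under the assumption that `facet_plot()` facets on a column named `.panel`: give the new layer its own data frame whose `.panel` value names the target panel. The horizontal lines, the styling and the choice of the `dot` panel are only illustrative.

```r
# Data for the extra layer: one horizontal line per tip, tagged with the
# panel it should appear in (assumes the facet variable is `.panel`).
hl <- data.frame(y = 1:30, .panel = "dot", stringsAsFactors = FALSE)

# Build the plot only when the required packages are installed.
if (requireNamespace("ggtree", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ape", quietly = TRUE)) {
  library(ggtree)
  library(ggplot2)
  set.seed(2016)
  tr <- ape::rtree(30)
  d1 <- data.frame(id = tr$tip.label, val = rnorm(30, sd = 3))
  p2 <- facet_plot(ggtree(tr), panel = "dot", data = d1,
                   geom = geom_point, aes(x = val), color = "firebrick")
  # The layer is drawn only where `.panel` matches, i.e. in the dot panel.
  p4 <- p2 + geom_hline(data = hl, aes(yintercept = y),
                        color = "grey80", linetype = "dotted")
}
```

The same pattern should work for `geom_text()` or any other layer, as long as the layer's data carries a `.panel` value that matches one of the existing panel names.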

align genomic features with phylogenetic tree

A question on Biostars asked how to generate the following figure:

This would be quite easy to implement in ggtree: I could write a geom layer to lay out the alignment. However, ggbio already provides many geoms for genomic data and I don’t want to re-invent the wheel, so I decided to try ggtree+ggbio. This is also the beauty of R: packages complement each other.

showCategory parameter for visualizing compareCluster output

I am using dotplot() to visualize results from enrichGO(), enrichDO(), enricher() and compareCluster() in the clusterProfiler R package. When specifying showCategory, I get the right number of categories except with the results of compareCluster().

In my case, I use compareCluster() on a list of 3 elements:

str(ClusterList)
List of 3
 $ All : chr [1:1450] "89886" "29923" "100132891" "101410536" ...
 $ g1  : chr [1:858] "89886" "29923" "100132891" "101410536" ...
 $ g2: chr [1:592] "5325" "170691" "29953" "283392" ...
CompareGO_BP=compareCluster(ClusterList, fun="enrichGO", pvalueCutoff=0.01, pAdjustMethod="BH", OrgDb=org.Hs.eg.db,ont="BP",readable=T)

dotplot(CompareGO_BP, showCategory=10, title="GO - Biological Process")

I ask for 10 categories, but I get 15 categories in All, 8 categories in g1 and 12 categories in g2. None of the categories, neither the sum of the categories are 10…

Is the option showCategory working in the case of comparison? Am I missing something here?

And which categories precisely will it plot? the most significant whatever my 3 cases or the most significant of each case?

The question was posted on the Bioconductor support site. The behavior seems quite confusing, and I think I need to write a post to clarify it.
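A toy sketch in base R of one behavior that would produce such numbers, assuming dotplot() takes the top showCategory terms within each cluster and then shows every selected term in every cluster where it is enriched. The terms, p-values and the `show_category` variable are made up, and this is an illustration rather than clusterProfiler's actual code.

```r
# Made-up enrichment results: one row per (cluster, term) pair.
res <- data.frame(
  Cluster  = c("All", "All", "All", "g1", "g1", "g1", "g2", "g2", "g2"),
  Term     = c("A", "B", "D", "A", "C", "D", "B", "D", "E"),
  p.adjust = c(0.001, 0.002, 0.003, 0.001, 0.004, 0.002, 0.001, 0.005, 0.002),
  stringsAsFactors = FALSE
)
show_category <- 2  # hypothetical showCategory value

# Step 1: the most significant terms within each cluster.
top_terms <- lapply(split(res, res$Cluster), function(x) {
  head(x$Term[order(x$p.adjust)], show_category)
})

# Step 2: pool the selected terms and keep every row that mentions one.
selected <- unique(unlist(top_terms))
shown <- res[res$Term %in% selected, ]

# Number of dots per cluster; it need not equal show_category.
counts <- table(shown$Cluster)
print(counts)
```

Under this assumption, a term that ranks in the top of one cluster is also displayed for any other cluster in which it appears, so the per-cluster counts can exceed the requested number.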

xlim_tree: set x axis limits for only Tree panel

A ggtree user recently asked me the following question in the Google group:

I try to plot long tip labels in ggtree and usually adjust them using xlim(), however when creating a facet_plot xlim affects all plots and minimizes them.

Is it possible to work around this and only affect the tree and it’s tip labels leaving the other plots in facet_plot unaffected?

This is indeed a desired feature, as ggplot2 can’t automatically adjust xlim for text since the units are in two different spaces (data and pixel).

facet_plot: a general solution to associate data with phylogenetic tree

ggtree provides gheatmap for visualizing a heatmap and msaplot for visualizing a multiple sequence alignment alongside a phylogenetic tree.

We may have other types of data that we want to visualize and align with the tree, for example a dot plot of SNP sites (e.g. using geom_point(shape='|')) or a bar plot of trait values (e.g. using geom_barh(stat='identity')).

To make it easy to associate different types of data with a phylogenetic tree, I implemented the facet_plot function, which accepts a geom function to draw the input data.frame and displays it in an additional panel.

reproducible logo generated by ggtree

ggtree provides many helper functions for manipulating phylogenetic trees and makes it easy to explore tree structure visually.

Here, as examples, I used ggtree to draw the capital letters G and C, which are the initials of my name :-).

To draw a tree in such a shape, we need the fan layout (a circular layout with an open angle) and then rotate the tree so that the open space is in the correct position. Here is the source code to produce the G and C shaped trees. I am thinking about using the G shaped tree as the ggtree logo. Have fun with ggtree :-)

ggtree for outbreak data

OutbreakTools implements basic tools for the analysis of Disease Outbreaks.

It defines the S4 class obkData to store case-base outbreak data. It also provides a function, plotggphy, to visualize such data on a phylogenetic tree.

library(OutbreakTools)
data(FluH1N1pdm2009)
attach(FluH1N1pdm2009)


x <- new("obkData", individuals = individuals, dna = FluH1N1pdm2009$dna,
         dna.individualID = samples$individualID, dna.date = samples$date,
         trees = FluH1N1pdm2009$trees)

plotggphy(x, ladderize = TRUE, branch.unit = "year",
          tip.color = "location", tip.size = 3, tip.alpha = 0.75)

ggtree for microbiome data

ggtree can parse the output of many software packages, and the evolutionary evidence inferred by these packages can be used directly for tree annotation. ggtree not only works as an infrastructure that enables evolutionary data inferred by commonly used software to be used in R, but also serves as a general tree visualization and annotation tool for the R community, as it supports many S3/S4 objects defined by other R packages.

phyloseq for microbiome data

The phyloseq class defined in the phyloseq package was designed for microbiome data. The phyloseq package implements a plot_tree function using ggplot2. Although the function is built on ggplot2 and we can use theme, scale_color_manual, etc. for customization, the most valuable part of ggplot2, adding layers, is missing. plot_tree provides only limited parameters to control the output graph, and it is hard to add a layer unless the user has expertise in both phyloseq and ggplot2.