R

Edge coloring with user data

Coloring edges in ggtree is quite easy, as we can map the color to numerical or categorical values via the aes(color=VAR) syntax. It is just as easy with a user’s own data, as ggtree provides the %<+% operator to attach user data.

But as this does not seem obvious to ggtree users (see questions 1, 2, and 3), I will demonstrate how to color edges using user data here.
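As a minimal sketch of the idea (not the code from the full post), the example below attaches a small made-up data frame to a random tree and maps one of its columns to edge color. The tree, the `user_data` data frame and its `trait` column are all hypothetical; the sketch assumes that `%<+%` joins on a `node` column when one is present.

```r
library(ape)      # rtree(), Ntip(), Nnode()
library(ggplot2)  # aes(), scale_color_gradient()
library(ggtree)

set.seed(2016)
tree <- rtree(20)                       # random tree, for illustration only
n_nodes <- Ntip(tree) + Nnode(tree)

# Hypothetical user data: one row per node. The value of a node is used
# to color the edge that leads to that node.
user_data <- data.frame(
  node  = seq_len(n_nodes),
  trait = rnorm(n_nodes)
)

# Attach the user data with %<+%, then map the new column to color.
p <- ggtree(tree) %<+% user_data +
  aes(color = trait) +
  scale_color_gradient(low = "blue", high = "red")
```

Printing `p` draws the tree. A categorical column can be mapped in the same way, in which case a discrete color scale is used instead.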

scatterpie for plotting pies on ggplot

Plotting pies on ggplot/ggmap is not an easy task, as ggplot2 doesn’t provide a native pie geom. A pie produced in ggplot2 is actually a barplot transformed to polar coordinates. This makes it difficult to produce a map like the above screenshot, which was posted by Tyler Rinker, the author of the R package pacman.
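A short sketch of what scatterpie offers instead: pies are drawn in ordinary Cartesian coordinates, so they can sit at arbitrary x/y positions. The `pies` data frame and its columns below are invented for illustration; `geom_scatterpie()` is called with the columns that hold the slice values passed via `cols`.

```r
library(ggplot2)
library(scatterpie)

# Hypothetical data: one row per pie, with its position and three slice values.
pies <- data.frame(
  long   = c(-100, -20, 60, 120),
  lat    = c(40, -10, 50, -25),
  region = factor(1:4),
  A      = c(1, 2, 3, 4),
  B      = c(4, 3, 2, 1),
  C      = c(2, 2, 2, 2)
)

# Each pie is placed at (long, lat); coord_equal() keeps the pies circular.
p <- ggplot() +
  geom_scatterpie(aes(x = long, y = lat, group = region),
                  data = pies, cols = c("A", "B", "C")) +
  coord_equal()
```

Because the result is a regular ggplot layer, it can in principle be added on top of a map layer as well.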

align genomic features with phylogenetic tree

A question on Biostars asked how to generate the following figure:

This would be quite easy to implement in ggtree: I could write a geom layer to lay out the alignment. But as ggbio already provides many geoms for genomic data and I don’t want to reinvent the wheel, I decided to try ggtree+ggbio. This is also the beauty of R: packages complement each other.

showCategory parameter for visualizing compareCluster output

I am using dotplot() to visualize results from enrichGO(), enrichDO(), enricher() and compareCluster() in the clusterProfiler R package. When specifying showCategory, I get the right number of categories except with the results of compareCluster().

In my case, I use compareCluster() on a list of 3 elements:

str(ClusterList)
List of 3
 $ All : chr [1:1450] "89886" "29923" "100132891" "101410536" ...
 $ g1  : chr [1:858] "89886" "29923" "100132891" "101410536" ...
 $ g2  : chr [1:592] "5325" "170691" "29953" "283392" ...
CompareGO_BP <- compareCluster(ClusterList, fun = "enrichGO", pvalueCutoff = 0.01,
                               pAdjustMethod = "BH", OrgDb = org.Hs.eg.db,
                               ont = "BP", readable = TRUE)

dotplot(CompareGO_BP, showCategory=10, title="GO - Biological Process")

I ask for 10 categories, but I get 15 categories in All, 8 categories in g1 and 12 categories in g2. Neither the count for any single cluster nor the sum of the counts is 10…

Is the option showCategory working in the case of comparison? Am I missing something here?

And which categories precisely will it plot? The most significant across all 3 cases, or the most significant in each case?

The question was posted on the Bioconductor support site. The behavior seems quite confusing, and I think I need to write a post to clarify it.

xlim_tree: set x axis limits for only Tree panel

A ggtree user recently asked me the following question in the Google group:

I try to plot long tip labels in ggtree and usually adjust them using xlim(); however, when creating a facet_plot, xlim affects all plots and minimizes them.

Is it possible to work around this and only affect the tree and its tip labels, leaving the other plots in facet_plot unaffected?

This is indeed a desired feature, as ggplot2 can’t automatically adjust xlim for text since the units are in two different spaces (data and pixel).
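A rough sketch of how xlim_tree can be used in this situation: the tree panel gets extra room for long tip labels while the second panel keeps its own x range. The tree, the long labels, the `trait` data frame and the limit of 8 are all made up for illustration and would need tuning for a real tree.

```r
library(ape)
library(ggplot2)
library(ggtree)

set.seed(2016)
tree <- rtree(15)
# Hypothetical long tip labels that would be clipped with the default limits.
tree$tip.label <- paste0("a_rather_long_tip_label_", seq_along(tree$tip.label))

# Hypothetical per-tip data; the first column holds the tip labels.
trait <- data.frame(id = tree$tip.label, value = abs(rnorm(15)))

p <- ggtree(tree) + geom_tiplab()
p2 <- facet_plot(p, panel = "Trait", data = trait,
                 geom = geom_point, mapping = aes(x = value))

# Expand the x axis of the Tree panel only; the Trait panel is left alone.
p3 <- p2 + xlim_tree(8)
```

Using `xlim(0, 8)` in place of `xlim_tree(8)` would apply the same limits to both panels, which is the problem described in the question.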

facet_plot: a general solution to associate data with phylogenetic tree

ggtree provides gheatmap for visualizing a heatmap and msaplot for visualizing a multiple sequence alignment with a phylogenetic tree.

We may have different data types and want to visualize and align them with the tree. For example, a dotplot of SNP sites (e.g. using geom_point(shape='|')), a barplot of trait values (e.g. using geom_barh(stat='identity')), etc.

To make it easy to associate different types of data with a phylogenetic tree, I implemented the facet_plot function, which accepts a geom function to draw the input data.frame and displays it in an additional panel.
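The following sketch shows the general pattern with two extra panels. The `snp_data` and `bar_data` data frames are invented; the first column of each is assumed to hold tip labels so that rows can be matched to the tree. Bars are drawn here with geom_segment rather than geom_barh, simply to avoid depending on an additional package.

```r
library(ape)
library(ggplot2)
library(ggtree)

set.seed(42)
tree <- rtree(10)

# Hypothetical SNP positions: several sites per tip.
snp_data <- data.frame(
  id  = rep(tree$tip.label, each = 5),
  pos = sample(1:1000, 50)
)

# Hypothetical trait values: one per tip.
bar_data <- data.frame(id = tree$tip.label, value = abs(rnorm(10)))

p <- ggtree(tree) + geom_tiplab()

# Panel 1: SNP sites drawn as '|' marks, aligned with the tips.
p2 <- facet_plot(p, panel = "SNP", data = snp_data,
                 geom = geom_point, mapping = aes(x = pos), shape = "|")

# Panel 2: horizontal bars; facet_plot supplies the y position of each tip.
p3 <- facet_plot(p2, panel = "Trait", data = bar_data,
                 geom = geom_segment,
                 mapping = aes(x = 0, xend = value, y = y, yend = y))
```

Each call adds one panel to the right of the tree, so further data types can be chained in the same way.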

[Bioc 34] NEWS of my BioC packages

I have 7 packages published within the Bioconductor project.

A new package, meshes, was included in the BioC 3.4 release.

reproducible logo generated by ggtree

ggtree provides many helper functions for manipulating phylogenetic trees and makes it easy to explore tree structure visually.

Here, as examples, I used ggtree to draw the capital letters G and C, which are the initials of my name :-).

To draw a tree in such a shape, we need the fan layout (a circular layout with an open angle) and then need to rotate the tree to put the open space in the correct position. Here is the source code to produce the G- and C-shaped trees. I am thinking about using the G-shaped tree as the ggtree logo. Have fun with ggtree :-)
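As a rough illustration of the two steps (fan layout, then rotation), and not the code that produced the actual logo, a C-like shape might be approached as follows. The random tree, the open angle of 90 degrees and the rotation of 45 degrees are all guesses that would need adjusting by eye.

```r
library(ape)
library(ggtree)

set.seed(2016)
tree <- rtree(50)   # random tree, for illustration only

# Step 1: fan layout, i.e. a circular layout that leaves a gap of open.angle degrees.
p <- ggtree(tree, layout = "fan", open.angle = 90, branch.length = "none")

# Step 2: rotate the whole tree so that the gap ends up where the letter opens.
p_rotated <- rotate_tree(p, 45)
```

Printing `p_rotated` shows the result; changing the two angles moves and resizes the opening.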

ggtree for outbreak data

OutbreakTools implements basic tools for the analysis of Disease Outbreaks.

It defines the S4 class obkData to store case-based outbreak data. It also provides a function, plotggphy, to visualize such data on a phylogenetic tree.

library(OutbreakTools)
data(FluH1N1pdm2009)
attach(FluH1N1pdm2009)


x <- new("obkData", individuals = individuals, dna = FluH1N1pdm2009$dna,
         dna.individualID = samples$individualID, dna.date = samples$date,
         trees = FluH1N1pdm2009$trees)

plotggphy(x, ladderize = TRUE, branch.unit = "year",
          tip.color = "location", tip.size = 3, tip.alpha = 0.75)

ggtree for microbiome data

ggtree can parse the output of many software packages, and the evolutionary evidence inferred by this software can be used directly for tree annotation. ggtree not only works as an infrastructure that enables evolutionary data inferred by commonly used software packages to be used in R, but also serves as a general tree visualization and annotation tool for the R community, as it supports many S3/S4 objects defined by other R packages.

phyloseq for microbiome data

The phyloseq class defined in the phyloseq package was designed for microbiome data. The phyloseq package implements the plot_tree function using ggplot2. Although the function is built on ggplot2 and we can use theme, scale_color_manual, etc. for customization, the most valuable part of ggplot2, adding layers, is missing. plot_tree only provides limited parameters to control the output graph, and it is hard to add a layer unless the user has expertise in both phyloseq and ggplot2.