R

[Bioc 3.5] NEWS of my BioC packages

I have 8 packages published within the Bioconductor project.

A new package, treeio, was included in the BioC 3.5 release.

Phylomoji with ggtree and emojifont

With ggtree (Yu et al. 2017), it is very easy to create phylomoji. Emoji is internally supported by ggtree.

Use emoji in taxa labels

library(ggtree)
tree_text <- "(((((cow, (whale, dolphin)), (pig2, boar)), camel), fish), seedling);"
x <- read.tree(text=tree_text)
ggtree(x, linetype="dashed", color='firebrick') +
    xlim(NA, 7) + ylim(NA, 8.5) +
    geom_tiplab(aes(color=label), parse='emoji', size=14, vjust=0.25) +
    labs(title="phylomoji", caption="powered by ggtree + emojifont")

convert graphic object to tree object using treeio

I have split ggtree into 2 packages, treeio and ggtree. ggtree now focuses mainly on visualization and annotation, while treeio focuses on parsing and exporting tree files. As a welcome message from treeio: you can convert ggtree output to a tree object, which can then be exported as a newick or nexus file if you want.

Thanks to ggplot2, the output of ggtree is actually a ggplot object. The ggtree object can be rendered as a graph by its print method, but internally it is an object that contains data. treeio defines as.phylo and as.treedata to convert a ggtree object to a phylo or treedata object.
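A minimal sketch of that conversion, assuming the ape, ggtree and treeio packages are installed. The random tree and the variable names are only illustrative and are not taken from the original post.

```r
library(ape)     # rtree() and write.tree()
library(ggtree)  # ggtree()
library(treeio)  # as.phylo() method for ggtree objects

set.seed(42)
tr <- rtree(10)        # illustrative random tree with 10 tips
p <- ggtree(tr)        # a ggplot object that carries the tree data

tr2 <- as.phylo(p)     # recover a phylo object from the plot
nwk <- write.tree(tr2) # newick string; pass file= to write to disk instead
```

The recovered object can be passed to any function that accepts a phylo object, so a tree that only exists as a plot can still be exported.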

dotplot for GSEA result

For GSEA analysis, we are familiar with the above figure, which shows the running enrichment score. But most software lacks a visualization method to summarize the whole enrichment result.

In DOSE (and related tools including clusterProfiler, ReactomePA and meshes), we provide enrichMap and cnetplot to summarize GSEA results.

add layer to specific panel of facet_plot output

This is a question from the ggtree Google group:

Dear ggtree team,

how can I apply a geom_xxx to only one facet panel? For example if i want to get geom_hline(yintersect=1:30) or a geom_text() in the dot panel? I cant see the facet_grid(. ~ var) function call, so I don’t know which subsetting to use. I have already read http://stackoverflow.com/questions/29873155/geom-text-and-facets-not-working

  tr <- rtree(30)
  
  d1 <- data.frame(id=tr$tip.label, val=rnorm(30, sd=3))
  p <- ggtree(tr)
  
  p2 <- facet_plot(p, panel="dot", data=d1, geom=geom_point, aes(x=val), color='firebrick')
  d2 <- data.frame(id=tr$tip.label, value = abs(rnorm(30, mean=100, sd=50)))
  
  p3 <- facet_plot(p2, panel='bar', data=d2, geom=geom_segment, aes(x=0, xend=value, y=y, yend=y), size=3, color='steelblue') + theme_tree2()

Thanks! Andreas

If this can be done, we can create even more comprehensive tree plots.

ggtree version of plotTree

PLOTTING TREES + DATA is difficult. @DrKatHolt developed plotTree (R and Python scripts) to visualize associated data with trees, e.g. heatmap, horizontal bar etc.

I reproduced the examples presented in the plotTree repo using ggtree. The source code is freely available at https://github.com/GuangchuangYu/plotTree-ggtree.

Here are the outputs produced by ggtree:

Edge coloring with user data

Coloring edges in ggtree is quite easy, as we can map the color to numerical or categorical values via the aes(color=VAR) syntax. For the user’s own data, it is also easy, as ggtree provides the %<+% operator to attach user data.

But since this does not seem so obvious to ggtree users (see questions 1, 2, and 3), I will demonstrate how to color edges using user data here.
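As a rough sketch of the idea, assuming ape and ggtree are installed: the data frame below is made up for illustration, with a hypothetical trait column holding one value per node, and it is attached to the tree by its node column.

```r
library(ape)     # rtree() and Nnode()
library(ggtree)  # ggtree() and the %<+% operator

set.seed(2017)
tr <- rtree(5)                          # illustrative random tree
n <- Nnode(tr, internal.only = FALSE)   # tips plus internal nodes

# Hypothetical user data: one numeric value per node
d <- data.frame(node = seq_len(n), trait = rnorm(n))

# Attach the user data, then map edge color to the attached column
p <- ggtree(tr) %<+% d + aes(color = trait)
```

Each edge takes the color of the value attached to its child node, so per-node data is enough to color every branch.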

scatterpie for plotting pies on ggplot

Plotting pies on ggplot/ggmap is not an easy task, as ggplot2 doesn’t provide a native pie geom. The pie we produce in ggplot2 is actually a barplot transformed to polar coordinates. This makes it difficult if we want to produce a map like the above screenshot, which was posted by Tyler Rinker, the author of the R package pacman.
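To make the "barplot in polar coordinates" point concrete, here is a small sketch using ggplot2 only; the data frame is invented for illustration. Because the polar transform applies to the whole plot, a pie built this way cannot simply be dropped at an arbitrary x/y position on a map.

```r
library(ggplot2)

# Made-up proportions for a single pie
d <- data.frame(category = c("A", "B", "C"), value = c(3, 5, 2))

# A single stacked bar ...
bar <- ggplot(d, aes(x = "", y = value, fill = category)) +
    geom_col(width = 1)

# ... becomes a pie once the y axis is bent into polar coordinates
pie <- bar + coord_polar(theta = "y")
```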

align genomic features with phylogenetic tree

A question on Biostars asked how to generate the following figure:

This is quite easy to implement in ggtree: I could write a geom layer to lay out the alignment. But as ggbio already provides many geoms for genomic data and I don’t want to re-invent the wheel, I decided to try ggtree+ggbio. This is also the beauty of R: packages complement each other.

showCategory parameter for visualizing compareCluster output

I am using dotplot() to visualize results from enrichGO(), enrichDO(), enricher() and compareCluster() in the clusterProfiler R package. When specifying showCategory, I get the right number of categories except with the results of compareCluster().

In my case, I use compareCluster() on a list of 3 elements:

str(ClusterList)
List of 3
 $ All : chr [1:1450] "89886" "29923" "100132891" "101410536" ...
 $ g1  : chr [1:858] "89886" "29923" "100132891" "101410536" ...
 $ g2: chr [1:592] "5325" "170691" "29953" "283392" ...
CompareGO_BP <- compareCluster(ClusterList, fun="enrichGO", pvalueCutoff=0.01, pAdjustMethod="BH", OrgDb=org.Hs.eg.db, ont="BP", readable=TRUE)

dotplot(CompareGO_BP, showCategory=10, title="GO - Biological Process")

I ask for 10 categories, but I get 15 categories in All, 8 categories in g1 and 12 categories in g2. Neither any single count nor the sum of the categories is 10…

Is the option showCategory working in the case of comparison? Am I missing something here?

And which categories precisely will it plot? the most significant whatever my 3 cases or the most significant of each case?

The question was posted on the Bioconductor support site. It seems quite confusing, and I think I need to write a post to clarify it.
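One possible explanation, offered as an assumption and not confirmed in this excerpt, is that showCategory is applied within each cluster and the union of the selected terms is then drawn for every cluster in which they are enriched. The base R sketch below imitates that selection rule on a made-up table. It does not call clusterProfiler, and the cluster names, terms and p-values are invented.

```r
# Toy enrichment table: two clusters, four terms (all values invented)
res <- data.frame(
    Cluster     = c("A", "A", "A", "B", "B", "B"),
    Description = c("t1", "t2", "t3", "t2", "t3", "t4"),
    p.adjust    = c(0.001, 0.002, 0.03, 0.04, 0.001, 0.002),
    stringsAsFactors = FALSE
)
show_category <- 2

# Assumed rule, step 1: take the top N terms within each cluster
top_per_cluster <- lapply(split(res, res$Cluster), function(d) {
    d <- d[order(d$p.adjust), ]
    head(d$Description, show_category)
})

# Assumed rule, step 2: keep every row whose term is in the union
selected <- unique(unlist(top_per_cluster))
shown <- res[res$Description %in% selected, ]

table(shown$Cluster)
```

Under this rule each cluster ends up with 3 terms although only 2 were requested, which is the kind of mismatch described in the question.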