Plotting DNA sequence differences

November 19, 2016 in Visualization, R

![](https://guangchuangyu.github.io/blog_images/2016/Screen Shot 2016-11-17 at 8.20.21 PM.png)

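This figure is from Prof. Gao's paper. Each panel was produced by comparing the differences between two aligned sequences. Panels A and B suggest that CF_YL21 may have arisen by recombination between the Mont and Oz sequences. After making this figure by hand, Prof. Gao knew all too well how painful it is to count sequence differences manually, draw each plot, and assemble the panels, and asked me whether there was a way to read two sequences and generate the figure directly.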
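The counting step behind such a panel can be sketched in base R. This is a minimal sketch, not the method used for the published figure: the two sequences are made-up toy data, and the names `seq_a`, `seq_b`, and `window` are assumptions chosen for illustration. It presumes the two sequences are already aligned to the same length.

```r
# Two toy aligned sequences of equal length (hypothetical data, not from the paper).
seq_a <- "ATGCATGCATGCATGCATGC"
seq_b <- "ATGCATGAATGCTTGCATGG"

# Split each sequence into single characters and compare position by position.
a <- strsplit(seq_a, "")[[1]]
b <- strsplit(seq_b, "")[[1]]
stopifnot(length(a) == length(b))
diff_pos <- which(a != b)

# Count differences in non-overlapping windows of `window` bases.
window <- 5
bins <- cut(diff_pos, breaks = seq(0, length(a), by = window))
diff_count <- as.vector(table(bins))

# One bar per window; windows without differences are drawn with height 0.
barplot(diff_count, names.arg = levels(bins),
        xlab = "Position window", ylab = "Number of differences")
```

Repeating the same counting for each pair of sequences gives one panel per comparison. Regions where a sequence matches one parent closely and then switches to matching the other are the kind of pattern the figure relies on.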

A BLAST pitfall: no way to output the query sequence title

November 17, 2016 in Biology

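When I run BLAST, I usually write the output in ASN format and then use blast_formatter to extract the fields I need into a table.

The description lines of the sequences are of course unique, but the IDs (the characters before the first whitespace) are not, so qseqid alone cannot tell me which sequence a hit belongs to. I therefore wanted blast_formatter to output the query title (the whole description line), but after reading blastn -help carefully I found that this is simply not supported! It does support stitle and salltitles, so the title is available for the subject but not for the query. This is a real pitfall! ＿|￣|○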
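The ambiguity can be reproduced, and one possible workaround sketched, in base R. This is an illustration under stated assumptions: the header lines are invented, and renaming IDs before running BLAST is a suggestion of this sketch, not a BLAST feature or something described in the post.

```r
# Hypothetical FASTA header lines; real ones would be read from the query file.
headers <- c(
  ">seq1 Homo sapiens gene A",
  ">seq1 Mus musculus gene A",
  ">seq2 Danio rerio gene B"
)

# Full title = header without the leading ">".
titles <- sub("^>", "", headers)
# ID = everything before the first whitespace, which is what qseqid reports.
ids <- sub("[[:space:]].*$", "", titles)

# IDs that occur more than once cannot be traced back to a single title.
dup_ids <- unique(ids[duplicated(ids)])

# Possible workaround (an assumption, not a BLAST feature): make the IDs unique
# before running BLAST and keep a lookup table from new ID to full title.
lookup <- data.frame(
  qseqid = make.unique(ids),
  qtitle = titles,
  stringsAsFactors = FALSE
)
print(lookup)
```

With such a table, the qseqid column of the tabular BLAST output could be joined back to the full query title afterwards.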

align genomic features with phylogenetic tree

November 15, 2016 in Visualization, R

A question on Biostars asked how to generate the following figure:

This is quite easy to implement in ggtree: I could write a geom layer to lay out the alignment. But ggbio already provides many geoms for genomic data and I don't want to reinvent the wheel, so I decided to try ggtree + ggbio. This is also the beauty of R: packages complement each other.

ggtree: reverse time scale

November 8, 2016 in Visualization, R

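This came from a recent GitHub issue: how to reverse the time scale, that is, set the most recent time to 0 so that internal nodes are placed at the number of years before present. This is easy: replace the x-axis values with x - max(x). I expect this to be useful only for time-scaled trees, so I named the function revts (reverse time scale). After revts, the maximum x value is 0 and all other values are negative; scale_x_continuous can then be used to relabel the axis with absolute values.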
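The arithmetic is small enough to show on a plain numeric vector. This is a minimal base R sketch of the x - max(x) shift only; the node positions are hypothetical, and it does not call ggtree.

```r
# Hypothetical x positions of nodes on a time-scaled tree
# (root at 0, most recent tip at 40).
x <- c(0, 10, 25, 40)

# Shift so the most recent time becomes 0 and everything older is negative.
x_rev <- x - max(x)

# Axis labels as absolute values, i.e. time before the most recent tip.
axis_labels <- abs(x_rev)

print(x_rev)
print(axis_labels)
```

On an actual tree plot, the text above implies applying revts and then relabeling through scale_x_continuous, for example by passing `abs` as the `labels` function. The exact call is not shown in this excerpt, so treat that as an assumption.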

showCategory parameter for visualizing compareCluster output

November 3, 2016 in R, Visualization, Bioinformatics

I am using dotplot() to visualize results from enrichGO(), enrichDO(), enricher() and compareCluster() in the clusterProfiler R package. When specifying showCategory, I get the right number of categories except with the results of compareCluster().

In my case, I use compareCluster() on a list of 3 elements:

    str(ClusterList)
    List of 3
     $ All: chr [1:1450] "89886" "29923" "100132891" "101410536" ...
     $ g1 : chr [1:858] "89886" "29923" "100132891" "101410536" ...
     $ g2 : chr [1:592] "5325" "170691" "29953" "283392" ...

    CompareGO_BP = compareCluster(ClusterList, fun = "enrichGO", pvalueCutoff = 0.01,
                                  pAdjustMethod = "BH", OrgDb = org.Hs.eg.db,
                                  ont = "BP", readable = TRUE)

    dotplot(CompareGO_BP, showCategory = 10, title = "GO - Biological Process")

I ask for 10 categories, but I get 15 categories in All, 8 categories in g1 and 12 categories in g2. Neither any single count nor the sum of the counts is 10…

Is the option showCategory working in the case of comparison? Am I missing something here?

And which categories precisely will it plot? The most significant overall across my 3 cases, or the most significant in each case?

The question was posted on the Bioconductor support site. The behavior does seem confusing, and I think I need to write a post to clarify it.
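One selection rule that would produce counts like these is "take the top N terms in each cluster, then show the union of those terms in every cluster". The base R sketch below illustrates that rule on invented data. It is stated as an assumption about the behavior, not something confirmed in this excerpt, and it does not call clusterProfiler; the data frame `res` and its columns are hypothetical.

```r
# Toy enrichment table: three clusters, four terms each (made-up values).
res <- data.frame(
  Cluster  = rep(c("All", "g1", "g2"), each = 4),
  ID       = c("T1", "T2", "T3", "T4",
               "T1", "T3", "T2", "T4",
               "T4", "T2", "T1", "T3"),
  p.adjust = rep(c(0.001, 0.002, 0.003, 0.004), times = 3),
  stringsAsFactors = FALSE
)

n <- 2  # plays the role of showCategory

# Step 1: the n most significant terms within each cluster.
top_per_cluster <- lapply(split(res, res$Cluster), function(df) {
  head(df$ID[order(df$p.adjust)], n)
})

# Step 2: the union of those terms across clusters.
top_ids <- unique(unlist(top_per_cluster))

# Step 3: every cluster shows every term in the union that it is enriched for.
shown <- res[res$ID %in% top_ids, ]
print(table(shown$Cluster))
```

Here each cluster contributes only two terms, yet each ends up displaying four, because terms selected for one cluster are also drawn for the others. Under this rule the number of dots per cluster need not equal N, and can differ between clusters when some terms are not enriched in all of them.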

xlim_tree: set x axis limits for only Tree panel

October 31, 2016 in R, Visualization, Evolution

A ggtree user recently asked me the following question in the Google group:

I try to plot long tip labels in ggtree and usually adjust them using xlim(), however when creating a facet_plot xlim affects all plots and minimizes them.

Is it possible to work around this and only affect the tree and its tip labels, leaving the other plots in facet_plot unaffected?

This is indeed a desirable feature, as ggplot2 can't automatically adjust xlim for text: the axis limits and the text size are measured in two different spaces (data units and pixels).

facet_plot: a general solution to associate data with phylogenetic tree

October 26, 2016 in R, Visualization, Evolution

ggtree provides gheatmap for visualizing a heatmap and msaplot for visualizing a multiple sequence alignment alongside a phylogenetic tree.

We may have other data types that we want to visualize and align with the tree, for example a dot plot of SNP sites (e.g. using geom_point(shape='|')) or a bar plot of trait values (e.g. using geom_barh(stat='identity')).

To make it easy to associate different types of data with a phylogenetic tree, I implemented the facet_plot function, which accepts a geom function to draw the input data.frame and displays it in an additional panel.

Guangchuang Yu