I am a postdoc researcher in the School of Public Health at The University of Hong Kong. I am broadly interested in bioinformatics, metagenomics and molecular evolution, especially the phylogenetic dynamics of influenza A virus.
I love the R programming language and have contributed 9 Bioconductor packages to the community.
PhD in Molecular Evolution, 2017
The University of Hong Kong
Master in Biochemistry and Molecular Biology, 2009
Anhui Medical University
BSc in Biotechnology, 2005
South China Agricultural University
Sun, Jan 1, 2017, Hugo Academic Theme Conference
Reassortment is an important strategy by which influenza A viruses introduce an HA subtype that is new to human populations, which creates the possibility of a pandemic.
A diagram like the one shown above (Figure 2 of
doi:10.1038/srep25549) is widely
used to illustrate reassortment events. However, such diagrams are
mostly drawn and edited by hand, without a software tool to generate
them automatically. Here, I implemented the
hybrid_plot function for producing
publication-quality figures of reassortment events.
library(tibble)
library(ggplot2)

n <- 8

virus_info <- tibble(
    id = 1:7,
    x = c(rep(1990, 4), rep(2000, 2), 2009),
    y = c(1, 2, 3, 5, 1.5, 3, 4),
    segment_color = list(
        rep('purple', n),
        rep('red', n),
        rep('darkgreen', n),
        rep('lightgreen', n),
        c('darkgreen', 'darkgreen', 'red', 'darkgreen', 'red', 'purple', 'red', 'purple'),
        c('darkgreen', 'darkgreen', 'red', 'darkgreen', 'darkgreen', 'purple', 'red', 'purple'),
        c('darkgreen', 'lightgreen', 'lightgreen', 'darkgreen', 'darkgreen', 'purple', 'red', 'purple'))
)

flow_info <- tibble(from = c(1, 2, 3, 3, 4, 5, 6),
                    to   = c(5, 5, 5, 6, 7, 6, 7))

hybrid_plot(virus_info, flow_info)
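To make the structure of the flow table easier to read, here is a minimal base-R sketch that lists the parents of each reassortant and the viruses with no parent. It reuses the from/to values from the example; the variable names parents_of and founders are my own and are not part of hybrid_plot.

```r
# Same edges as in the example: each row is one arrow from a parent
# virus (from) to a descendant virus (to).
flow_info <- data.frame(from = c(1, 2, 3, 3, 4, 5, 6),
                        to   = c(5, 5, 5, 6, 7, 6, 7))

# Group parent ids by descendant id (hypothetical helper variable).
parents_of <- split(flow_info$from, flow_info$to)

# Viruses that never appear as a descendant have no parent in the diagram.
founders <- setdiff(1:7, flow_info$to)

print(parents_of)  # virus 5 <- 1, 2, 3; virus 6 <- 3, 5; virus 7 <- 4, 6
print(founders)    # 1 2 3 4
```

Read this way, viruses 1 to 4 are the parental lineages drawn in 1990, and viruses 5 to 7 are the reassortants that receive segments from them.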
A friend of mine who is doing his PhD at Johns Hopkins just sent me a link about a case of plagiarism in an SR paper. I had a very similar experience with a paper published in BMC Systems Biology, which plagiarized my work; the editor merely decided to publish an erratum.
Deng et al. published an R package, ppiPre, that copied source code from my package, GOSemSim, and claimed in their paper that they had developed these algorithms themselves.
Here is the screenshot of the source code (left: ppiPre, right: GOSemSim).
You can find out more on my blog post.
As a developer of several open source software packages, I am glad when someone finds my source code useful, and happy if someone uses it to make something better. But I am not happy if someone copies my source code, removes the author information and changes the function names to pretend the code was developed by himself. The situation is even worse in academia. Taking someone else’s work and passing it off as one’s own is definitely plagiarism and is not allowed in academia.
This package provides functions for pathway analysis based on the REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization.
The ggtree package extends the ggplot2 package. It is based on the grammar of graphics and takes all the good parts of ggplot2. ggtree is designed not only for viewing phylogenetic trees but also for displaying annotation data on the tree.
This package implements functions to retrieve the nearest genes around a peak and annotate the genomic region of the peak, statistical methods for estimating the significance of overlap among ChIP peak data sets, and integration with the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, the average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.
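The idea of assigning each peak to its nearest gene can be illustrated in base R. This is only a toy sketch with made-up coordinates and gene names; it ignores chromosomes and strand, and it is not how the package itself is implemented.

```r
# Hypothetical TSS positions on a single chromosome.
tss <- c(geneA = 1000, geneB = 5000, geneC = 9000)

# Hypothetical peaks given by start and end coordinates.
peaks <- data.frame(start = c(900, 4000, 8000),
                    end   = c(1100, 4400, 8200))

# Use the peak midpoint as the reference position.
mid <- (peaks$start + peaks$end) / 2

# For each midpoint, find the index of the closest TSS.
idx <- vapply(mid, function(m) which.min(abs(tss - m)), integer(1))

peaks$nearest_gene <- names(tss)[idx]
# Signed distance: negative means the peak midpoint lies before the TSS.
peaks$distance <- mid - unname(tss[idx])

print(peaks)
```

A real annotation would also have to take strand into account so that the sign of the distance means upstream or downstream of the gene.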
This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.
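For readers unfamiliar with the hypergeometric model, the over-representation test behind this kind of enrichment analysis can be sketched in base R. The counts below are made up for illustration, and the variable names are my own rather than anything from the package.

```r
# Made-up counts for one disease term.
N <- 20000  # genes in the background
M <- 100    # background genes annotated to the term
n <- 500    # genes in the list of interest
k <- 10     # genes in the list that are annotated to the term

# P(X >= k) when drawing n genes without replacement from the background.
# phyper() gives P(X <= q), so the upper tail starts at k - 1.
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

print(p_value)
```

In practice one p-value is computed per term and the results are then adjusted for multiple testing, for example with p.adjust().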
Pathway analysis based on the REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization.
The clusterProfiler package implements methods to analyze and visualize functional profiles of genomic coordinates (supported by ChIPseeker), gene and gene clusters.
The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively.
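To give a feeling for the information-content-based measures, here is a small base-R sketch on a toy ontology. The terms, the annotation probabilities and all function names are hypothetical; it follows the textbook definitions of the Resnik and Lin measures and does not use the GOSemSim API. The Wang method is graph-based and is not covered here.

```r
# Toy ontology (hypothetical): each term lists its direct parents.
parents <- list(root = character(0),
                A = "root", B = "root",
                C = "A", D = "A")

# Made-up annotation probabilities; information content is IC = -log(p).
prob <- c(root = 1, A = 0.5, B = 0.4, C = 0.2, D = 0.1)
ic <- -log(prob)

# A term together with all of its ancestors.
ancestors <- function(term) {
  out <- term
  for (p in parents[[term]]) out <- union(out, ancestors(p))
  out
}

# IC of the most informative common ancestor (MICA) of two terms.
mica_ic <- function(a, b) {
  common <- intersect(ancestors(a), ancestors(b))
  max(ic[common])
}

# Resnik: IC of the MICA. Lin: the same value scaled by the ICs of both terms.
sim_resnik <- function(a, b) mica_ic(a, b)
sim_lin <- function(a, b) 2 * mica_ic(a, b) / (ic[[a]] + ic[[b]])

print(sim_resnik("C", "D"))  # MICA is A, so this equals log(2)
print(sim_lin("C", "D"))
print(sim_resnik("C", "B"))  # only the root is shared, so this is 0
```

Resnik similarity is unbounded, while Lin similarity is scaled to lie between 0 and 1, which is one reason several measures are offered side by side.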
I am a teaching instructor for the following courses at University X: