dotplot now supports specifying the x-axis variable with a formula

April 8, 2018 in R, Visualization

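Bar plots are a common way to present enrichment results, but a dotplot encodes one more variable than a barplot through the size of each dot, so it is the one I recommend. I have already written two posts on it, "Presenting enrichment results with dotplot" and "dotplot for GSEA". A dotplot looks simple, and many people assume it is easy to draw with ggplot2, but there are details (see "Why are more dots drawn than the number I specified?") and tricks (see "Make your dots bigger and let's draw a real bubble chart") that many beginners miss. The plot may be simple, but a veteran's driving skills should not be underestimated, which is why I ended "I hear you are drawing dotplots too, but I am not convinced!" with this line: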

clusterProfiler is good because it really does take care of many details! Let go of that plot and let clusterProfiler draw it!

A new way to play with KEGG

January 11, 2018 in R

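A reader said he has to use the gage package because it lets him choose the sigmet index, so the results contain only signaling and metabolic pathways and none of the disease pathways he does not care about. gage annoys him in various ways, though, and his favorite is still clusterProfiler, but he sees no way to restrict it to certain pathways.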

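I find that people have all sorts of misconceptions about clusterProfiler and keep assuming that things cannot be done. It leaves me speechless: I have written plenty of documentation, yet you will not read it. clusterProfiler can run enrichment analysis even on annotations with no built-in support, such as COG or protein domains, because it is a general-purpose analysis tool.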

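As for gage's pathway index, it is simply their own classification of pathways, and the data can be downloaded from https://pathview.uncc.edu/data/khier.tsv. Supporting it would be easy, but I do not like bundling other people's work in my own package. "Walk other people's road so that they have no road left to walk" is not a good idea. So I will not add built-in support; you can do it yourselves.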
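Doing it yourself could look roughly like the sketch below. It assumes you have already loaded a pathway classification (for example from the khier.tsv file above) and a pathway-to-gene table. The column names `pathway`, `category` and `gene`, the category labels and the toy rows are all hypothetical, so check the real file's layout before adapting it. The filtered table has the two-column TERM2GENE shape that clusterProfiler's generic `enricher()` accepts.

```r
# Toy pathway classification; column names and labels are hypothetical.
pathway_class <- data.frame(
  pathway  = c("hsa00010", "hsa04010", "hsa05200"),
  category = c("metabolic", "signaling", "disease"),
  stringsAsFactors = FALSE
)

# Toy pathway-to-gene table in TERM2GENE layout (term first, gene second).
term2gene <- data.frame(
  pathway = c("hsa00010", "hsa00010", "hsa04010", "hsa05200", "hsa05200"),
  gene    = c("3098", "2821", "5594", "7157", "3845"),
  stringsAsFactors = FALSE
)

# Keep only signaling and metabolic pathways.
keep <- pathway_class$pathway[pathway_class$category %in% c("signaling", "metabolic")]
sigmet_term2gene <- term2gene[term2gene$pathway %in% keep, ]

print(sigmet_term2gene)

# With clusterProfiler installed, the subset can be passed to enricher()
# (my_genes is a hypothetical vector of gene IDs):
# result <- clusterProfiler::enricher(my_genes, TERM2GENE = sigmet_term2gene)
```

Because the restriction happens in the annotation table rather than in the package, the same filter works for any classification you care to define.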

A side benefit of a quarrel

November 12, 2017 in R

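The post "The analysis you ran yesterday may be a result from years ago!" sounded the alarm for everyone: with all the web servers out there, check carefully whether they are maintained and updated, because some have not been updated for five or ten years, which is frightening. That post is about enrichment analysis, but you need to be just as careful with other analysis tools.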

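Of course, this does not mean that standalone software or packages are necessarily reliable. If the software bundles its own data, you still need to check whether that data is updated, and if the data is not bundled but fetched online, you should keep an eye on it as well. This is where clusterProfiler has an advantage for enrichment analysis: KEGG data is fetched online, so it is always the latest, while GO data is not stored in the package but comes from separate annotation packages. Those packages are maintained by the community (maintenance by an individual is, by comparison, hard to sustain), which ensures the data keeps being updated.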

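Bioconductor is released every six months, and the annotation packages are updated on the same schedule. So when you run GO analysis with clusterProfiler, your GO data will not be more than half a year out of date, unlike the results delivered by some companies, which lag behind the world by far more than a year or two.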

I hear many genes go unannotated in your KEGG analysis

November 7, 2017 in R

A GitHub issue asked two questions; this is the second:

Meanwhile, when I fed the ENTREZID to enrichKEGG, it show me two unreasonable results:

kegg_enrich = enrichKEGG(
	gene = new_ids$ENTREZID,
	organism = 'hsa',
	keyType = 'ncbi-geneid',
	pvalueCutoff = 0.05,
	#pAdjustMethod = 'BH',
	#qvalueCutoff = 0.2,
	use_internal_data = FALSE
)

head(kegg_enrich)
              ID                           Description GeneRatio  BgRatio       pvalue   p.adjust     qvalue
hsa04380 hsa04380            Osteoclast differentiation    14/281 128/7299 0.0003817040 0.04767338 0.04508266
hsa04070 hsa04070 Phosphatidylinositol signaling system    12/281  99/7299 0.0003875885 0.04767338 0.04508266

I noted that only 281 genes remained (there are 700+ genes in my list). In case there was something wrong within my gene list, I also tried my list with DAVID. It gave me reasonable results. So this is my second question: why can't enrichKEGG recognize my gene IDs?
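One thing worth checking in a case like this is how many input genes are annotated to any pathway at all, because an over-representation test counts only annotated genes (281 in the GeneRatio above, against a background of 7299). A minimal base R sketch with made-up IDs follows; `term2gene` and `my_genes` are hypothetical stand-ins for the KEGG annotation and the user's list.

```r
# Toy annotation table: which gene belongs to which pathway (made-up IDs).
term2gene <- data.frame(
  pathway = c("path1", "path1", "path2", "path2"),
  gene    = c("g1", "g2", "g2", "g3"),
  stringsAsFactors = FALSE
)

# Hypothetical input list; g4 and g5 have no pathway annotation.
my_genes <- c("g1", "g2", "g4", "g5")

annotated   <- intersect(my_genes, unique(term2gene$gene))
unannotated <- setdiff(my_genes, annotated)

# Only the annotated genes would contribute to the GeneRatio denominator.
cat("input:", length(my_genes),
    "annotated:", length(annotated),
    "unannotated:", length(unannotated), "\n")
```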

enrichGO returns no result? No result can also be the correct result

October 9, 2017 in R

Dear Dr. Guangchuang Yu, I write to you regarding a doubt concerning the enrichGO function from Clustalprofiler package. I have been used this package before, but now I’m using the same R script and I have an error message.

This is the command I use:

go.bp <- enrichGO(gene = gene.df$ENSEMBL, universe = universe.ENSEMBLID, keytype = 'ENSEMBL', OrgDb = org.Ce.eg.db, ont = 'BP', pAdjustMethod = 'BH', pvalueCutoff = 0.01, qvalueCutoff = 0.05, readable=T)

and the error is the following one:

No gene set have size > 10 ... --> return NULL...

My input list is attached to this email (101 genes in total). When I use this list in a web resource such as GOrilla it gives to me the proper GO terms.

Thank you very much in advance. Best regards,

María

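This is a recent question from a clusterProfiler user, and it is a fairly common one. I also discussed it in "why clusterProfiler fails": getting output does not by itself mean the result is good. No result is also a result.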
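To see one way an empty result can come about, here is a toy base R sketch of a size filter of the kind the message refers to: gene sets are first restricted to the background, then sets below a minimum size are dropped. The gene sets, the `universe` vectors and the `min_size` threshold are made up, and the sketch only mimics the idea rather than reproducing clusterProfiler's internals.

```r
# Made-up gene sets keyed by term ID.
gene_sets <- list(
  termA = c("g1", "g2", "g3", "g4"),
  termB = c("g3", "g4", "g5")
)

# Count how many sets remain after restricting to the background
# and requiring at least min_size genes per set.
n_usable_sets <- function(gene_sets, universe, min_size) {
  restricted <- lapply(gene_sets, intersect, universe)
  sum(lengths(restricted) >= min_size)
}

# A background that matches the annotation keeps both sets.
print(n_usable_sets(gene_sets, universe = paste0("g", 1:5), min_size = 3))

# A background in a different ID style matches nothing, so no set survives.
print(n_usable_sets(gene_sets, universe = c("X1", "X2", "X3"), min_size = 3))
```

When no set survives the filter there is nothing to test, and an empty return is the honest answer.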

Converting KO database IDs

September 27, 2017 in R

clusterProfiler is a very useful tool for GO and KEGG annotation. At present I want to use it for KEGG enrichment, but I only have KO numbers, so I want to convert the KO numbers to pathway functions. Is there any function or method in the software that can convert them? Any help will be appreciated.

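The asker wants to convert KO to pathways. This starts from a common mistake: many people cannot tell K numbers and ko apart. So before telling him that a K number can be converted to a ko pathway, I first pointed out the error.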

ko is actually pathway map. I think you are talking about K number mapping to ko pathway.

> bitr_kegg("K00844", "kegg", "Path", "ko")
     kegg    Path
1  K00844 ko00010
2  K00844 ko00051
3  K00844 ko00052
4  K00844 ko00500
5  K00844 ko00520
6  K00844 ko00521
7  K00844 ko00524
8  K00844 ko01100
9  K00844 ko01110
10 K00844 ko01120
11 K00844 ko01130
12 K00844 ko01200
13 K00844 ko04066
14 K00844 ko04910
15 K00844 ko04930
16 K00844 ko04973
17 K00844 ko05230

Custom colors for gseaplot

September 11, 2017 in R

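After the post "I hear you have RNA-seq data but do not know how to run GSEA", a reader asked whether the colors of the gseaplot on the cover image could be changed, so I added support for it on the spot.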