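In research, getting no results is frustrating. Even more frustrating than no results is getting results that do not match expectations, so that you no longer know how to spin the story. But please, less drama: science is not driven by expectations. Respect the facts, and your mice.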

We had a quick email exchange about this and I would like to report it. (I apologise for contacting you directly rather than following the submission guideline.) I have a dataset with 6166 proteins, of which 201 are upregulated. The problem is that specifying the background proteins with the universe argument in clusterProfiler decreases the number of significant GO categories (28 significant categories without the universe argument and 5 with it). At the same time, when I use the online GOrilla tool I get a lot of categories with the background:

http://cbl-gorilla.cs.technion.ac.il/GOrilla/kwpgie9l/GOResults.html

and no categories without the background:

http://cbl-gorilla.cs.technion.ac.il/GOrilla/5y4n3hn1/GOResults.html

I am not sure whether this is a bug or not, so I decided to report it here.

The dataset is presented below …

> library(clusterProfiler)
> library(org.Hs.eg.db)
> # upRegProt and backgroundProt are the UniProt ID vectors from the dataset in the issue
> allProtUGO <- enrichGO(gene = upRegProt, OrgDb = org.Hs.eg.db, keyType = "UNIPROT", ont = "ALL", pool = TRUE,
                         qvalueCutoff = 0.05)

> identifiedProtUGO <- enrichGO(gene = upRegProt, OrgDb = org.Hs.eg.db, keyType = "UNIPROT", ont = "ALL", pool = TRUE,
                                qvalueCutoff = 0.05, universe = backgroundProt)

> dim(allProtUGO@result)
> dim(identifiedProtUGO@result)

To see the dataset, go to the original GitHub issue.

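Lately a lot of people have been professing their deep love for GOrilla. I already grumbled about it in passing last time, when I wrote "enrichGO gives no results? No result is also a correct result". So I have to ask: is your research driven by your own expectations, or by scientific facts?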

Here again is the GOrilla gripe from last time (see http://cbl-gorilla.cs.technion.ac.il/help.html):

Parameters

P-value threshold - Only GO terms with a p-value better than this threshold are reported. Note that this p-value does not include the multiple hypothesis correction on the number of tested GO terms. To correct for this the p-value should be multiplied by the number of GO terms used as reported in the results page.

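GOrilla's p-value threshold does not account for multiple hypothesis correction, whereas clusterProfiler filters on both adjusted p-values and q-values. To put the two head to head, he only needs to set pAdjustMethod = "none" and qvalueCutoff = 1; clusterProfiler will then report plenty of results too.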
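To make the effect of correction concrete, here is a minimal sketch in base R. The five raw p-values are made up for illustration and do not come from the dataset in the issue. It compares no correction, the Bonferroni-style multiplication that the GOrilla help text describes, and Benjamini-Hochberg (the default `pAdjustMethod` in `enrichGO`, as far as I know).

```r
# Hypothetical raw p-values for five GO terms (illustrative only,
# not taken from the dataset in the issue).
raw_p <- c(0.001, 0.008, 0.02, 0.045, 0.2)

# No correction: what a raw p-value threshold sees.
none_p <- p.adjust(raw_p, method = "none")

# Bonferroni: each p-value multiplied by the number of tests, capped at 1.
bonf_p <- p.adjust(raw_p, method = "bonferroni")

# Benjamini-Hochberg false discovery rate adjustment.
bh_p <- p.adjust(raw_p, method = "BH")

# Count how many terms pass a 0.05 cutoff under each scheme.
n_sig <- c(none       = sum(none_p < 0.05),
           bonferroni = sum(bonf_p < 0.05),
           BH         = sum(bh_p < 0.05))
print(n_sig)
```

With these numbers, four terms pass without correction, three pass after Benjamini-Hochberg, and two pass after Bonferroni. The same list of terms looks "richer" simply because nothing was corrected.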

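But the question stands: is your research driven by your own expectations? Does a pile of results make you happy? Today, when Google cannot find anything, it tells you it found nothing; early Google, when it could not find anything, would just hand you some random pages. Only a product with confidence dares to tell the unthinking crowd the facts instead of keeping you around by feeding your fantasies.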

> allProtUGO <- enrichGO(gene = upRegProt, OrgDb = org.Hs.eg.db, keyType = "UNIPROT", ont = "ALL", pool = TRUE,
                         qvalueCutoff = 1, pAdjustMethod = "none")

> identifiedProtUGO <- enrichGO(gene = upRegProt, OrgDb = org.Hs.eg.db, keyType = "UNIPROT", ont = "ALL", pool = TRUE,
                                qvalueCutoff = 1, pAdjustMethod = "none", universe = backgroundProt)

> dim(allProtUGO@result)
[1] 221  10
> dim(identifiedProtUGO@result)
[1] 129  10
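
Why would the background change the counts at all? A rough sketch in base R of the one-sided hypergeometric (over-representation) test that this kind of enrichment analysis is based on may help. The list size 201 and the background size 6166 come from the issue; the genome-wide background size, the term sizes, and the overlap are hypothetical numbers chosen only for illustration, and `ora_pvalue` is a helper name made up here, not a clusterProfiler function.

```r
# One-sided hypergeometric p-value for over-representation.
#   k: genes in the list annotated to the term
#   n: size of the gene list
#   M: genes in the background annotated to the term
#   N: size of the background
ora_pvalue <- function(k, n, M, N) {
  # P(X >= k) when drawing n genes from N, of which M carry the term
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

n_list  <- 201   # upregulated proteins (from the issue)
overlap <- 12    # hypothetical: upregulated proteins annotated to the term

# Hypothetical genome-wide background: 20000 genes, 300 annotated to the term.
p_genome <- ora_pvalue(overlap, n_list, M = 300, N = 20000)

# Background restricted to the 6166 identified proteins (from the issue),
# with a hypothetical 150 of them annotated to the term.
p_background <- ora_pvalue(overlap, n_list, M = 150, N = 6166)

print(c(genome = p_genome, background = p_background))
```

In this made-up case the term is more common among the identified proteins than in the genome as a whole, so the same overlap of 12 is less surprising against the restricted background and its p-value is larger. That is one plausible reason for fewer terms to pass the cutoff once universe is supplied; the actual behaviour depends on the real annotation counts.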

Finally, I leave you with the post "why clusterProfiler fails"!