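A recent blog post by Ewan Birney, "Five statistical things I wished I had been taught 20 years ago", discusses how important statistics is to biology.

He starts with RA Fisher and argues that biology is, at its root, statistics. Fisher was an agricultural scientist, and the statistical methods he established all grew out of biological problems.

The five areas Ewan covers are: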

1. Nonparametric statistics. These are statistical tests which make a bare minimum of assumptions about underlying distributions; in biology we are rarely confident that we know the underlying distribution, and hand waving about the central limit theorem can only get you so far. Wherever possible you should use a nonparametric test. This is Mann-Whitney (or Wilcoxon if you prefer) for testing “medians” (Medians is in quotes because this is not quite true. They test something which is closely related to the median) of two distributions, Spearman’s rho (rather than Pearson’s r²) for correlation, and the Kruskal test rather than ANOVAs (though if I get this right, you can’t do in Kruskal the more sophisticated nested models you can do with ANOVA). Finally, don’t forget the rather wonderful Kolmogorov-Smirnov (I always think it sounds like really good vodka) test of whether two sets of observations come from the same distribution. All of these methods have a basic theme of doing things on the rank of items in a distribution, not the actual level. So, if in doubt, do things on the rank of the metric, rather than the metric itself.
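To make the "work on ranks, not levels" theme concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not code from Ewan's post: the helper names (`average_ranks`, `pearson`, `spearman`, `mann_whitney_u`) and the toy data are assumptions made up for this example. It computes Spearman's rho as Pearson's correlation applied to ranks, and the Mann-Whitney U statistic from rank sums (the statistic only, without a p-value).

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient r on the raw values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mann_whitney_u(a, b):
    """U statistic for sample a, from its rank sum in the pooled data."""
    pooled_ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(pooled_ranks[: len(a)])
    return rank_sum_a - len(a) * (len(a) + 1) / 2


if __name__ == "__main__":
    # Toy data: y increases with x, but one value is extreme.
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 1000]
    print("Pearson r:   ", pearson(x, y))
    print("Spearman rho:", spearman(x, y))
    print("U statistic: ", mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

On the toy data the relationship is perfectly monotonic, so the rank-based measure is unaffected by the extreme value while Pearson's r is pulled down. In practice one would more likely reach for a statistics library; SciPy, for instance, provides `scipy.stats.mannwhitneyu`, `spearmanr`, `kruskal`, and `ks_2samp`, which also return p-values.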

Guangchuang Yu

Bioinformatics Professor @ SMU

Guangzhou