# Use simplify to remove redundancy of enriched GO terms

To simplify an enriched GO result, we can use a slim version of GO and run the
analysis with the *enricher* function.

Another strategy is to use GOSemSim to calculate the semantic similarity
among GO terms and remove highly similar terms, keeping one representative
term. To make this feature available to clusterProfiler users, I developed a
*simplify* method that removes redundant GO terms from the output of the
*enrichGO* function.
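The idea of keeping one representative per group of similar terms can be sketched in base R. This is only an illustration under stated assumptions, not the actual *simplify* implementation: the term IDs, the similarity matrix and the adjusted p-values are made up, and `reduce_terms` is a hypothetical helper that uses a simple greedy rule.

```r
# Toy pairwise similarity matrix for four hypothetical terms (values are made up).
terms <- c("GO:A", "GO:B", "GO:C", "GO:D")
sim <- matrix(c(1.0, 0.9, 0.2, 0.1,
                0.9, 1.0, 0.3, 0.2,
                0.2, 0.3, 1.0, 0.8,
                0.1, 0.2, 0.8, 1.0),
              nrow = 4, byrow = TRUE, dimnames = list(terms, terms))

# Made-up adjusted p-values for the same terms.
p_adjust <- c("GO:A" = 0.001, "GO:B" = 0.01, "GO:C" = 0.03, "GO:D" = 0.002)

# Hypothetical helper: greedy redundancy removal.
reduce_terms <- function(sim, score, cutoff = 0.7) {
  kept <- character(0)
  # Visit terms from the smallest to the largest score (most significant first).
  for (term in names(sort(score))) {
    # Keep a term only if it is not too similar to any term already kept.
    if (length(kept) == 0 || all(sim[term, kept] <= cutoff)) {
      kept <- c(kept, term)
    }
  }
  kept
}

representatives <- reduce_terms(sim, p_adjust, cutoff = 0.7)
print(representatives)
```

Here GO:B is dropped because it is too similar to GO:A, and GO:C is dropped because it is too similar to GO:D, so one term survives from each redundant pair.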

```
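library(clusterProfiler)
# geneList: named vector of fold-change values; names are Entrez gene IDs
data(geneList, package="DOSE")
# treat genes with absolute fold change > 2 as differentially expressed
de <- names(geneList)[abs(geneList) > 2]
# GO biological process enrichment, then draw the enrichment map
bp <- enrichGO(de, ont="BP")
enrichMap(bp)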
```

The *enrichMap* doesn't display the whole picture, as the default value
*n=50* shows only the 50 most significant terms. In the *enrichMap*, we can
see many redundant terms that form a highly condensed network.

Now with the *simplify* method, we can remove redundant terms.

```
bp2 <- simplify(bp, cutoff=0.7, by="p.adjust", select_fun=min)
```

The *simplify* method applies *select_fun* (which can be a user-defined
function) to the feature *by* to select one representative term from each
group of redundant terms (those with similarity higher than *cutoff*).
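To make the roles of *by* and *select_fun* concrete, here is a minimal base R sketch. It does not call *simplify* itself: the data frame is a made-up group of redundant terms, and `pick_representative` is a hypothetical helper that only mimics the selection step described above.

```r
# A made-up group of mutually redundant terms with two candidate features.
redundant <- data.frame(
  ID = c("GO:X", "GO:Y", "GO:Z"),
  p.adjust = c(0.020, 0.001, 0.010),
  Count = c(35, 12, 20),
  stringsAsFactors = FALSE
)

# Hypothetical helper: apply select_fun to column `by` and return the ID of
# the term whose value matches (the first one if there are ties).
pick_representative <- function(group, by, select_fun) {
  target <- select_fun(group[[by]])
  group$ID[group[[by]] == target][1]
}

# Smallest adjusted p-value wins, as in the call shown above.
by_pvalue <- pick_representative(redundant, by = "p.adjust", select_fun = min)

# Alternatively, keep the term that covers the most genes.
by_count <- pick_representative(redundant, by = "Count", select_fun = max)

print(c(by_pvalue, by_count))
```

With real data, the second choice would presumably correspond to a call such as `simplify(bp, cutoff=0.7, by="Count", select_fun=max)`, assuming the result has a `Count` column.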

The simplified enrichment result is clearer and gives us a more comprehensive view of the whole story.

*enrichGO* tests the whole GO corpus, so the enriched result may contain
very general terms. *clusterProfiler* provides a *dropGO* function to remove
specific GO terms or a whole GO level; see the issue. With *simplify* and
*dropGO*, the enriched result becomes more specific and easier to interpret.
Both functions work with the output of *enrichGO* as well as
*compareCluster*.🍻

## Citation

Yu G, Wang L, Han Y and He Q\*. clusterProfiler: an R package for comparing biological themes among gene clusters. **OMICS: A Journal of Integrative Biology**. 2012, 16(5):284-287.