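[连载3]：辣眼睛，一篇抄袭引发的系列血案！》 ("[Series 3]: An eyesore! A string of scandals set off by one plagiarism case!"), and that article exposed a lecturer who plagiarized two R packages and got promoted to associate professor! It also mentioned, in passing, a student who had plagiarized as well, and who of course also published an SCI paper and graduated.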

### gogadget.p.adjust

```r
gogadget.p.adjust <- function(x, method = "BH") {
  adjustedP <- p.adjust(x$over_represented_pvalue, method = method)
  x["adjusted_pvalue"] <- adjustedP
  return(x)
}
```

### gogadget.explore

```r
gogadget.explore <- function(x , nmin = 10, stepmin = 10, nmax = 100, stepmax = 100, steps = 4, cutoff = 0.05, legend = "topleft") {
  seqmin <- seq(nmin, nmin+stepmin*steps, by = stepmin)
  seqmax <- seq(nmax, nmax+stepmax*steps, by = stepmax)
  x.bar <- c()
  x.name <- c()
  for(nmin in seqmin) {
    for(nmax in seqmax) {
      x.filter <- gogadget.filter(x, cutoff = cutoff, nmin = nmin, nmax = nmax)
      cat("Filtering with min.", nmin,"and max.", nmax, "genes results in:", length(x.filter$category), "GO terms", "\n")
      x.bar <- append(x.bar, length(x.filter$category))
      x.name <- append(x.name, nmin)
    }
  }
  barplot(x.bar, ylab = "Number of GO terms", col=c(1:length(seqmax)), names = c(x.name), cex.names = 0.7,
          legend = seqmax, args.legend = list(x = legend, bty = "n", title = "Max.", cex = 0.8), xlab = "Min.")
}
```

### gogadget.filter

```r
gogadget.filter <- function(x, cutoff = 0.05, nmin = 50, nmax = 250) {
  x.sig <- x[x[, "adjusted_pvalue"] < cutoff,]
  x.sig.filter <- x.sig[x.sig[, "numInCat"] >= nmin & x.sig[, "numInCat"] <= nmax,]
  return(x.sig.filter)
}
```

### gogadget.heatmap

```r
gogadget.heatmap <- function(x, genes, genome, id, fetch.cats = c("GO:CC","GO:BP","GO:MF"), symm = T, trace = "none", density.info = "none", cexRow = 0.4, cexCol = 0.4, col = my_palette, margins = c(3, 11), revC = T, ...) {
  allGOs <- stack(getgo(genes, genome, id, fetch.cats = fetch.cats))
  b <- 1
  name <- list()
  name[1:length(x$category)] <- paste("var", 1:length(x$category), sep = "")
  for(a in x$category) {
    new <- allGOs[allGOs["values"] == a,]
    name[[b]] <- new$ind
    b <- b+1
  }
  m <- matrix(nrow = length(name), ncol = length(name))
  for(i in 1:length(name)) {
    for(j in 1:length(name)) {
      overlap <- name[[i]] %in% name[[j]]
      m[i,j] <- length(overlap[overlap == TRUE])
    }
  }
  GOterms <- subset(x, category == x$category, select = c(category, term))
  colnames(m) <- GOterms$category
  mp <- cor(m)
  rownames(mp) <- GOterms$term
  my_palette <- colorRampPalette(c("blue", "white", "red"))(n = 299)
  heatmap.2(mp, symm = symm, trace = trace, density.info = density.info, cexRow = cexRow, cexCol = cexCol, col = col, margins = margins, revC = revC, ...)
  mp <<- mp
}
```

PS: this function commits a cardinal sin: it uses `<<-`, so `mp` is assigned outside the function as a side effect.
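To make the complaint about `<<-` concrete, here is a minimal sketch of the usual alternative: compute the matrix and return it (invisibly), leaving it to the caller to decide where to store it. The helper `overlap_cor` and the toy gene sets are hypothetical and not part of gogadget; the sketch takes a plain named list of gene vectors so that it runs without goseq or gplots.

```r
# Hypothetical helper (not part of gogadget): count shared genes between
# every pair of terms, then return the correlation matrix instead of
# pushing it out of the function with `<<-`.
overlap_cor <- function(gene_sets) {
  n <- length(gene_sets)
  m <- matrix(0L, nrow = n, ncol = n,
              dimnames = list(names(gene_sets), names(gene_sets)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Number of genes of term i that also belong to term j
      m[i, j] <- sum(gene_sets[[i]] %in% gene_sets[[j]])
    }
  }
  # invisible() keeps the console quiet but still lets callers capture it
  invisible(cor(m))
}

# Toy input (made up for illustration)
sets <- list(
  termA = c("g1", "g2", "g3"),
  termB = c("g2", "g3", "g4"),
  termC = c("g5", "g6")
)

mp <- overlap_cor(sets)  # the caller chooses the variable name
print(round(mp, 2))
```

With this shape, a plotting function can call `heatmap.2()` for its side effect and still hand the matrix back, without touching the user's workspace.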

### gogadget.cytoscape

```r
gogadget.cytoscape <- function(x, file = "File4cytoscape.txt") {
  x.cytoscape <- x[,c("category", "term", "over_represented_pvalue", "adjusted_pvalue")]
  colnames(x.cytoscape) <- c("GO.ID", "Description", "p.Val", "FDR")
  write.table(x.cytoscape, file = file, sep = "\t", row.names = F, quote = F)
  cat(file,"is written to your working directory.\n")
}
```

### gogadget.gmt

```r
gogadget.gmt <- function(x, genes, genome, id, fetch.cats=c("GO:CC","GO:BP","GO:MF"), file = "cytoscape.gmt") {
  allGOs <- stack(getgo(genes, genome, id, fetch.cats = fetch.cats))
  newdf<-as.data.frame(cbind(x$category,x$term))
  colnames(newdf)<-c("values","term")
  reshaped <- group_by(allGOs, values) %>% summarise(genes = paste(ind, collapse = "\t"))
  reshaped.terms <- left_join(reshaped, newdf, by = "values",copy = T)
  reshaped.terms <- reshaped.terms[,c(1,3,2)]
  write.table(reshaped.terms, file = file, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
}
```

### Review gripes

Some of gogadget’s functions were developed with help and advice from users of the Biostars (https://www.biostars.org/) and Bioconductor Support (https://support.bioconductor.org/) website.

```
Manuscript accepted: 02 December 2016
Manuscript received: 06 October 2016
```

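(Since the paper was published several years ago, the confidentiality and anonymity of the review no longer matter.) My review comments at the time were: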

This R package is hosted on SourceForge. Any published package should be hosted on either Bioconductor or CRAN, as they test and build the package (ensuring it runs on the current R release) and make it easier to install.

This gogadget package is quite tiny, with only 66 lines of source code, and its functions are quite simple, without any innovation.

To simplify results and help interpretation, several tools, such as REVIGO and clusterProfiler, employ GO semantic similarity measures to remove redundant terms. Filtering by gene count, as defined in gogadget.filter, is too simple and cannot resolve the redundancy issue.

Heatmaps and Cytoscape's enrichment maps are good tools for visualizing related (redundant) terms, but they do not directly solve the problem of simplifying the result to help interpretation.

In my opinion, there is a better tool, RedundancyMiner, that can plot heatmaps for GO terms. For enrichment maps, this tool only generates a txt file for Cytoscape. The Bioconductor package clusterProfiler can visualize enrichment maps: it can simplify results by removing redundant terms using semantic similarity calculation and then visualize them as an enrichment map. All the functions provided by gogadget have better alternatives in other software packages. This package is simply too simple and lacks innovation.

In addition, as there are many R packages that can perform GO enrichment analysis, such as GOstats, clusterProfiler, gage, GeneAnswers, etc., we expect a package designed to help interpret and visualize results to work with other R packages, so that users can integrate it into their existing pipelines. Unfortunately, gogadget only works with goseq.
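The point about gogadget.filter in the review can be illustrated with a toy example. The sketch below re-implements the same filtering logic (significance cutoff plus a size window) under the hypothetical name `size_filter` and applies it to a made-up table in which two terms contain exactly the same genes. The Jaccard index is used here only as a simple stand-in for redundancy; REVIGO and clusterProfiler rely on GO semantic similarity, which this sketch does not attempt to reproduce.

```r
# Made-up enrichment table; column names follow the goseq-style output
# used by gogadget.filter.
res <- data.frame(
  category = c("GO:A", "GO:B", "GO:C"),
  numInCat = c(60, 60, 500),
  adjusted_pvalue = c(0.01, 0.01, 0.01),
  stringsAsFactors = FALSE
)

# Hypothetical gene sets: GO:A and GO:B are identical, GO:C is large.
gene_sets <- list(
  "GO:A" = paste0("g", 1:60),
  "GO:B" = paste0("g", 1:60),
  "GO:C" = paste0("g", 1:500)
)

# Same logic as gogadget.filter, renamed to mark it as an illustration.
size_filter <- function(x, cutoff = 0.05, nmin = 50, nmax = 250) {
  x.sig <- x[x[, "adjusted_pvalue"] < cutoff, ]
  x.sig[x.sig[, "numInCat"] >= nmin & x.sig[, "numInCat"] <= nmax, ]
}

kept <- size_filter(res)

# Jaccard index: shared genes divided by all genes in either set.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))
redundancy <- jaccard(gene_sets[[kept$category[1]]],
                      gene_sets[[kept$category[2]]])

print(kept$category)  # the two identical terms both survive the filter
print(redundancy)
```

The size window drops the large term but keeps both copies of the duplicated one, which is exactly the redundancy a count-based filter cannot see.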

### Code courtesy of forum users

Some of gogadget’s functions were developed with help and advice from users of the Biostars (https://www.biostars.org/) and Bioconductor Support (https://support.bioconductor.org/) website.

```r
gogadget.gmt <- function(x, genes, genome, id, fetch.cats=c("GO:CC","GO:BP","GO:MF"), file = "cytoscape.gmt") {
  allGOs <- stack(getgo(genes, genome, id, fetch.cats = fetch.cats))
  newdf<-as.data.frame(cbind(x$category,x$term))
  colnames(newdf)<-c("values","term")
  reshaped <- group_by(allGOs, values) %>% summarise(genes = paste(ind, collapse = "\t"))
  reshaped.terms <- left_join(reshaped, newdf, by = "values",copy = T)
  reshaped.terms <- reshaped.terms[,c(1,3,2)]
  write.table(reshaped.terms, file = file, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
}
```

```r
reshaped <- group_by(sorted, values) %>% summarise(genes = paste(ind, collapse = "\t"))
added.terms <- left_join(reshaped, newdf, by = "values")
## this won't have to columns in the order you'd like, so change them
added.terms <- added.terms[,c(1,3,2)]
write.table(added.terms, file = "/tmp/tmp.txt", sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
```