Proper use of GOSemSim
One day, while looking for R packages that can analyze protein-protein interactions (PPI), I found the ppiPre package on CRAN.
The functionality of this package is not impressive, and I already knew of some related work, including http://intscore.molgen.mpg.de/. The authors of that web server contacted me about the usage of GOSemSim while they were developing it. What made me curious is that the ppiPre package can calculate GO semantic similarity and supports 20 species, almost exactly like GOSemSim. I opened the source tarball and was surprised to find that its code for semantic similarity calculation was copied wholesale from GOSemSim.
GOSemSim was first released in 2008 in Bioconductor 2.4 (at that time, the devel version) and published in Bioinformatics in 2010. After comparing the sources, I found that the code in ppiPre was copied from GOSemSim version 1.6.8, which was released in 2010 with Bioconductor 2.6. The Wang method defined in the GOKEGGSims.r file of ppiPre is:
119 WangMethod <- function(GOID1, GOID2, ont="MF", organism="yeast") {
120 if(!exists("ppiPreEnv")) .initial()
121 weight.isa = 0.8
122 weight.partof = 0.6
123
124 if (GOID1 == GOID2)
125 return (1)
126
127 Parents.name <- switch(ont,
128 MF = "MFParents",
129 BP = "BPParents",
130 CC = "CCParents"
131 )
132 if (!exists(Parents.name, envir=ppiPreEnv)) {
133 GetGOParents(ont)
134 }
135 Parents <- get(Parents.name, envir=ppiPreEnv)
136
137 sv.a <- 1
138 sv.b <- 1
139 sw <- 1
140 names(sv.a) <- GOID1
141 names(sv.b) <- GOID2
142
143 sv.a <- WangSemVal(GOID1, ont, Parents, sv.a, sw, weight.isa, weight.partof)
144 sv.b <- WangSemVal(GOID2, ont, Parents, sv.b, sw, weight.isa, weight.partof)
145
146 sv.a <- uniqsv(sv.a)
147 sv.b <- uniqsv(sv.b)
148
149 idx <- intersect(names(sv.a), names(sv.b))
150 inter.sva <- unlist(sv.a[idx])
151 inter.svb <- unlist(sv.b[idx])
152 sim <- sum(inter.sva,inter.svb) / sum(sv.a, sv.b)
153 return(sim)
154 }
155 WangSemVal <- function(goid, ont, Parents, sv, w, weight.isa, weight.partof) {
156 if(!exists("ppiPreCache"))
157 return(WangSemVal_internal(goid, ont, Parents, sv, w, weight.isa, weight.partof))
158 goid.ont <- paste(goid, ont, sep=".")
159 if (!exists(goid.ont, envir=ppiPreCache)) {
160 value <- WangSemVal_internal(goid, ont, Parents, sv, w, weight.isa, weight.partof)
161 assign(eval(goid.ont), value, envir=ppiPreCache)
162 }
163 return(get(goid.ont, envir=ppiPreCache))
164 }
165
166 WangSemVal_internal <- function(goid, ont, Parents, sv, w, weight.isa, weight.partof) {
167 p <- Parents[goid]
168 p <- unlist(p[[1]])
169 if (length(p) == 0) {
170 return(0)
171 }
172 relations <- names(p)
173 old.w <- w
174 for (i in 1:length(p)) {
175 if (relations[i] == "is_a") {
176 w <- old.w * weight.isa
177 } else {
178 w <- old.w * weight.partof
179 }
180 names(w) <- p[i]
181 sv <- c(sv,w)
182 if (p[i] != "all") {
183 sv <- WangSemVal_internal(p[i], ont, Parents, sv, w, weight.isa, weight.partof)
184 }
185 }
186 return (sv)
187 }
188
189 uniqsv <- function(sv) {
190 sv <- unlist(sv)
191 una <- unique(names(sv))
192 sv <- unlist(sapply(una, function(x) {max(sv[names(sv)==x])}))
193 return (sv)
194 }
Apart from the renamed functions and environments, it is identical to the one I defined in GOSemSim 1.6.8:
196 ygcWangMethod <- function(GOID1, GOID2, ont="MF", organism="human") {
197 if(!exists("GOSemSimEnv")) .initial()
198 weight.isa = 0.8
199 weight.partof = 0.6
200
201 if (GOID1 == GOID2)
202 return (gosim=1)
203
204 Parents.name <- switch(ont,
205 MF = "MFParents",
206 BP = "BPParents",
207 CC = "CCParents"
208 )
209 if (!exists(Parents.name, envir=GOSemSimEnv)) {
210 ygcGetParents(ont)
211 }
212 Parents <- get(Parents.name, envir=GOSemSimEnv)
213
214 sv.a <- 1
215 sv.b <- 1
216 sw <- 1
217 names(sv.a) <- GOID1
218 names(sv.b) <- GOID2
219
220 sv.a <- ygcSemVal(GOID1, ont, Parents, sv.a, sw, weight.isa, weight.partof)
221 sv.b <- ygcSemVal(GOID2, ont, Parents, sv.b, sw, weight.isa, weight.partof)
222
223 sv.a <- uniqsv(sv.a)
224 sv.b <- uniqsv(sv.b)
225
226 idx <- intersect(names(sv.a), names(sv.b))
227 inter.sva <- unlist(sv.a[idx])
228 inter.svb <- unlist(sv.b[idx])
229 sim <- sum(inter.sva,inter.svb) / sum(sv.a, sv.b)
230 return(sim)
231 }
232
233
234
235 uniqsv <- function(sv) {
236 sv <- unlist(sv)
237 una <- unique(names(sv))
238 sv <- unlist(sapply(una, function(x) {max(sv[names(sv)==x])}))
239 return (sv)
240 }
241
242 ygcSemVal_internal <- function(goid, ont, Parents, sv, w, weight.isa, weight.partof) {
243 p <- Parents[goid]
244 p <- unlist(p[[1]])
245 if (length(p) == 0) {
246 #warning(goid, " may not belong to Ontology ", ont)
247 return(0)
248 }
249 relations <- names(p)
250 old.w <- w
251 for (i in 1:length(p)) {
252 if (relations[i] == "is_a") {
253 w <- old.w * weight.isa
254 } else {
255 w <- old.w * weight.partof
256 }
257 names(w) <- p[i]
258 sv <- c(sv,w)
259 if (p[i] != "all") {
260 sv <- ygcSemVal_internal(p[i], ont, Parents, sv, w, weight.isa, weight.partof)
261 }
262 }
263 return (sv)
264 }
265
266 ygcSemVal <- function(goid, ont, Parents, sv, w, weight.isa, weight.partof) {
267 if(!exists("GOSemSimCache")) return(ygcSemVal_internal(goid, ont, Parents, sv, w, weight.isa, weight.partof))
268 goid.ont <- paste(goid, ont, sep=".")
269 if (!exists(goid.ont, envir=GOSemSimCache)) {
270 value <- ygcSemVal_internal(goid, ont, Parents, sv, w, weight.isa, weight.partof)
271 assign(goid.ont, value, envir=GOSemSimCache)
272 #cat("recompute ", goid, value, "\n")
273 }
274 else{
275 #cat("cache ", goid, get(goid, envir=GOSemSimCache), "\n")
276 }
277 return(get(goid.ont, envir=GOSemSimCache))
278 }
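For readers unfamiliar with the Wang method, the computation that both listings perform can be sketched on a toy graph. This is only an illustration under stated assumptions: the four-term ontology, the `parents` list and the function names `s_values` and `wang_sim` are made up for this example and appear in neither package. Only the edge weights (0.8 for is_a, 0.6 for part_of) and the final ratio are taken from the listings above.

```r
# Toy ontology (hypothetical): each term maps to a named vector of parents,
# where the names give the relation type of the edge.
parents <- list(
  root = character(0),
  A    = c(is_a = "root"),
  B    = c(is_a = "A"),
  C    = c(part_of = "A")
)
weights <- c(is_a = 0.8, part_of = 0.6)

# S-values of a term and all of its ancestors: the term itself gets 1,
# every ancestor gets the largest product of edge weights along any path.
s_values <- function(term) {
  sv <- 1
  names(sv) <- term
  queue <- term
  while (length(queue) > 0) {
    child <- queue[1]
    queue <- queue[-1]
    p <- parents[[child]]
    for (i in seq_along(p)) {
      parent <- p[[i]]
      contribution <- unname(sv[child] * weights[names(p)[i]])
      # Keep the maximum contribution and re-propagate when it improves.
      if (is.na(sv[parent]) || contribution > sv[parent]) {
        sv[parent] <- contribution
        queue <- c(queue, parent)
      }
    }
  }
  sv
}

# Similarity: S-values of the shared terms over the total S-values.
wang_sim <- function(t1, t2) {
  sv1 <- s_values(t1)
  sv2 <- s_values(t2)
  common <- intersect(names(sv1), names(sv2))
  sum(sv1[common], sv2[common]) / sum(sv1, sv2)
}

wang_sim("B", "C")  # shared terms are A and root
```

Here B contributes S-values 1, 0.8 and 0.64, and C contributes 1, 0.6 and 0.48, so the similarity is 2.52 / 4.52.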
The information content based method in ppiPre:
495 GetLatestCommonAncestor<-function(GOID1, GOID2, ont, organism){
496 #message("Calulating Latest Common Ancestor...")
497 if(!exists("ppiPreEnv")) .initial()
498
499 fname <- paste("Info_Contents", ont, organism, sep="_")
500 tryCatch(utils::data(list=fname, package="ppiPre", envir=ppiPreEnv))
501 InfoContents <- get("IC", envir=ppiPreEnv)
502
503 rootCount <- max(InfoContents[InfoContents != Inf])
504 InfoContents["all"] = 0
505 p1 <- InfoContents[GOID1]/rootCount
506 p2 <- InfoContents[GOID2]/rootCount
507 if(is.na(p1) || is.na(p2)) return (NA)
508 if (p1 == 0 || p2 == 0) return (NA)
509 Ancestor.name <- switch(ont,MF = "MFAncestors",BP = "BPAncestors",CC = "CCAncestors")
510 if (!exists(Ancestor.name, envir=ppiPreEnv)) {
511 TCSSGetAncestors(ont)
512 }
513
514 Ancestor <- get(Ancestor.name, envir=ppiPreEnv)
515 ancestor1 <- unlist(Ancestor[GOID1])
516 ancestor2 <- unlist(Ancestor[GOID2])
517 if (GOID1 == GOID2) {
518 commonAncestor <- GOID1
519 } else if (GOID1 %in% ancestor2) {
520 commonAncestor <- GOID1
521 } else if (GOID2 %in% ancestor1) {
522 commonAncestor <- GOID2
523 } else {
524 commonAncestor <- intersect(ancestor1, ancestor2)
525 }
526 if (length(commonAncestor) == 0)
527 LCA<-NULL
528 max<- -100
529 LCA<-NULL
530 for(a in commonAncestor){
531 if(!is.na(InfoContents[a])) {
532 if(InfoContents[a]>max){
533 max<-InfoContents[a]
534 LCA<-a
535 }
536 }
537 }
538 #message("done...")
539 return (LCA)
540
541 }
Up to the common-ancestor lookup, it is also essentially identical to the one in GOSemSim 1.6.8:
280 `ygcInfoContentMethod` <- function(GOID1, GOID2, ont, measure, organism) {
281 if(!exists("GOSemSimEnv")) .initial()
282 fname <- paste("Info_Contents", ont, organism, sep="_")
283 tryCatch(utils::data(list=fname, package="GOSemSim", envir=GOSemSimEnv))
284 Info.contents <- get("IC", envir=GOSemSimEnv)
285
286 rootCount <- max(Info.contents[Info.contents != Inf])
287 Info.contents["all"] = 0
288 p1 <- Info.contents[GOID1]/rootCount
289 p2 <- Info.contents[GOID2]/rootCount
290
291 if (p1 == 0 || p2 == 0) return (NA)
292 Ancestor.name <- switch(ont,
293 MF = "MFAncestors",
294 BP = "BPAncestors",
295 CC = "CCAncestors"
296 )
297 if (!exists(Ancestor.name, envir=GOSemSimEnv)) {
298 ygcGetAncestors(ont)
299 }
300
301 Ancestor <- get(Ancestor.name, envir=GOSemSimEnv)
302 ancestor1 <- unlist(Ancestor[GOID1])
303 ancestor2 <- unlist(Ancestor[GOID2])
304 if (GOID1 == GOID2) {
305 commonAncestor <- GOID1
306 } else if (GOID1 %in% ancestor2) {
307 commonAncestor <- GOID1
308 } else if (GOID2 %in% ancestor1) {
309 commonAncestor <- GOID2
310 } else {
311 commonAncestor <- intersect(ancestor1, ancestor2)
312 }
313 if (length(commonAncestor) == 0) return (NA)
314 pms <- max(Info.contents[commonAncestor], na.rm=TRUE)/rootCount
315 sim<-switch(measure,
316 Resnik = pms,
317 Lin = pms/(p1+p2),
318 Jiang = 1 - min(1, -2*pms + p1 + p2),
319 Rel = 2*pms/(p1+p2)*(1-exp(-pms*rootCount))
320 )
321 return (sim)
322 }
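To make the last step of the GOSemSim listing easier to follow, here is a small standalone sketch of its final `switch`. The function name `ic_similarity` and its arguments are hypothetical; the formulas are transcribed from the 1.6.8 listing as quoted, not from any later release. Note that Lin's measure is usually written with a factor of 2 in the numerator, which the quoted listing does not have.

```r
# ic1, ic2: information content of the two terms
# ic_mica:  information content of their most informative common ancestor
# ic_max:   largest finite information content (called rootCount in the listing)
ic_similarity <- function(ic1, ic2, ic_mica, ic_max,
                          measure = c("Resnik", "Lin", "Jiang", "Rel")) {
  measure <- match.arg(measure)
  # Normalise everything by the maximum, as the listing does.
  p1  <- ic1 / ic_max
  p2  <- ic2 / ic_max
  pms <- ic_mica / ic_max
  switch(measure,
    Resnik = pms,
    Lin    = pms / (p1 + p2),
    Jiang  = 1 - min(1, -2 * pms + p1 + p2),
    Rel    = 2 * pms / (p1 + p2) * (1 - exp(-pms * ic_max))
  )
}

# Made-up information content values, for illustration only.
ic_similarity(ic1 = 3, ic2 = 4, ic_mica = 2, ic_max = 4, measure = "Resnik")
ic_similarity(ic1 = 3, ic2 = 4, ic_mica = 2, ic_max = 4, measure = "Jiang")
```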
Let’s look at some helper functions in ppiPre:
477 rebuildICdata <- function(){
478 ont <- c("MF","CC", "BP")
479 species <- c("human", "rat", "mouse", "fly", "yeast", "zebrafish", "arabidopsis","worm", "ecolik12", "bovine","canine","anopheles","ecsakai","chicken","chimp","malaria","rhesus","pig","xenopus","coelicolor")
480 cat("------------------------------------\n")
481 cat("calulating Information Content...\nSpecies:\t\tOntology\n")
482 for (i in ont) {
483 for (j in species) {
484 cat(j)
485 cat("\t\t\t")
486 cat(i)
487 cat("\n")
488 TCSSComputeIC(ont=i, organism=j)
489 }
490 }
491 cat("------------------------------------\n")
492 message("done...")
493 }
Again, apart from one added species and the renamed IC function, it is identical to GOSemSim 1.6.8:
390 rebuildICdata <- function(){
391 ont <- c("MF","CC", "BP")
392 species <- c("human", "rat", "mouse", "fly", "yeast", "zebrafish", "arabidopsis","worm", "ecolik12", "bovine","canine","anopheles","ecsakai","chicken","chimp","malaria","rhesus","pig","xenopus")
393 cat("------------------------------------\n")
394 cat("calulating Information Content...\nSpecies:\t\tOntology\n")
395 for (i in ont) {
396 for (j in species) {
397 cat(j)
398 cat("\t\t\t")
399 cat(i)
400 cat("\n")
401 ygcCompute_Information_Content(ont=i, organism=j)
402 }
403 }
404 cat("------------------------------------\n")
405 print("done...")
406 }
Let’s look at the internal function TCSSComputeIC in ppiPre:
410 TCSSComputeIC <- function(dropCodes="IEA", ont, organism) {
411 message("Calulating IC...")
412 wh_ont <- match.arg(ont, c("MF", "BP", "CC"))
413 wh_organism <- match.arg(organism, c("human", "fly", "mouse", "rat", "yeast", "zebrafish", "worm", "arabidopsis", "ecolik12", "bovine","canine","anopheles","ecsakai","chicken","chimp","malaria","rhesus","pig","xenopus", "coelicolor"))
414 CheckAnnotationPackage(wh_organism)
415 gomap <- switch(organism,
416 human = org.Hs.egGO,
417 fly = org.Dm.egGO,
418 mouse = org.Mm.egGO,
419 rat = org.Rn.egGO,
420 yeast = org.Sc.sgdGO,
421 zebrafish = org.Dr.egGO,
422 worm = org.Ce.egGO,
423 arabidopsis = org.At.tairGO,
424 ecoli = org.EcK12.egGO,
425 bovine = org.Bt.egGO,
426 canine = org.Cf.egGO,
427 anopheles = org.Ag.egGO,
428 ecsakai = org.EcSakai.egGO,
429 chicken = org.Gg.egGO,
430 chimp = org.Pt.egGO,
431 malaria = org.Pf.plasmoGO,
432 rhesus = org.Mmu.egGO,
433 pig = org.Ss.egGO,
434 xenopus = org.Xl.egGO,
435 coelicolor = org.Sco.egGO
436 )
437
438 mapped_genes <- mappedkeys(gomap)
439 gomap = AnnotationDbi::as.list(gomap[mapped_genes])
440 if (!is.null(dropCodes)){
441 gomap<-sapply(gomap,function(x) sapply(x,function(y) c(y$Evidence %in% dropCodes, y$Ontology %in% wh_ont)))
442 gomap<-sapply(gomap, function(x) x[2,x[1,]=="FALSE"])
443 gomap<-gomap[sapply(gomap,length) >0]
444 }else {
445 gomap <- sapply(gomap,function(x) sapply(x,function(y) y$Ontology %in% wh_ont))
446 }
447
448 goterms<-unlist(sapply(gomap, function(x) names(x)), use.names=FALSE) # all GO terms appearing in an annotation
449 goids <- toTable(GOTERM)
450
451 goids <- unique(goids[goids[,"Ontology"] == wh_ont, "go_id"])
452 gocount <- table(goterms)
453 goname <- names(gocount) #goid of specific organism and selected category.
454
455 go.diff <- setdiff(goids, goname)
456 m <- double(length(go.diff))
457 names(m) <- go.diff
458 gocount <- as.vector(gocount)
459 names(gocount) <- goname
460 gocount <- c(gocount, m)
461
462 Offsprings.name <- switch(wh_ont,
463 MF = "MFOffsprings",
464 BP = "BPOffsprings",
465 CC = "CCOffsprings"
466 )
467 if (!exists(Offsprings.name, envir=ppiPreEnv)) {
468 TCSSGetOffsprings(wh_ont)
469 }
470 Offsprings <- get(Offsprings.name, envir=ppiPreEnv)
471 cnt <- sapply(goids,function(x){ c=gocount[unlist(Offsprings[x])]; gocount[x]+sum(c[!is.na(c)])})
472 names(cnt) <- goids
473 IC<- -log(cnt/sum(gocount))
474 message("done...")
475 return (IC)
476 }
and ygcCompute_Information_Content in GOSemSim 1.6.8:
326 ygcCompute_Information_Content <- function(dropCodes="NULL", ont, organism) {
327 wh_ont <- match.arg(ont, c("MF", "BP", "CC"))
328 wh_organism <- match.arg(organism, c("human", "fly", "mouse", "rat", "yeast", "zebrafish", "worm", "arabidopsis", "ecolik12", "bovine","canine","anopheles","ecsakai","chicken","chimp","malaria","rhesus","pig","xenopus"))
329 ygcCheckAnnotationPackage(wh_organism)
330 gomap <- switch(wh_organism,
331 human = org.Hs.egGO,
332 fly = org.Dm.egGO,
333 mouse = org.Mm.egGO,
334 rat = org.Rn.egGO,
335 yeast = org.Sc.sgdGO,
336 zebrafish = org.Dr.egGO,
337 worm = org.Ce.egGO,
338 arabidopsis = org.At.tairGO,
339 ecolik12 = org.EcK12.egGO,
340 bovine = org.Bt.egGO,
341 canine = org.Cf.egGO,
342 anopheles = org.Ag.egGO,
343 ecsakai = org.EcSakai.egGO,
344 chicken = org.Gg.egGO,
345 chimp = org.Pt.egGO,
346 malaria = org.Pf.plasmoGO,
347 rhesus = org.Mmu.egGO,
348 pig = org.Ss.egGO,
349 xenopus = org.Xl.egGO
350 )
351 mapped_genes <- mappedkeys(gomap)
352 gomap = AnnotationDbi::as.list(gomap[mapped_genes])
353 if (!is.null(dropCodes)){
354 gomap<-sapply(gomap,function(x) sapply(x,function(y) c(y$Evidence %in% dropCodes, y$Ontology %in% wh_ont)))
355 gomap<-sapply(gomap, function(x) x[2,x[1,]=="FALSE"])
356 gomap<-gomap[sapply(gomap,length) >0]
357 }else {
358 gomap <- sapply(gomap,function(x) sapply(x,function(y) y$Ontology %in% wh_ont))
359 }
360
361 goterms<-unlist(sapply(gomap, function(x) names(x)), use.names=FALSE) # all GO terms appearing in an annotation
362 goids <- toTable(GOTERM)
363 # all go terms which belong to the corresponding category..
364 goids <- unique(goids[goids[,"Ontology"] == wh_ont, "go_id"])
365 gocount <- table(goterms)
366 goname <- names(gocount) #goid of specific organism and selected category.
367 ## ensure goterms not appearing in the specific annotation have 0 frequency..
368 go.diff <- setdiff(goids, goname)
369 m <- double(length(go.diff))
370 names(m) <- go.diff
371 gocount <- as.vector(gocount)
372 names(gocount) <- goname
373 gocount <- c(gocount, m)
374
375 Offsprings.name <- switch(wh_ont,
376 MF = "MFOffsprings",
377 BP = "BPOffsprings",
378 CC = "CCOffsprings"
379 )
380 if (!exists(Offsprings.name, envir=GOSemSimEnv)) {
381 ygcGetOffsprings(wh_ont)
382 }
383 Offsprings <- get(Offsprings.name, envir=GOSemSimEnv)
384 cnt <- sapply(goids,function(x){ c=gocount[unlist(Offsprings[x])]; gocount[x]+sum(c[!is.na(c)])})
385 names(cnt) <- goids
386 IC<- -log(cnt/sum(gocount))
387 save(IC, file=paste(paste("Info_Contents", wh_ont, organism, sep="_"), ".rda", sep=""))
388 }
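The core of both listings is the same: count the annotations of each term together with those of all its offspring, then take the negative log of the relative frequency. A minimal sketch on made-up data, assuming a four-term ontology (the `direct` counts and the `offspring` list are hypothetical, not taken from any annotation package):

```r
# Direct annotation counts per term (hypothetical numbers).
direct <- c(root = 0, A = 2, B = 5, C = 3)

# All descendants of each term in the toy ontology.
offspring <- list(
  root = c("A", "B", "C"),
  A    = c("B", "C"),
  B    = character(0),
  C    = character(0)
)

# A term's count is its own annotations plus those of all its offspring.
cumulative <- sapply(names(direct), function(term) {
  direct[[term]] + sum(direct[offspring[[term]]])
})

# Information content: rarer terms carry more information.
ic <- -log(cumulative / sum(direct))
ic
```

In this toy example the root and A both cover all 10 annotations and get an information content of 0, while the more specific terms B and C get log(2) and log(10/3).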
Another helper function GetGOMap in ppiPre:
308 GetGOMap <- function(organism="yeast") {
309 if(!exists("ppiPreEnv")) .initial()
310 CheckAnnotationPackage(organism) #download and install the packages
311 species <- switch(organism,
312 human = "Hs",
313 fly = "Dm",
314 mouse = "Mm",
315 rat = "Rn",
316 yeast = "Sc",
317 zebrafish = "Dr",
318 worm = "Ce",
319 arabidopsis = "At",
320 ecolik12 = "EcK12",
321 bovine = "Bt",
322 canine = "Cf",
323 anopheles = "Ag",
324 ecsakai = "EcSakai",
325 chicken = "Gg",
326 chimp = "Pt",
327 malaria = "Pf",
328 rhesus = "Mmu",
329 pig = "Ss",
330 xenopus = "Xl",
331 coelicolor = "Sco"
332 )
333
334 gomap <- switch(organism,
335 human = org.Hs.egGO,
336 fly = org.Dm.egGO,
337 mouse = org.Mm.egGO,
338 rat = org.Rn.egGO,
339 yeast = org.Sc.sgdGO,
340 zebrafish = org.Dr.egGO,
341 worm = org.Ce.egGO,
342 arabidopsis = org.At.tairGO,
343 ecoli = org.EcK12.egGO,
344 bovine = org.Bt.egGO,
345 canine = org.Cf.egGO,
346 anopheles = org.Ag.egGO,
347 ecsakai = org.EcSakai.egGO,
348 chicken = org.Gg.egGO,
349 chimp = org.Pt.egGO,
350 malaria = org.Pf.plasmoGO,
351 rhesus = org.Mmu.egGO,
352 pig = org.Ss.egGO,
353 xenopus = org.Xl.egGO,
354 coelicolor = org.Sco.egGO
355 )
356
357 assign(eval(species), gomap, envir=ppiPreEnv)
358 }
My ygcGetGOMap in GOSemSim 1.6.8:
100 ygcGetGOMap <- function(organism="human") {
101 if(!exists("GOSemSimEnv")) .initial()
102 ygcCheckAnnotationPackage(organism)
103 species <- switch(organism,
104 human = "Hs",
105 fly = "Dm",
106 mouse = "Mm",
107 rat = "Rn",
108 yeast = "Sc",
109 zebrafish = "Dr",
110 worm = "Ce",
111 arabidopsis = "At",
112 ecolik12 = "EcK12",
113 bovine = "Bt",
114 canine = "Cf",
115 anopheles = "Ag",
116 ecsakai = "EcSakai",
117 chicken = "Gg",
118 chimp = "Pt",
119 malaria = "Pf",
120 rhesus = "Mmu",
121 pig = "Ss",
122 xenopus = "Xl"
123 )
124 gomap <- switch(organism,
125 human = org.Hs.egGO,
126 fly = org.Dm.egGO,
127 mouse = org.Mm.egGO,
128 rat = org.Rn.egGO,
129 yeast = org.Sc.sgdGO,
130 zebrafish = org.Dr.egGO,
131 worm = org.Ce.egGO,
132 arabidopsis = org.At.tairGO,
133 ecolik12 = org.EcK12.egGO,
134 bovine = org.Bt.egGO,
135 canine = org.Cf.egGO,
136 anopheles = org.Ag.egGO,
137 ecsakai = org.EcSakai.egGO,
138 chicken = org.Gg.egGO,
139 chimp = org.Pt.egGO,
140 malaria = org.Pf.plasmoGO,
141 rhesus = org.Mmu.egGO,
142 pig = org.Ss.egGO,
143 xenopus = org.Xl.egGO
144 )
145 assign(eval(species), gomap, envir=GOSemSimEnv)
146 }
There are many other small helper functions that are identical. ppiPre copies most of the source code of GOSemSim. There are 862 lines in GOKEGGSims.r, of which only the following function, which deals with KEGG, is unrelated to GOSemSim.
10 KEGGSim <- function(protein1, protein2) # KEGG-based similarity of two proteins
11 {
12
13 if(!require("KEGG.db")){ stop("package KEGG.db is needed.")}
14 Pathway1 <- KEGG.db::KEGGEXTID2PATHID[[protein1]]
15 Pathway2 <- KEGG.db::KEGGEXTID2PATHID[[protein2]]
16 intersec <- length(na.omit(match(Pathway1, Pathway2)))
17 if(intersec==0)
18 sim<-0
19 else
20 sim<-intersec/(length(Pathway1)+length(Pathway2)-intersec)
21 return(sim)
22 }
This function is only about a dozen lines long, and it calculates the similarity by dividing the number of pathways the two proteins share by the total number of distinct pathways between them. The other lines in GOKEGGSims.r, more than 800 of them, were copied wholesale from GOSemSim. The other source files in ppiPre total fewer than 450 lines. About 2/3 of ppiPre was copied from GOSemSim.
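The ratio computed by KEGGSim is what is commonly called the Jaccard index. A minimal base R sketch, assuming plain character vectors of pathway identifiers instead of KEGG.db lookups (the function name `jaccard` and the identifiers are made up for illustration):

```r
# Jaccard index of two sets: size of the intersection over size of the union.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  n_union <- length(union(a, b))
  if (n_union == 0) return(0)  # two empty sets: define the similarity as 0
  length(intersect(a, b)) / n_union
}

# One shared pathway out of four distinct ones.
jaccard(c("p1", "p2"), c("p2", "p3", "p4"))
```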
The authors of ppiPre changed the function names and pretended it was their original work. They simply copied and pasted, taking credit for months of GOSemSim development. This really sucks.
After I found this issue, I added a "proper use of GOSemSim" statement to its GitHub page:
I am very glad that many people find GOSemSim useful; it has been cited 114 times (according to Google Scholar, August 2014).
Two R packages, BiSEp and tRanslatome, depend on GOSemSim, and three R packages, clusterProfiler, DOSE and Rcpi, import GOSemSim.
The SemDist package copies some source code from GOSemSim, with acknowledgement in its source code and documentation.
The ppiPre package copies a large amount of source code from GOSemSim without any acknowledgement in its source code or documentation, and its authors did not cite GOSemSim in their publication. This violates the terms of the open source license.
For R developers: if you find the functions provided in GOSemSim useful, please depend on or import GOSemSim. If you would like to copy and paste source code, you should acknowledge within the source code that it was copied or derived from GOSemSim, authored by Guangchuang Yu <guangchuangyu@gmail.com>, add GOSemSim to the Suggests field, include the following reference in the man files of the functions that were copied or derived from GOSemSim, and cite it in the vignettes.
\references{ Yu et al. (2010) GOSemSim: an R package for measuring semantic similarity among GO terms and gene products \emph{Bioinformatics} (Oxford, England), 26:7 976–978, April 2010. ISSN 1367-4803 \url{http://bioinformatics.oxfordjournals.org/cgi/content/abstract/26/7/976} PMID: 20179076 }
You are welcome to use GOSemSim in the way you like, but please cite it and give it the proper credit. I hope you can understand.
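As an illustration of the "depend on or import" route, a package could wrap GOSemSim instead of copying it. The following is only a sketch: the wrapper name `term_similarity` is hypothetical, the roxygen comments assume the package is documented with roxygen2, and the package's DESCRIPTION file would need to list GOSemSim under Imports. Extra arguments are passed through untouched because the remaining parameters of `goSim` depend on the installed GOSemSim version.

```r
#' Semantic similarity between two GO terms (hypothetical wrapper)
#'
#' Delegates the calculation to GOSemSim rather than copying its source.
#'
#' @param go1,go2 GO term identifiers.
#' @param ... Further arguments passed on to GOSemSim::goSim().
#' @references Yu et al. (2010) GOSemSim: an R package for measuring
#'   semantic similarity among GO terms and gene products.
#'   Bioinformatics, 26(7), 976-978.
#' @export
term_similarity <- function(go1, go2, ...) {
  # Calling through the namespace keeps the dependency explicit.
  GOSemSim::goSim(go1, go2, ...)
}
```

This way the credit and the citation stay attached to the code that does the work, and users automatically benefit from fixes in later GOSemSim releases.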