Website of Frank Rügheimer

ontoMap - utility collection for working with Gene Ontology relations and annotations

Download

Description

The ontoMap package comprises a selection of *nix shell command-line scripts that process specification files and annotations for the Gene Ontology (http://geneontology.org). Please note that all scripts were originally written to provide specific functionality within my own research projects and are made available without any claim of comprehensive coverage. They may lack obvious features typically implemented in general-purpose tools. For instance, not every tool will check inputs in detail or offer an option to filter terms by sub-ontology. More complex scripts provide documentation via command-line help (accessible by calling the script with the -? option). Additional information may be obtained from the comments within the scripts themselves, which describe the programs' functions and their operation.

I do not actively maintain these scripts to keep up with newer releases of the GO, though improvements and updates will be made at irregular intervals.

Quick Start

  1. Extract the archive file
  2. Check your installation for sed, grep, awk and the graphviz package available at http://graphviz.org/ (it is used for computing transitive reductions and for visualization)
  3. Change to the ontoMap directory and run the ./demo script for a short introduction
    If the demo fails:
    • Check for missing standard tools (e.g. those listed under (2))
    • Set write permission in the local directory (some scripts will need to create temporary files)
    • The scripts were tested with bash 4.1, but compatibility between different shells and shell versions may be an issue. In case of incompatibilities, minor changes to the syntax or defining an unsupported function will often restore functionality.
    • If these measures fail, try to contact me via email (see footer). I will try to fix bugs, but please understand that I usually do not have time to expand the feature set.
  4. Check the files in the /proc subdirectory to look at the processed results
  5. Optionally apply the included converter scripts to post-process output for graphviz and Cytoscape
  6. Apply the clean script to erase the files and subdirectories generated by the demo
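
The checks suggested in step 2 and in the troubleshooting list can be run in one go. The following is a minimal sketch, not part of the ontoMap package: the function name missing_tools is made up for illustration, and the assumption that the scripts need the Graphviz programs dot (layout) and tred (transitive reduction) should be verified against the script sources.

```shell
#!/usr/bin/env bash
# Print the name of every given command that is not found on the PATH.
# (missing_tools is a hypothetical helper, not part of ontoMap.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# sed, grep and awk are named in step 2; dot and tred ship with Graphviz.
# Assumption: these are the Graphviz programs the scripts rely on.
missing=$(missing_tools sed grep awk dot tred)
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
else
    echo "All required tools found."
fi

# Some scripts create temporary files, so the working directory must be writable.
[ -w . ] || echo "Current directory is not writable." >&2
```

Running this from the ontoMap directory before ./demo should narrow down the most common causes of a failing demo.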

Notes on moving files

There are some dependencies between scripts. The calling scripts usually look for implementations of auxiliary functions in the /scripts subdirectory of the current working directory. This generally works well if a central makefile is used to control data processing, image generation etc. within each project, so that modifications and new versions of datasets are (hopefully) integrated with just a few keystrokes. However, some of the more complex scripts will break when the directory structure is reorganized, unless the relative path to their respective support scripts is preserved. Alternatively, the respective path variables in the scripts can be adapted.
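
One way to make such a lookup tolerant of relocation is to read the support directory from an environment variable and fall back to ./scripts. The sketch below only illustrates the idea: ONTOMAP_SCRIPTS and resolve_support_script are hypothetical names, and the variables actually used differ from script to script.

```shell
#!/usr/bin/env bash
# Hypothetical names: ONTOMAP_SCRIPTS and resolve_support_script are not part
# of ontoMap; they only illustrate a configurable lookup directory.

# Print the path of a support script, or fail if it does not exist.
resolve_support_script() {
    # Default mirrors the behaviour described above: ./scripts relative to
    # the current working directory.
    local dir="${ONTOMAP_SCRIPTS:-./scripts}"
    local path="$dir/$1"
    if [ -f "$path" ]; then
        printf '%s\n' "$path"
    else
        printf 'Support script not found: %s\n' "$path" >&2
        return 1
    fi
}
```

A calling script would then export ONTOMAP_SCRIPTS once (for example in the project makefile) and locate each helper through the function instead of a hard-coded relative path.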

Copying

The scripts are distributed under the GNU LESSER GENERAL PUBLIC LICENSE (see file lgpl-2.1.txt). Data files used in the demonstration originate from third parties (listed below):

.obo files have been generated within the Gene Ontology project (http://geneontology.org). The files follow version 1.2 of the .obo format, were downloaded on August 17, 2010 and have been renamed for convenience.

The file gene_association.cgd_2010-08-17 (GO annotation file) originates from the Candida Genome Database (http://www.candidagenome.org). It was downloaded from the website of the Gene Ontology project on August 17, 2010 and has been renamed for convenience.

Please consult file headers for copyright details and contact information regarding these data.