RFC Name : ... |
Editor(s): ... |
About this document
This is an official Request for Comment (RFC) for a redesigned BioDataServer (import of ontologies, annotations, and other attributes).
For details on RFCs in general, check out the [http://www.answers.com/main/ntquery?method=4&dsid=2222&dekey=Request+for+Comments&gwp=8&curtab=2222_1&linktext=Request%20for%20Comments Wikipedia Entry: Request for Comments (RFCs)]
Status
7/20/2006: Open for public comment.
How to Comment
To view or add comments, click on any of the 'Comment' links below. By adding your ideas to the Wiki directly, we can more easily organize everyone's ideas and keep clear records. Be sure to include today's date and your name with each comment. Here is an example to get things started: ["/Comment"].
Try to keep your comments as concrete and constructive as possible. For example, if you find a part of the RFC makes no sense, please say so, but don't stop there. Take the extra step and propose alternatives.
Proposal
The BioDataServer class has been used to import ontologies, annotations, and synonyms. Its constructor takes the location of a manifest file and loads data from the individual data sources (annotation files, ontology file, and synonym file) listed in that manifest. The current problems are the following:
- The file format used in the manifest file is out of date.
- New file formats (OBO and Gene Association) are converted into the old format before loading.
- Many entries in the new file formats are lost in the conversion process.
- The imported ontologies are stored in a huge map, not in a DAG, which is the original data structure of GO.
- Because GO terms are imported as a huge map, the result makes little sense to many biologists.
- The name mapping service is not sophisticated.
GO annotations are mapped based on levels, which does not make sense to biologists.
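One way to see why level-based mapping is ambiguous: in a DAG a term can have more than one parent, so its "level" depends on which path to the root is followed. The sketch below is only an illustration. It uses a made-up four-term hierarchy (the term names are hypothetical, not real GO terms) and collects every depth at which a term can be reached.

```python
# Hypothetical toy DAG: child -> list of parents. Names are illustrative only.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["A"],
    "C": ["root", "B"],  # C has two parents that sit at different depths
}


def depths(term, parents=PARENTS):
    """Return the set of path lengths from `term` up to a root term."""
    if not parents[term]:
        return {0}  # a term without parents is a root
    # One entry per distinct path length through each parent.
    return {d + 1 for p in parents[term] for d in depths(p, parents)}


print(depths("B"))  # a single depth: B has only one path to the root
print(depths("C"))  # two depths: C is reachable directly and via A -> B
```

Here `C` has no single level, so any mapping keyed on "the level of a term" has to pick one of several equally valid answers.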
To solve the problems above, the new BioDataServer should support the following:
- Import everything in the OBO and Gene Association files.
Use the CyNetwork class to store GO's DAG (like the BiNGO plugin).
- Import more general attribute files, not only GO.
Support more general attributes. (We should probably change the name to AttributeServer.)
Convert attributes directly into the CyAttributes data structure, and avoid redundant data.
- Support connections to MySQL (a popular database in life science projects).
- Support XML attribute import based on XQuery or another lightweight library.
- Remote and local data source support.
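As a rough illustration of "import everything" for OBO files, the sketch below reads OBO text into stanzas and keeps every tag-value pair, including repeated tags such as `is_a`, instead of converting to an older format. It is a minimal sketch under assumptions: the function name `parse_obo` and the sample term IDs (`EX:...`) are hypothetical, trailing `!` comments are stripped naively, and escaping, quoting, and trailing modifiers are not handled.

```python
from collections import defaultdict


def parse_obo(text):
    """Parse OBO text into a list of (stanza_type, {tag: [values]}) pairs.

    The header is kept as a pseudo-stanza of type "header" so that no
    tag-value pair from the file is dropped.
    """
    tags = defaultdict(list)
    stanzas = [("header", tags)]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and whole-line comments
        if line.startswith("[") and line.endswith("]"):
            # A new stanza such as [Term] or [Typedef] begins.
            tags = defaultdict(list)
            stanzas.append((line[1:-1], tags))
            continue
        tag, _, value = line.partition(":")
        # Naive removal of a trailing "! comment" (assumption: no "!" in values).
        value = value.split(" ! ", 1)[0].strip()
        tags[tag.strip()].append(value)
    return stanzas


# Hypothetical sample data; the IDs and names are made up for illustration.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example child term
is_a: EX:0000002 ! first parent
is_a: EX:0000003 ! second parent
"""

for stanza_type, stanza_tags in parse_obo(SAMPLE):
    print(stanza_type, dict(stanza_tags))
```

Keeping values as lists per tag is what lets a term with several `is_a` lines retain all of its parents, which is the information needed to rebuild the DAG.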
Biological Questions / Use Cases
- An ontology can be anything with a tree or DAG data structure, since there are many ontologies in the life sciences.
A list of ontologies is available in [http://obo.sourceforge.net/browse.html# OBO format].
General Notes
There are several open-source applications/libraries which can be used in this project:
[http://www.cb.k.u-tokyo.ac.jp/aritalab/oka/software.html GO Viewer]
[http://www.psb.ugent.be/cbd/papers/BiNGO/ BiNGO Plugin]
Requirements
Deferred Items
Open Issues
- Visualization of and interaction with a large ontology (like the full-size GO)
Backward Compatibility
Expected growth and plan for growth
References
Implementation Plan
- Use an incremental approach.
- This project can be divided into two parts.
- New Ontology Server design (data representation)
- UI for browsing/searching ontologies
- For 2.4, we will focus on the first part.
The new ontology model will be based on [http://www.biojava.org/docs/api14/org/biojava/ontology/package-summary.html BioJava's Ontology interface].
- The Cytoscape version of the Ontology class is an implementation of the Ontology interface.
This class uses CyNetwork as the data storage that represents the ontology DAG.
This special CyNetwork will be stored in a different RootGraph.
Information belonging to an ontology term (such as references, descriptions, etc.) will be stored in CyAttributes.
Since this new Ontology class uses CyNetwork and CyAttributes as its backend, we can apply all of Cytoscape's functionality to this ontology DAG. The picture below is an example of GO DAG visualization:
- Write a reader that loads all entries in GO
- Integrate ID Mapper
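The split described above, with graph structure in one place and term details in another, can be sketched independently of Cytoscape. The class below is a hypothetical stand-in, not the CyNetwork, CyAttributes, or BioJava API: `OntologyDag`, its method names, and the `EX:` term IDs are all assumptions made for illustration.

```python
class OntologyDag:
    """Toy model of the proposed design: structure and term details kept apart."""

    def __init__(self):
        # term id -> set of parent ids (plays the role of the CyNetwork)
        self.parents = {}
        # term id -> {attribute name: value} (plays the role of CyAttributes)
        self.attributes = {}

    def add_term(self, term_id, **attrs):
        """Register a term and merge any descriptive attributes for it."""
        self.parents.setdefault(term_id, set())
        self.attributes.setdefault(term_id, {}).update(attrs)

    def add_is_a(self, child, parent):
        """Add a child -> parent edge, creating either term if needed."""
        self.add_term(child)
        self.add_term(parent)
        self.parents[child].add(parent)

    def ancestors(self, term_id):
        """Return every term reachable by following parent edges upward."""
        seen = set()
        stack = [term_id]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


dag = OntologyDag()
dag.add_term("EX:1", name="root process", description="hypothetical root")
dag.add_is_a("EX:2", "EX:1")
dag.add_is_a("EX:3", "EX:2")
dag.add_is_a("EX:3", "EX:1")  # a second parent, which a tree could not express

print(sorted(dag.ancestors("EX:3")))
print(dag.attributes["EX:1"]["name"])
```

Because the structure is an ordinary graph, generic graph operations (traversal, layout, selection) apply to it unchanged, which is the benefit the plan expects from reusing CyNetwork.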
Current Implementation
How to use
- Import Ontology
- Map Annotation
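As a rough idea of what the annotation mapping step involves, the sketch below reads Gene Association text (tab-separated, with comment lines starting with `!`) and groups ontology term IDs by gene symbol. It is a sketch under assumptions, not the planned implementation: `map_annotations` is a hypothetical name, the sample rows are made up, and only the object symbol (third column) and term ID (fifth column) are used.

```python
from collections import defaultdict


def map_annotations(gaf_text, key_column=2, term_column=4):
    """Return {object key: sorted list of term ids}; column indices are 0-based."""
    mapping = defaultdict(set)
    for line in gaf_text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) <= max(key_column, term_column):
            continue  # skip malformed rows rather than guessing
        mapping[cols[key_column]].add(cols[term_column])
    return {key: sorted(terms) for key, terms in mapping.items()}


# Hypothetical rows: DB, object ID, symbol, qualifier, term ID, reference,
# evidence code, with/from, aspect. Values are made up for illustration.
ROWS = [
    ["EXDB", "X0001", "geneA", "", "EX:0000001", "EXREF:1", "IEA", "", "P"],
    ["EXDB", "X0001", "geneA", "", "EX:0000002", "EXREF:2", "IEA", "", "P"],
    ["EXDB", "X0002", "geneB", "", "EX:0000001", "EXREF:3", "IEA", "", "P"],
]
GAF_SAMPLE = "! hypothetical sample\n" + "\n".join("\t".join(r) for r in ROWS)

print(map_annotations(GAF_SAMPLE))
```

The resulting dictionary has the shape of a list-valued node attribute, so each gene keeps all of its annotated terms rather than one term per level.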