Differences between revisions 3 and 4
Revision 3 as of 2007-04-14 02:02:19
Size: 2883
Editor: AlexPico
Comment:
Revision 4 as of 2007-04-14 02:11:51
Size: 4152
Editor: AlexPico
Comment:
== How to Comment ==

To view or add comments, click on any of the 'Comment' links below. Adding your ideas directly to the Wiki lets us organize everyone's ideas more easily and keep clear records. Be sure to include today's date and your name with each comment. Here is an example to get things started: ["/Comment"].

'''Try to keep your comments as concrete and constructive as possible. For example, if you find a part of the RFC makes no sense, please say so, but don't stop there. Take the extra step and propose alternatives.'''

RFC Name : Web Services - ID Mapping

Editor(s): Sarah, Ethan, Alex

About this document

This is an official Request for Comment (RFC) for Web Services - ID Mapping.

For details on RFCs in general, check out the [http://www.answers.com/main/ntquery?method=4&dsid=2222&dekey=Request+for+Comments&gwp=8&curtab=2222_1&linktext=Request%20for%20Comments Wikipedia Entry: Request for Comments (RFCs)]

Status

April 13, 2007

Not yet completely written

Proposal

Enhance Cytoscape’s data connectivity by defining a basic API for a web service that multiple groups implement, as a test case for further development of web service/database connectivity in Cytoscape.

The test case is an ID mapping/translation service.
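
As a starting point for discussion, a basic API of this kind could be as small as two operations: list the identifier types a service supports, and translate a batch of IDs from one type to another. The sketch below is only an illustration. The names `IdMappingService`, `supported_types`, and `translate`, and the placeholder identifiers, are assumptions and are not taken from any existing service.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Set, Tuple


class IdMappingService(ABC):
    """Hypothetical minimal interface that each group could implement."""

    @abstractmethod
    def supported_types(self) -> Set[str]:
        """Return the identifier types this service knows about."""

    @abstractmethod
    def translate(self, ids: Iterable[str], source_type: str,
                  target_type: str) -> Dict[str, List[str]]:
        """Map each input ID to zero or more IDs of the target type."""


class InMemoryIdMappingService(IdMappingService):
    """Toy implementation backed by a dictionary, for illustration only."""

    def __init__(self, table: Dict[Tuple[str, str], Dict[str, List[str]]]):
        # table: {(source_type, target_type): {source_id: [target_id, ...]}}
        self._table = table

    def supported_types(self) -> Set[str]:
        types: Set[str] = set()
        for source_type, target_type in self._table:
            types.update((source_type, target_type))
        return types

    def translate(self, ids, source_type, target_type):
        mapping = self._table.get((source_type, target_type), {})
        # Unknown IDs map to an empty list rather than raising an error.
        return {i: list(mapping.get(i, [])) for i in ids}


# Placeholder identifiers, not real accessions.
service = InMemoryIdMappingService(
    {("gene", "probe"): {"GENE_A": ["PROBE_1", "PROBE_2"]}}
)
result = service.translate(["GENE_A", "GENE_X"], "gene", "probe")
```

Returning a list per input ID keeps one-to-many mappings (for example, one gene to several probe sets) visible to the caller, and an empty list distinguishes "no mapping found" from a failure.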

Biological Questions / Use Cases

  • Biologists often need to translate one id to another to connect various pieces of data.
  • The mappings are available, but are not currently easy to access from within Cytoscape.
  • Additional Use cases from [http://baderlab.org/IdentifierMapping Bader Lab ID Mapping page]:

    1. Unification during dataset merging: During a merge operation e.g. of two protein-protein interaction datasets from independently created databases, it is vital to recognize that two protein objects, one from each data source, represent the same protein molecule, even if the protein objects don’t share any database accession numbers. Unification requires knowledge of record type e.g. you cannot reliably use a gene ID to unify proteins (mostly because splice variants exist).
    2. Link out to related references: When presenting information about a protein to a user on a web page, it is useful to display links to related information about the protein, such as further information about the protein sequence and sequence feature annotations (e.g. in UniProt), Gene Ontology annotations, domain annotations (InterPro), etc.
    3. Identifier translation: Some analysis methods require specific translations from one set of identifiers to another. For instance, our ‘activity centers’ analysis requires translation from protein or gene identifiers in a pathway database to Affymetrix probe set identifiers or other gene expression array platform identifiers.
    4. Searching for a favorite gene name: Preferred gene names used for querying a pathway database should return all genes/proteins with that name, if they exist in the database. Unlike database accession numbers, gene names are not guaranteed unique, thus cannot reliably be used for the other use cases.
    5. Special case of identifier translation between species via orthology links.
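
Use cases 1 and 4 above can be sketched in a few lines. The example assumes a lookup table (`to_canonical`, hypothetical) that maps a record type and accession to a shared canonical ID. Records are unified only when both the record type and the canonical ID agree, which is why a gene ID never unifies two proteins. The `Record` fields and all identifiers are placeholders invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Record(NamedTuple):
    source: str       # dataset the record came from
    record_type: str  # e.g. "protein" or "gene"
    accession: str    # database accession number
    name: str         # preferred gene name (not guaranteed unique)


def unify(records: List[Record],
          to_canonical: Dict[Tuple[str, str], str]
          ) -> Dict[Tuple[str, str], List[Record]]:
    """Group records that map to the same canonical ID within one record type."""
    groups: Dict[Tuple[str, str], List[Record]] = defaultdict(list)
    for rec in records:
        canonical = to_canonical.get((rec.record_type, rec.accession))
        if canonical is not None:
            # Keying on record_type keeps genes and proteins apart.
            groups[(rec.record_type, canonical)].append(rec)
    return dict(groups)


def search_by_name(records: List[Record], name: str) -> List[Record]:
    """Return every record with the given name, since names may collide."""
    wanted = name.lower()
    return [rec for rec in records if rec.name.lower() == wanted]


records = [
    Record("dbA", "protein", "A:001", "abc1"),
    Record("dbB", "protein", "B:777", "ABC1"),
    Record("dbB", "gene", "G:42", "ABC1"),
]
to_canonical = {
    ("protein", "A:001"): "PROT_1",
    ("protein", "B:777"): "PROT_1",
    ("gene", "G:42"): "GENE_1",
}
groups = unify(records, to_canonical)
hits = search_by_name(records, "abc1")
```

Here the two protein records share no accession yet land in the same group, while the name search returns all three records, which illustrates why names alone are unsuitable for unification.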

General Notes

Requirements

Deferred Items

Open Issues

Backward Compatibility

Expected growth and plan for growth

References

ISB’s Current ID Service:

  • attachment:ISB_SynonymService.doc - description of service/methods

  • attachment:ISB_Mammalian_WSDL.xml - example wsdl from our mammalian service

Note that while it is implemented in SOAP, we get REST along with it. Any of the described methods can be called.
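
To make the REST side concrete, a call could be expressed as a plain URL with the method name in the path and the arguments in the query string. This is only a sketch under assumptions: the host, the method name `getSynonyms`, the parameter names, and the URL layout are all hypothetical and are not taken from the ISB service description.

```python
from urllib.parse import urlencode


def build_rest_url(base_url: str, method: str, **params: str) -> str:
    """Build a REST-style URL: <base>/<method>?<sorted query parameters>."""
    # Sorting makes the URL deterministic, which helps with caching and tests.
    query = urlencode(sorted(params.items()))
    return f"{base_url.rstrip('/')}/{method}?{query}"


# Hypothetical endpoint, method, and parameters, for illustration only.
url = build_rest_url(
    "http://example.org/synonym/", "getSynonyms", id="GENE_A", type="gene"
)
```

A client that only needs simple lookups could fetch such a URL directly, without generating SOAP stubs from the WSDL.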

ISB prototype UDDI/ID mapping plugin: our endpoints are behind a firewall, but the plugin at least provides some example code (very alpha).

Services to look at for reference points:

Bader Lab's ID Mapping Use Cases and References: http://baderlab.org/IdentifierMapping

Implementation Plan

  1. Alex, Sarah, and Ethan will work together towards a common web service API for ID mapping
  2. Implement a discovery/use plugin prototype

Comments

WebServicesIDMapping (last edited 2009-02-12 01:03:35 by localhost)

Funding for Cytoscape is provided by a federal grant from the U.S. National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (NIH) under award number GM070743-01. Corporate funding is provided through a contract from Unilever PLC.
