RFC Name : CyNode identifier |
Editor(s): Ben Gross |
About this document
This is an official Request for Comments (RFC) for CyNode identification.
For details on RFCs in general, check out the [http://www.answers.com/main/ntquery?method=4&dsid=2222&dekey=Request+for+Comments&gwp=8&curtab=2222_1&linktext=Request%20for%20Comments Wikipedia Entry: Request for Comments (RFCs)]
Status
This RFC is still under construction and open for public comment. (01/17/06 -Ben)
How to Comment
To view or add comments, click on any of the 'Comment' links below. By adding your ideas to the Wiki directly, we can more easily organize everyone's ideas and keep clear records. Be sure to include today's date and your name with each comment. Here is an example to get things started: ["/Comment"].
Try to keep your comments as concrete and constructive as possible. For example, if you find a part of the RFC makes no sense, please say so, but don't stop there. Take the extra step and propose alternatives.
Proposal
Switch to a numbered node system in place of the current system, in which a string serves as the node ID. There should be a clear distinction between a node's ID and its label.
- Remove m_identifier string from CyNode
- CyNode unique ID string is generated by CyNode
- User gets unique ID string for CyNode from CyNode
- Split Cytoscape.getCyNode(String alias, boolean create) into createCyNode(..) and getCyNode(String uid)
- Add LABEL attribute as a default node attribute
- CyAttributes is keyed using the unique ID String generated by the respective CyNode
- Any method that takes a string ID to identify a CyNode should instead take a CyNode
- Remove CANONICAL_NAME and COMMON_NAME ["/Comment"]
- Only use graph objects, not node indices, as parameters to methods, e.g. int[] getAdjacentEdges(Node node);
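The proposal above could look roughly like the following sketch. It is not the Cytoscape API: the classes NodeIdSketch, Node, and Registry and all of their methods are hypothetical stand-ins for CyNode, the Cytoscape facade, and CyAttributes, and the counter-based ID (a string parsable as a number) is only one possible way to generate unique IDs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the proposed node API; these are not the real Cytoscape classes. */
class NodeIdSketch {

    /** Stand-in for CyNode: the node generates its own unique ID string. */
    static final class Node {
        // Assumption: a process-wide counter is enough to make IDs unique.
        private static final AtomicLong NEXT_ID = new AtomicLong();
        private final String uid = Long.toString(NEXT_ID.incrementAndGet());

        String getUid() { return uid; }
    }

    /** Stand-in for the Cytoscape facade plus CyAttributes, keyed by node unique ID. */
    static final class Registry {
        static final String LABEL = "LABEL";
        private final Map<String, Node> nodesByUid = new HashMap<>();
        private final Map<String, Map<String, String>> attributes = new HashMap<>();

        /** Always creates a new node; the label is optional and need not be unique. */
        Node createNode(String label) {
            Node node = new Node();
            nodesByUid.put(node.getUid(), node);
            if (label != null) {
                setAttribute(node, LABEL, label);
            }
            return node;
        }

        /** Pure lookup by unique ID; returns null if the ID is unknown and never creates a node. */
        Node getNode(String uid) {
            return nodesByUid.get(uid);
        }

        /** Methods take the node object rather than a string identifier. */
        void setAttribute(Node node, String name, String value) {
            attributes.computeIfAbsent(node.getUid(), k -> new HashMap<>()).put(name, value);
        }

        String getAttribute(Node node, String name) {
            Map<String, String> attrs = attributes.get(node.getUid());
            return attrs == null ? null : attrs.get(name);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Node a = registry.createNode("TP53");
        Node b = registry.createNode("TP53"); // same label, distinct node
        Node c = registry.createNode(null);   // unlabeled node
        System.out.println(a.getUid().equals(b.getUid()));            // false
        System.out.println(registry.getAttribute(b, Registry.LABEL)); // TP53
        System.out.println(registry.getAttribute(c, Registry.LABEL)); // null
        System.out.println(registry.getNode(a.getUid()) == a);        // true
    }
}
```

Separating creation from lookup means a lookup with an unknown ID cannot silently create a node, which is the ambiguity the boolean create flag carries today.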
Biological Questions / Use Cases
Graph Editing:
- Ability to manipulate nodes that have not been labeled.
- Ability to have two or more unique nodes with the same label.
General Notes
In this RFC, the term "label" or "node label" has been used in place of the more historic term "name" or "node name".
Requirements
- Cytoscape nodes have a unique identifier (ID).
- Cytoscape nodes have a standard way of attaching a string label.
- Cytoscape must be able to import from and export to SIF and GML.
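As an illustration of the SIF requirement, here is a minimal round-trip sketch. It assumes the usual SIF line shape (source, interaction type, then one or more targets), splits on any whitespace as a simplification, and treats a repeated label within one file as the same node. The class SifRoundTripSketch and its methods are hypothetical and are not the Cytoscape SIF reader.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: SIF import/export where node IDs are generated and labels are only labels. */
class SifRoundTripSketch {

    private static final class Edge {
        final int source;
        final String type;
        final int target;

        Edge(int source, String type, int target) {
            this.source = source;
            this.type = type;
            this.target = target;
        }
    }

    private final Map<String, Integer> idByLabel = new LinkedHashMap<>();
    private final List<String> labelById = new ArrayList<>();
    private final List<Edge> edges = new ArrayList<>();
    private final Set<Integer> connected = new HashSet<>();

    /** Reuses the node for a label already seen during import, otherwise assigns the next ID. */
    private int nodeFor(String label) {
        Integer id = idByLabel.get(label);
        if (id == null) {
            id = labelById.size();
            labelById.add(label);
            idByLabel.put(label, id);
        }
        return id;
    }

    /** Parses lines of the form "source type target [target ...]"; a lone label is an isolated node. */
    void importSif(String sif) {
        for (String line : sif.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] tokens = trimmed.split("\\s+");
            int source = nodeFor(tokens[0]);
            for (int i = 2; i < tokens.length; i++) {
                int target = nodeFor(tokens[i]);
                edges.add(new Edge(source, tokens[1], target));
                connected.add(source);
                connected.add(target);
            }
        }
    }

    int nodeCount() {
        return labelById.size();
    }

    /** Writes one edge per line using labels, then any nodes that have no edges. */
    String exportSif() {
        StringBuilder out = new StringBuilder();
        for (Edge e : edges) {
            out.append(labelById.get(e.source)).append(' ')
               .append(e.type).append(' ')
               .append(labelById.get(e.target)).append('\n');
        }
        for (int id = 0; id < labelById.size(); id++) {
            if (!connected.contains(id)) {
                out.append(labelById.get(id)).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SifRoundTripSketch net = new SifRoundTripSketch();
        net.importSif("A pp B C\nB pd A\nD\n");
        System.out.println(net.nodeCount()); // 4
        System.out.print(net.exportSif());
    }
}
```

Because SIF identifies nodes only by the string written in the file, two distinct nodes that share a label would merge on re-import. Any export under the proposed scheme would have to address that case.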
Analysis
Cytoscape subsystems and their node identifier semantics
- CyNetwork: Node ID in CyNode is a unique string (a unique integer, the GINY root graph index, is also maintained and can be accessed by users). Semantics: the unique string is often expected to encode a gene name. Gene names are known not to be unique. This is a semantic conflict.
- CyAttributes: A MultiHashMap using a unique string as a key.
- SIF: Node ID is a unique string, often expected to encode a gene name. This is mapped to the CyNode unique string ID. This is a semantic conflict.
- GML: Node ID is a unique integer. A string label attribute is available per node and does not have to be unique. This string label is mapped to the CyNode unique string ID. This is a datatype conflict.
- Node attributes file format: same semantics as SIF.
- Expression data file format: same semantics as SIF, but more often used to store expression IDs, such as Affymetrix probeset IDs, while SIF often stores gene names. SIF and expression data IDs are difficult to map to each other.
- Attribute browser: currently displays CyNode string ID, canonical name, common name, and aliases (these are often the same string, i.e. duplicated data).
- Merge plugin: merges based on CyNode string ID. This only works for networks where CyNode string IDs are part of the same 'namespace'.
- CyEdge: uses 2 CyNode string IDs as part of its ID.
- BioDataServer synonym table: formerly used for mapping IDs, e.g. between SIF and expression data IDs. It is still part of the code but is currently non-functional.
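The GML datatype conflict noted above can be illustrated with a small sketch. It assumes that mapping a non-unique label onto a unique string ID makes same-labeled nodes collide. The class and method names and the gene labels are hypothetical, and the code does not reproduce the actual GML reader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of what happens when a non-unique GML label is used as the node key. */
class GmlLabelCollisionSketch {

    /** Label used as the unique string ID: nodes that share a label overwrite each other. */
    static Map<String, Integer> keyByLabel(int[] gmlIds, String[] labels) {
        Map<String, Integer> nodes = new LinkedHashMap<>();
        for (int i = 0; i < gmlIds.length; i++) {
            nodes.put(labels[i], gmlIds[i]);
        }
        return nodes;
    }

    /** Unique ID used as the key and the label kept as a plain attribute: every node survives. */
    static Map<Integer, String> keyById(int[] gmlIds, String[] labels) {
        Map<Integer, String> nodes = new LinkedHashMap<>();
        for (int i = 0; i < gmlIds.length; i++) {
            nodes.put(gmlIds[i], labels[i]);
        }
        return nodes;
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        String[] labels = {"geneA", "geneA", "geneB"}; // two nodes share a label
        System.out.println(keyByLabel(ids, labels).size()); // 2
        System.out.println(keyById(ids, labels).size());    // 3
    }
}
```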
Open Issues
Unique ID generation:
cytoscape.graph.dynamic.util.D cytoscape.giny.C nodes are created/retrieved through Cytoscape.getCyNode(), using cytoscape.giny.C cytoscape.data.readers.I cytoscape.editor.C
attachment:CyNodeObjectModel.png
SIF File Format:
- What effect does this have on the new Cytoscape session saving subsystem?
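Edges:
- Should edges have an ID, or is a nodeID-edgeType-nodeID good enough to uniquely identify edges?
GraphMerge:
- Does graph merge depend on a unique string ID? Does it assume all identical nodes have the same root graph ID?
Backward Compatibility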
Importing/Exporting:
Current Implementation Notes (2.2)
DynamicGraphRepresentation.nodeCreate() creates a unique node ID as an integer.
Implementation Plan