Summary
After the Cytoscape retreat held in San Diego in 2005, it became obvious that Cytoscape needs to support metanodes. A metanode is a graph node that contains a subgraph. It is a mechanism to group nodes in a graph, and it does not concern the criteria followed to determine membership of a group. There are many different ways of modeling and visualizing metanodes, depending on the biological application in question.
The objectives of the metanodes group are to:
- Clearly identify the biological applications for metanodes
- Clearly define what type of metanode visualization and modeling is needed for each one of these applications
- Come up with a project plan to implement these different ways of visualizing/modeling metanodes in Cytoscape
This RFC is a forum for discussing each one of these objectives.
Iliana Avila developed a Cytoscape plugin that implements one possible way of modeling and visualizing metanodes. This plugin can serve as a concept plugin, so that everyone involved has a clearer idea of what a metanode is and how it can be used. It is by no means the final implementation of metanodes. See the "Concept Plugin" section in this Wiki to learn how to obtain and use the plugin.
Biological Applications and their MetaNode Needs
Please add your biological application, and a possible metanode solution.
1. Biomodules
Biological application: Group proteins in a graph of protein-protein interactions that have a collective function in the cell (http://www.genome.org/cgi/content/abstract/14/3/380) in order to discern higher levels of organization in the biological network
Metanode solution: A group of proteins can be visualized by a single node that has visual and topological characteristics that reflect the underlying group of proteins. For example, the size of the node is proportional to the number of proteins it represents, its connections to other proteins reflect connections from its inside proteins to other proteins, its color represents the average expression levels of its members for a certain condition, etc. See the image in http://labs.systemsbiology.net/galitski/projs/biomodules/index.html. Round nodes are metanodes.
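The collapsing described above can be sketched in a few lines. This is a minimal illustration in Python, not the plugin's code or the Cytoscape API: the function name `collapse`, the edge-set representation, and the attribute names `size` and `mean_expression` are assumptions made for the example.

```python
from statistics import mean

def collapse(edges, expression, members, name):
    """Summarize `members` as a single metanode called `name`.

    edges: set of undirected (u, v) pairs; expression: node -> level.
    Returns (metanode attributes, edges of the collapsed graph).
    All names here are hypothetical, chosen for illustration.
    """
    members = set(members)
    collapsed_edges = set()
    for u, v in edges:
        if u in members and v in members:
            continue  # edge is internal to the group, hidden when collapsed
        # Redirect any endpoint inside the group to the metanode.
        u2 = name if u in members else u
        v2 = name if v in members else v
        collapsed_edges.add(tuple(sorted((u2, v2))))
    attrs = {
        "isMetanode": True,
        "size": len(members),  # drives a node size proportional to member count
        "mean_expression": mean(expression[m] for m in members),  # drives color
    }
    return attrs, collapsed_edges

edges = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")}
expression = {"A": 1.0, "B": 3.0, "C": 0.5, "D": 2.0}
attrs, collapsed = collapse(edges, expression, ["A", "B"], "M1")
```

Here the metanode M1 inherits the connections of A and B to the outside proteins C and D, while the A-B edge disappears into the group.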
2. Protein Complexes - Pico/GenMAPP (note by Cline/Pasteur)
- Biological application: Group proteins in a pathway that are known to form complexes in order to simplify visualization and store known associations in the data model.
Metanode solution: Ideally, there could be two views of protein complexes. (1) A collapsed view, similar to that used in Biomodules above, but with a default size (not scaled by number of members) and a label that is unique to the metanode (e.g., PKA complex). (2) A stacked view, where all the child nodes are visible and simply stacked (like gene boxes in GenMAPP).
- Extensions: Note that the solution should also fit protein domains, since the particular boundaries between protein domains in a single chain and between proteins in a complex are rather arbitrary, a matter of evolutionary fate. The solution should also extend to the grouping of paralogs and splice variants.
- Further note (Cline/Pasteur): these extensions become especially interesting, now that there are high-throughput platforms to measure separate expression levels of genomic features. Biologically, the likelihood of a given interaction will depend on the isoforms produced in the cell, and whether or not the protein features involved in the interactions are expressed.
- One metanode implementation is for each child node to represent a different component of the gene or protein - an exon or a protein domain. Where the right data is available, interactions would be tied to the components involved in the interaction - much as is done now in the Domain Network plugin.
- A second metanode implementation would have each child node represent a protein isoform of the parent. Again, where the right data is available, interactions would be associated with the isoforms that can interact. Here, some thought should go into how to handle the edges. If one metanode in an interaction has N child nodes, representing N protein isoforms, and the second has M nodes representing M isoforms, having up to N*M different edges represents a lot of complexity.
Note: The stacked view is not merely a visualization problem that can be solved by showing different sections within a single node (coloring them differently, etc.). This is because we wish to have edges connected to each individual component of the stack. For this reason, this is indeed a biological application that can be solved using metanodes.
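To make the edge bookkeeping for child nodes concrete, here is a small Python sketch. It compares the worst case of N*M edges between the children of two metanodes with the case where interaction data restricts the edges to the pairs known to interact. The function `child_edges`, the isoform names, and the `known_pairs` parameter are all hypothetical.

```python
from itertools import product

def child_edges(children_a, children_b, known_pairs=None):
    """Edges between the child nodes of two metanodes.

    With no interaction data, every pairing is possible (N * M edges).
    With `known_pairs`, only pairings supported by data are kept.
    """
    all_pairs = list(product(children_a, children_b))
    if known_pairs is None:
        return all_pairs
    return [pair for pair in all_pairs if pair in known_pairs]

iso_a = ["A.1", "A.2", "A.3"]  # N = 3 hypothetical isoforms
iso_b = ["B.1", "B.2"]         # M = 2 hypothetical isoforms
worst_case = child_edges(iso_a, iso_b)
supported = child_edges(iso_a, iso_b, known_pairs={("A.1", "B.2")})
```

With three and two isoforms the worst case is six edges, whereas a single documented interaction leaves one edge attached to the specific child nodes involved.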
3. Intragenic Features - Pico/GenMAPP
- Biological application: Associate features such as exon structure, promoter regions and SNP positions with proteins in a pathway. These features are quickly becoming the preferred level of abstraction for microarray analysis and other high-throughput methods. We must be able to translate these massive datasets into biological context (i.e., pathways) in an efficient manner.
Metanode solution: By associating these feature-level nodes with a protein node in a parent-child relationship, we could efficiently map these data types to the biology at the pathway level. These associations might be best viewed as collapsed nodes colored by specified algorithms that consider the data type (e.g., a splicing analysis on all exon data mapped to the whole gene). And instead of expanding the node on the same network, perhaps we could restrict the expansion to a new network (like a small pop-up window) that displays the feature-level nodes and direct data mapping, e.g., expression level for each exon associated with the protein. See attached image for an illustration of the pop-up window.
- Note: This type of metanode seems to be qualitatively different and might require separate terminology. Then, again, maybe not?
- Note2: After a group discussion, it was decided that this biological application does not require a metanode solution. It can be solved by implementing a Cytoscape plugin that makes use of already existing Cytoscape functionality.
4. Boxing of groups.
- An additional desired way of visualizing groups of nodes is to box them. The box itself is not a node; it is just an enclosing area for a group that can be dragged around while the nodes move with it. In this case, the metanodes can be used as a mechanism to group nodes, to keep track of these groupings, and to modify these groupings (removing or adding members). But the metanode itself is not visualized. All of the biological applications above can be solved using this boxing visualization.
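The boxing idea comes down to two operations: computing an enclosing rectangle for the group, and shifting every member when the rectangle is dragged. The Python sketch below illustrates both under simple assumptions (2D coordinates per node, a fixed margin). The function names and the `margin` parameter are hypothetical and say nothing about how Cytoscape's renderer would do it.

```python
def bounding_box(positions, members, margin=10.0):
    """Return (min_x, min_y, max_x, max_y) enclosing the members, plus a margin."""
    xs = [positions[m][0] for m in members]
    ys = [positions[m][1] for m in members]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)

def drag_box(positions, members, dx, dy):
    """Return new positions with every group member shifted by (dx, dy)."""
    moved = dict(positions)  # nodes outside the group keep their position
    for m in members:
        x, y = positions[m]
        moved[m] = (x + dx, y + dy)
    return moved

positions = {"A": (0.0, 0.0), "B": (40.0, 20.0), "C": (100.0, 100.0)}
group = ["A", "B"]
box = bounding_box(positions, group)
moved = drag_box(positions, group, 5.0, -5.0)
```

The box is derived from the members rather than stored as a node, which matches the point that the grouping is tracked but the metanode itself is not drawn.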
Comment biological applications
Implementation Strategy
Note
See SimplifiedMetaNodeDataStructureRFC to learn how this implementation could be affected in the future.
General Strategy
The strategy will be to add metanode support to Cytoscape in the form of a core plugin and a core library for the 2.3 release. This will allow users to do the following:
- Manually create metanodes by selecting nodes using the Cytoscape GUI
- Programmatically create metanodes by calling methods in a metanode API that will be shipped with Cytoscape (in its lib/ directory)
- Save metanode information to a network file so that it is not lost for later sessions
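As a rough illustration of the last capability, the sketch below writes metanode membership to text and reads it back. It uses JSON only to keep the example short; this is an assumption for illustration and not the file format Cytoscape would use. The function names and the child node names are hypothetical.

```python
import json

def save_metanodes(metanodes):
    """Serialize {metanode name: [child names]} to a JSON string."""
    return json.dumps({"metanodes": metanodes}, sort_keys=True)

def load_metanodes(text):
    """Restore the {metanode name: [child names]} mapping."""
    return json.loads(text)["metanodes"]

# Hypothetical membership: one metanode with two placeholder children.
metanodes = {"PKA complex": ["subunit1", "subunit2"]}
restored = load_metanodes(save_metanodes(metanodes))
```

The point of the round trip is that the grouping survives a session: whatever the final format, loading a saved network should reproduce the same parent-child mapping.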
For 2.4, we will more fully integrate metanodes with the Cytoscape core. This entails making the core aware of metanodes (for example, when a user searches for a node by its name, the core should search not only the visible nodes in a network, but also the non-visible child nodes of metanodes). For 2.4, other core plugins (like the editor, the attributes browser, and filters) should also be made aware of the existence of metanodes.
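The search example can be sketched as a traversal that starts at the visible nodes and descends into metanode children. This is a minimal Python illustration, assuming a mapping from each metanode to its children; `locate`, `visible`, and `children_of` are hypothetical names, not part of the Cytoscape core.

```python
def locate(name, visible_nodes, children_of):
    """Return the path from a visible node down to `name`, or None.

    children_of: metanode name -> list of child names (may be nested).
    """
    stack = [[node] for node in visible_nodes]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == name:
            return path
        for child in children_of.get(node, []):
            if child not in path:  # guard against cyclic membership
                stack.append(path + [child])
    return None

visible = ["M1", "X"]
children_of = {"M1": ["M2", "A"], "M2": ["B"]}
hit = locate("B", visible, children_of)
```

Returning the whole path, rather than just a match, tells the caller which collapsed metanodes would have to be expanded to show the node that was found.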
Specific Action Items for Each Release
Cytoscape 2.3
Define a general API for the metanodes library (main architect: Iliana, others in metanode group review it) API
- For 2.3 we will use Iliana's metaNodesViewer plugin as a back-end implementation of the API that can be expanded, modified, optimized, or even replaced in the future.
Saving of metanode state (Kei, in consultation with the metanodes group; see Metanode_In_XGMML)
- Add an option to "stack" the child nodes of a metanode into a single aligned column (not assigned yet)
- Add an option to automatically change the child nodes' positions if the parent node is moved and expanded (not assigned yet)
- Add to metanodes an "isMetanode" attribute with boolean values so that users can set visual properties of metanodes using Cytoscape's visual styles dialog (Iliana, this may already be done)
- Explore with Nerius (Cytoscape's graphics rendering expert) the "boxing" idea to group nodes (see point 4 in the section above)
Other possible items are:
- Automatically expand a metanode when the user double-clicks on it (we would need to reserve the double-click operation so that other parts of Cytoscape and other plugins do not also use it)
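The "stack" and repositioning options listed for 2.3 could both follow from one layout rule: place the children in a column anchored at the parent's position, and recompute that column whenever the parent has moved before being expanded. The Python sketch below assumes fixed node heights. `stack_children`, `node_height`, and `gap` are hypothetical names and values.

```python
def stack_children(parent_xy, children, node_height=20.0, gap=2.0):
    """Place children in one vertical column starting at the parent's position."""
    x, y = parent_xy
    step = node_height + gap
    return {child: (x, y + i * step) for i, child in enumerate(children)}

children = ["exon1", "exon2", "exon3"]
layout = stack_children((100.0, 50.0), children)
# The parent was dragged 30 units to the right while collapsed;
# expanding it again recomputes the column from the new position.
moved_layout = stack_children((130.0, 50.0), children)
```

Because positions are derived from the parent each time, the children never need their own stored coordinates while the metanode is collapsed.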
Cytoscape 2.4
No defined specific action items yet
Concept Plugin
CVS
Log in anonymously to CVS and check out csplugins:
cvs -d :pserver:anonymous@bordeaux.ucsd.edu:/cvsdir5 login
cvs -d :pserver:anonymous@bordeaux.ucsd.edu:/cvsdir5 co csplugins
cvs -d :pserver:anonymous@bordeaux.ucsd.edu:/cvsdir5 logout
The plugin is located in: /csplugins/isb/iavila/metaNodeViewer. Edit the build.xml file if necessary to point to the correct Cytoscape path (works with the latest Cytoscape version). Type: ant run. The plugin is available in the plugins menu and as a right-click context menu (which sometimes does not appear).