Molecular Interaction Maps

Editor(s): Allan Kuchinsky


About this document

This is an official Request for Comment (RFC) for Supporting Molecular Interaction Maps in Cytoscape.

For details on RFCs in general, check out the Wikipedia Entry: Request for Comments (RFCs)

Status

* work in progress. Consolidation of issues from use cases done and written to a preliminary Requirements section.

How to Comment

To view or add comments, click on any of the 'Comment' links below. By adding your ideas to the Wiki directly, we can more easily organize everyone's ideas and keep clear records. Be sure to include today's date and your name for each comment. Here is an example to get things started: /Comment.

Try to keep your comments as concrete and constructive as possible. For example, if you find a part of the RFC makes no sense, please say so, but don't stop there. Take the extra step and propose alternatives.

Proposal

A Molecular Interaction Map (MIM) is a diagram convention that can unambiguously represent networks containing multi-protein complexes, protein modifications, and enzymes that are substrates of other enzymes. MIMs are described in detail at http://discover.nci.nih.gov/mim/index.jsp and in a seminal article by Kohn et al. at http://www.molbiolcell.org/cgi/content/abstract/E05-09-0824v1

An example MIM is shown below:

mim_eg.png

MIMs were developed at the National Cancer Institute (specifically the Laboratory of Molecular Pharmacology). The group there is looking into what it would take to develop an editor for creating MIMs that yields both a graphical view and a computer-readable format (preferably BioPAX). They are interested in seeing whether such a MIM editor could be developed for Cytoscape.

MIM symbols are defined in the figure below:

Kohn_SymbDefH3.jpg

QUESTION: Is this the most recent definition of map symbols? In particular, are site-specific constructs, such as cleavage, still defined as requirements?

ANSWER: There are a few changes. CovalentBinding has a new symbol. There are some requirements for site-specific constructs. There are a few additional symbols as well.

This proposal will specify the mapping between MIM constructs and their proposed counterparts in Cytoscape. A prioritized set of requirements will be presented. Enhancements needed in the editor, renderer, and/or other Cytoscape components will be identified.

Biological Questions / Use Cases

/MimEditorUseCaseComments

Kohn Notation Use Cases

Editor System Use Cases

General Notes

From an initial reading of the article by Kohn et al. at http://www.molbiolcell.org/cgi/content/abstract/E05-09-0824v1, an initial description of potential mappings between MIM and Cytoscape constructs was derived. This is summarized in the figure below.

A newer version of the description of the notation is found in Kohn et al. at http://mct.aacrjournals.org/cgi/rapidpdf/1535-7163.MCT-06-0640v2

kohn_mappings.png

Further Notes from meeting at 2006 Cytoscape Developers Retreat

  • Need to map MIM to both model and view
    • How to represent 'non covalent binding'?
      • Model: Best mapped to a group. A and B are connected and are part of the group.
      • View: this is a use case for the group view
    • Asymmetric binding
      • Model: Regular edge between A and B, just needs a specific edge type
      • View: New type of custom edge graphic (will take some care with Nerius' edge drawing code)
    • Representation of multimolecular complexes
      • Model: A group of groups
      • View: part of the use case for group view
    • Covalent modification of protein A (post-translational modification = PTM)
      • Model: A is a node (representing the unmodified protein); the central node is a node (representing modified A). Every state is represented as a node. The PTM, e.g. P, is not a node; it is an annotation on the 'central' state node. The edge between A and its state is of type 'state of'. All of these state nodes are grouped to represent the set of all states of A.
      • View: Constraint: the states always move together (would be like a template)
    • Cleavage of a covalent modification of protein A
      • Model: Hyperedge containing node1: Phtase (hyperedge attribute: enzyme), node2: phosphorylated A, node3: unphosphorylated A
      • View: Use case for hyperedge view
  • General constraints for automatic layout and the editor
    • Snap to grid
    • All lines need to be routed with only right or acute angles
    • Central node is only there if it is used (connected to something else in the diagram)
  • Next step: a special interest group should get together to hash the above out further. Current interested people are: Allan, Mirit, David, Gary, Aditya, Scooter, Alex; also general interest from Nathan, Kristina (GenMAPP editing), Ethan, Ben (BioPAX editing)
  • Implementation: Cytoscape core would ensure that the model and view requirements are met. Users of the core, such as the GenMAPP, MIM, and BioPAX editing efforts, would use the core functionality to create their own plugins that implement their own pathway editing features.
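
The model-side mappings in the notes above can be sketched as plain data structures. The following is a minimal, hypothetical sketch in Python, not Cytoscape API code: the class names `Node`, `Edge`, `Group`, and `HyperEdge`, their fields, and the hyperedge role names `substrate` and `product` are assumptions made purely for illustration (only the `enzyme` attribute and the 'state of' edge type come from the notes).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    annotation: str = ""  # e.g. "P"; the PTM is an annotation, not a node


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    edge_type: str  # e.g. "state of"


@dataclass
class Group:
    name: str
    members: list = field(default_factory=list)  # Nodes or nested Groups


@dataclass
class HyperEdge:
    roles: dict  # role name -> Node (role names are hypothetical)


# Covalent modification: every state of protein A is its own node,
# linked to A by a 'state of' edge, and all states are grouped.
a = Node("A")
a_p = Node("A-P", annotation="P")
state_edge = Edge(a_p, a, "state of")
states_of_a = Group("states of A", [a, a_p])

# Cleavage of the modification: one hyperedge joining the enzyme,
# phosphorylated A, and unphosphorylated A.
phtase = Node("Phtase")
cleavage = HyperEdge({"enzyme": phtase, "substrate": a_p, "product": a})

# Multimolecular complex: a group of groups.
a_b = Group("A:B", [a, Node("B")])
complex_abc = Group("(A:B):C", [a_b, Group("C", [Node("C")])])
```

In this sketch a view layer would only need to walk `Group.members` recursively to find what belongs inside a complex, which is one way to read the "group of groups" note.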

Requirements

The following set of requirements is drawn from a consolidation of comments on the use cases. The requirements fall into two categories:

  • Infrastructure: changes required to underlying Cytoscape functionality, such as modifications to the renderer.
  • Customization: changes that require new functionality to be written on top of existing Cytoscape infrastructure, for example new group views to support complexes.

One general comment is that, for performance reasons, we may want to consider whether changes to rendering should only be implemented for high level of detail zoom factors.

The major requirements are:

  1. Snap-to-grid functionality. This is required for supporting right angle geometry in the layout rules. Having nodes positioned on grid points will make it feasible to connect nodes with lines that are strictly vertical or horizontal. This also allows drawing lines at other angles, e.g. diagonally across two grid points to get a 45-degree angle. Infrastructure changes would be required in the following Cytoscape components:
    • renderer: nodes should snap to grid when being moved. InnerCanvas should display grid points, upon request from the user. Granularity of the grid and granularity of displayed grid points should be configurable by the user. When moving a Node that is attached to a segmented edge, the anchor points of the edge should remain fixed in position and the connecting sub-edge should be stretched or compressed, depending upon the direction of the movement. Implications for renderer performance: there shouldn't be too much impact for normal panning and zooming of the canvas, since nodes remain in their current coordinates and no recalculation of node coordinates is needed. However, moving a set of selected nodes within a large network could potentially affect performance of the renderer.

    • editor: nodes should be positioned on the nearest grid point when dragged/dropped from the editor's palette. The user should be able to draw multi-segment lines. That is, the user draws an edge to a particular grid point, the editor inserts a handle into the edge, and the user then takes a 90-degree turn and continues to the next chosen grid point.
    • graphical annotations: same implications as for renderer, with the additional requirement that a node should snap to grid when being stretched.
    • graph layout: we should have a specialized graph layout algorithm for MiMs. The existing orthogonal layout algorithm should be a good starting point.

    • Cytoscape desktop: the various Cytoscape modules need to know when they are in 'MiM' mode, i.e. when they need to enforce layout rules. This could be configurable by the user via an item on the Edit menu. Or it could be driven by a specialized 'MiMs' visual style. Is there a way to do this without having to add too much extra code in the editor and renderer for handling special cases?

    • readers and writers: when BioPAX is read in (and if we are in 'MIM' mode), then the specialized graph layout algorithm should be run and then the nodes should be offset to snap to the nearest grid point. When exporting a network to BioPAX, we need a way to save coordinate information. Does BioPAX support this or do we require a format such as XGMML for doing this?
  2. Grouping. This is required for supporting both composite objects, such as complexes and homodimers, as well as abstractions, such as in the use of dot notation to represent the state of an object. Customizations need to be built on top of the following Cytoscape components.
    • editor. It would be useful to have a library of predefined 'templates' which would handle composites such as state combination. This functionality is currently being provided in the HyperEdge editor for the use case of biochemical reactions. We should be able to build on top of that infrastructure. This does not mean that we necessarily provide functionality for the user to define their own templates.

    • CyGroup: a number of abstractions should be represented as CyGroups. This includes the dot in asymmetric binding, cellular compartments, XXXX. In many situations, the GroupNode should be used to represent a MiMs grouping entity, e.g. a cellular compartment and its view. Many of these will require specialized CyGroupViewer implementations.

    • renderer: many new CustomNodeGraphics will need to be built, e.g. to represent the boundaries of cellular compartments.

    • HyperEdges: we ought to be able to implement snap-to-grid support in the Cytoscape editor in a way that is transparent to HyperEdges. However, this may require some support in the HyperEdgeEditor, e.g. when drawing hyperedges.

    • readers and writers: how is information about CustomNodeGraphics output so that it can be applied again when the MiM is read back in? Should there perhaps be a specialized BioPAX reader that knows how to handle MiMs and knows about the different CustomNodeGraphics available? Or should there be a 'View as MiM' entry on the View menu? Perhaps we can encode the Node type as an attribute when the MiM is output; the reader can then look for this attribute on each Node that is created and do the right thing with respect to the view.

  3. Sub-molecular entities: This is required to support binding sites, domains, splice variants and other information at a finer granularity than that of a protein or gene. The implications for Cytoscape infrastructure are:
    • CyGroup: in principle, we can use CyGroups to represent these, where the group is the gene or protein and the sub-molecular entities are children of the group. This may cause complications in that the grouping exists at a different level of the network than do groups of genes and proteins. Will this cause some network analysis functions, such as shortest path, to fail?

    • renderer: the renderer may have to render edges so that they connect to the boundary between two sub-molecular components, for example in the case of cleavage sites. We might want to introduce the context of a 'site' for a CyNode, which could be used to represent cleavage sites, splice junctions, and other sub-node entities. In the model, this could be implemented via the Group, each 'site' being a child of the group. In the view, this could be represented as a 1-pixel wide rectangle adjacent to the border of the node, or perhaps a 1-pixel square rectangle at a point on the node. This is basically a "port".

    • editor: should there be editor support for creating sub-molecular entities manually or should creation of sub-molecular entities be strictly data driven? It might be best to be data-driven for the immediate time frame.
    • readers/writers: does BioPAX represent sub-molecular entities as 'sequence features'? If so, can a BioPAX reader be crafted to store and restore this information in the group structure?

  4. adding Evidence to a node or edge: it would be nice to be able to add structured evidence, such as citations, to Nodes and Edges of the MiM, perhaps via dragging/dropping a URL onto the Node or Edge. The evidence could be viewed in the Cytoscape Attribute Browser. Specialized readers/writers may be needed to get the structured data into and out of the system.
  5. new edge types: There is a diversity of edge types that MiMs support. Implications are mostly for the renderer and editor.

    • renderer: many of these new edge types require multi-segment edges. There are two ways that this could be handled. One alternative is that when an edge of a certain type is created in Cytoscape, handles are inserted at certain points along the edge. This may require that the renderer perform type-specific rendering for different edge types, which is probably something we want to avoid. A second alternative derives from the observation that most of the multi-segment edges have 3 segments: a longer straight segment, which runs from the source node to an inflection point, followed by an orthogonal segment at a point closer to the target node, then a final segment that runs parallel to the initial segment and terminates at the target node. If that is the case, then we could consider all of the segments that follow the initial inflection point to be part of the arrowhead for the edge. Thus, the edge type can be rendered like any other edge, with one segment followed by an arrowhead. Many new arrowhead types will have to be defined for Cytoscape so that they can be rendered.
    • editor: the editor will have to support many new edge types on its palette. This can build upon the enhancements made to the renderer to support the new edge types.
    • vizmapper: should these edge types be customizable using the vizmapper user interface? This may not be a good idea, since the visual characteristics of these edge types are completely coupled with their semantics and constitute a standard way of representing
  6. 'Constant' nodes: There are certain Nodes in MiMs which represent Concepts and may appear multiple times in a network. Such 'constants' include degradation and 'mRNA'. Since Cytoscape shares nodes, we may want to represent distinct occurrences of a 'constant' node as distinct entities which have the same label. Question: will this approach break many of the network analysis tools?

  7. 'Coordinate System'. This is kind of like a Legend in Cytoscape, except that it may need to update itself upon panning or zooming. Perhaps this could be implemented as a graphical annotation whose Component listens for ViewportChange events and repaints itself with the appropriate labels.

  8. Links on nodes. This is necessary to support Processes, wherein the user could jump to another map if the user performed a gesture, such as a doubleclick on a node that represents a process. This could be generalized to support the execution of an arbitrary function upon user gesture such as doubleclick.
  9. Boolean Logic: at the model level, this would require new connector node types of AND, OR, NOT for hyperedges. At the view level, it might be useful to have a 'template' in the editor, as defined below, for boolean logic. This functionality was rated of lower importance, so it should probably be deferred to a later implementation.
  10. handling 'first neighbor' functionality: this is required by PathBranching and PathHighlighting. Cytoscape's nearest neighbor functionality, with some minor extensions, ought to be able to handle this requirement.
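
The snap-to-grid requirement (item 1) and the three-segment edge shape described under new edge types (item 5) can be illustrated together. The following is a minimal sketch in Python under stated assumptions, not renderer or editor code: the function names `snap`, `snap_point`, and `three_segment_route`, the `stub` parameter, and the choice of a horizontal first segment are all hypothetical. It also assumes the target lies at least `stub` grid cells away from the source along the x axis.

```python
import math


def snap(value, spacing):
    """Return the multiple of `spacing` nearest to `value` (halves round up)."""
    return math.floor(value / spacing + 0.5) * spacing


def snap_point(point, spacing):
    """Snap an (x, y) point to the nearest grid point."""
    x, y = point
    return (snap(x, spacing), snap(y, spacing))


def three_segment_route(source, target, spacing, stub=1):
    """Route source -> target as horizontal, vertical, horizontal segments.

    The bend is placed `stub` grid cells before the target, so the last
    two segments could be treated as part of the arrowhead.
    """
    sx, sy = snap_point(source, spacing)
    tx, ty = snap_point(target, spacing)
    direction = 1 if tx >= sx else -1
    bend_x = tx - direction * stub * spacing
    return [(sx, sy), (bend_x, sy), (bend_x, ty), (tx, ty)]


# A node dropped at (23, 48) on a 10-unit grid lands on (20, 50).
dropped = snap_point((23, 48), 10)

# An edge from roughly (3, 2) to roughly (58, 41) on the same grid.
route = three_segment_route((3, 2), (58, 41), 10)
```

Here `route` is `[(0, 0), (50, 0), (50, 40), (60, 40)]`: one long segment to the inflection point, then an orthogonal segment and a short final segment parallel to the first, matching the shape that the second rendering alternative would fold into the arrowhead.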

Deferred Items

Open Issues

  1. Do they need site-specific connection of edges, e.g. as in proteolytic cleavage? From Mirit's presentation, it appears that metanodes would work just fine. (see /ProteinBindingSites and /ProteinDomains)

  2. How do we represent the "null" node, as in degradation product? (see /SpeciesDegradation)

  3. How do we represent "ditto" nodes, e.g. as in translocation? (see /ShowTransport)

  4. Would they need editor support for Manhattan topology, e.g. constraints on edge angles?
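
To make the edge-angle question in the last open issue concrete, the following is a minimal sketch in Python of what such a constraint check might look like. The function name `is_manhattan` and the `allow_diagonal` option are hypothetical, and the sketch assumes that edge points are exact grid coordinates (for example, integers after snapping to grid).

```python
def is_manhattan(points, allow_diagonal=False):
    """Return True if every segment of the polyline is horizontal or vertical.

    With allow_diagonal=True, 45-degree segments are accepted as well.
    """
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dx, dy = x2 - x1, y2 - y1
        axis_aligned = dx == 0 or dy == 0
        diagonal = abs(dx) == abs(dy)
        if not (axis_aligned or (allow_diagonal and diagonal)):
            return False
    return True


orthogonal_ok = is_manhattan([(0, 0), (50, 0), (50, 40)])
diagonal_rejected = is_manhattan([(0, 0), (10, 10)])
diagonal_allowed = is_manhattan([(0, 0), (10, 10)], allow_diagonal=True)
```

An editor could run a check of this kind before accepting a newly drawn edge, rejecting or re-routing any polyline that violates the layout rules.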

Backward Compatibility

Expected growth and plan for growth

References

Implementation Plan

Comments

Molecular_Interaction_Maps (last edited 2009-02-12 01:03:29 by localhost)

Funding for Cytoscape is provided by a federal grant from the U.S. National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (NIH) under award number GM070743-01. Corporate funding is provided through a contract from Unilever PLC.
