This page describes the Cytoscape architecture and a vision for where we want the architecture to be in the long term.
Note: While we need to clearly describe Cytoscape subsystems, it is important that most sections of this page not go into too much detail, so that the overall architecture remains apparent.
Why are there no biological semantics in the core, given that Cytoscape is targeted to biologists? Keeping biology out of the core helps us think a bit more generally about the functionality of the software.
Core data structures
This section gives a brief executive summary of the core data structures; it is not exhaustive documentation.
- Application level
CyNetwork - The network model which stores CyNode and CyEdge objects. The data structure is a "mixed directed/undirected compound multigraph". This data structure has the following properties:
Is a set of nodes (CyNode)
- Nodes are currently identified by a unique string
Is a set of edges (CyEdge)
- Edges connect 2 nodes (source and target)
- Edges can be directed or not (boolean flag)
- Edges are currently identified by a unique string
- Cytoscape (in the Cytoscape class) imposes an additional constraint that the identifier must have the form "node1 (edgeType) node2".
- Multiple edges between nodes are allowed, though they must have different identifiers (this is a multigraph)
- Nodes can contain other nodes and, optionally, edges. A node that contains other nodes and edges is called a metanode. The metanode concept is not modeled explicitly; instead, a node may have child nodes/edges or parent nodes, and an edge may have parent nodes. These relationships are called metachildren and metaparents, but the terms are rarely used.
Many CyNetwork instances can exist in Cytoscape
CyNetworkView - The view responsible for drawing a CyNetwork, e.g. to the screen or to an image file. There is currently one view per CyNetwork, and this view mirrors the CyNetwork exactly: if a CyNode exists in a CyNetwork, it will be drawn in the CyNetworkView.
Each view contains a CyNodeView and a CyEdgeView for each node/edge in the CyNetwork model.
CyAttributes - contains a multi hash map, which stores node and edge attributes. It acts like an in-memory database keyed by a unique string (e.g. the CyNode or CyEdge string identifier) and can store the following types as values:
- Simple mode: Boolean, Integer, Double, String
- Advanced mode: List, Map
Visual mapper - a map from data (node/edge) attributes to visual attributes
VisualStyle - encapsulates all calculators in a visual style
NodeAppearanceCalculator - calculates the appearance of a Node
EdgeAppearanceCalculator - calculates the appearance of an Edge
- TODO: There are other data structures here
- Library level (should be hidden from application level)
GINY - http://csbi.sourceforge.net/ (implemented as FING)
Piccolo - http://www.cs.umd.edu/hcil/jazz/
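As a rough illustration of the CyNetwork properties listed above (string identifiers, a directed flag on each edge, and multiple edges between the same pair of nodes), the following Java sketch may help. The names SketchNetwork, Edge, edgeId and edgesBetween are hypothetical and are not part of the Cytoscape API. Metanode containment is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; these are NOT the real CyNetwork/CyNode/CyEdge classes.
final class SketchNetwork {

    /** An edge connects a source and a target node and carries a directed flag. */
    static final class Edge {
        final String id;
        final String source;
        final String target;
        final boolean directed;

        Edge(String id, String source, String target, boolean directed) {
            this.id = id;
            this.source = source;
            this.target = target;
            this.directed = directed;
        }
    }

    private final Set<String> nodes = new LinkedHashSet<>();      // unique node identifiers
    private final Map<String, Edge> edges = new LinkedHashMap<>(); // edge identifier -> edge

    /** Builds an identifier following the "node1 (edgeType) node2" convention. */
    static String edgeId(String source, String edgeType, String target) {
        return source + " (" + edgeType + ") " + target;
    }

    /** Returns false if a node with this identifier already exists. */
    boolean addNode(String id) {
        return nodes.add(id);
    }

    /** Parallel edges are allowed as long as their identifiers differ. */
    Edge addEdge(String source, String edgeType, String target, boolean directed) {
        if (!nodes.contains(source) || !nodes.contains(target)) {
            throw new IllegalArgumentException("unknown endpoint");
        }
        String id = edgeId(source, edgeType, target);
        if (edges.containsKey(id)) {
            throw new IllegalArgumentException("duplicate edge identifier: " + id);
        }
        Edge edge = new Edge(id, source, target, directed);
        edges.put(id, edge);
        return edge;
    }

    /** All edges from a to b; undirected edges match in either order. */
    List<Edge> edgesBetween(String a, String b) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges.values()) {
            boolean forward = e.source.equals(a) && e.target.equals(b);
            boolean backward = !e.directed && e.source.equals(b) && e.target.equals(a);
            if (forward || backward) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SketchNetwork net = new SketchNetwork();
        net.addNode("A");
        net.addNode("B");
        net.addEdge("A", "pp", "B", false); // undirected
        net.addEdge("A", "pd", "B", true);  // directed, parallel to the first edge
        System.out.println(net.edgesBetween("A", "B").size()); // prints 2
        System.out.println(net.edgesBetween("B", "A").size()); // prints 1
    }
}
```

In this sketch the identifier convention doubles as the uniqueness check, which is why two edges of the same type between the same ordered pair of nodes are rejected.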
Refactoring status
- Needs refactoring, community review process (as of Feb. 2006):
CyNetwork is still at version 1
- General graph data model must be revisited
- Metanodes should probably be moved to their own module
- Edge labels should be refactored/revisited. Brackets in the labels are not optimal.
Library level needs to be hidden from application level. Currently, GINY (for example) is available for direct use in Cytoscape, which confuses the use of CyNetwork. We need to encapsulate all library objects within Cytoscape application layer objects.
Visual mapper is at version 3 (version 1 in Cytoscape 1.0, rewritten by Trey's group to current GUI, refactored by Ethan to current state), but has not been community reviewed.
- Has been refactored, passed community review process (as of Feb. 2006):
CyAttributes is at version 2 (previously called GraphObjAttributes). Reviewed in RFC_1.
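To make the "multi hash map" idea behind CyAttributes more concrete, here is a minimal Java sketch, assuming a two-level map from object identifier to attribute name to value, restricted to the value types listed earlier on this page. SketchAttributes and its methods are hypothetical names chosen for illustration; they do not reproduce the real CyAttributes API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; this is NOT the real CyAttributes API.
final class SketchAttributes {

    // object identifier (node or edge id) -> (attribute name -> value)
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Simple types: Boolean, Integer, Double, String. Advanced types: List, Map. */
    static boolean isSupported(Object value) {
        return value instanceof Boolean
                || value instanceof Integer
                || value instanceof Double
                || value instanceof String
                || value instanceof List
                || value instanceof Map;
    }

    /** Stores a value, rejecting null and any type outside the supported set. */
    void set(String objectId, String attributeName, Object value) {
        if (!isSupported(value)) {
            throw new IllegalArgumentException("unsupported attribute value: " + value);
        }
        store.computeIfAbsent(objectId, key -> new HashMap<>()).put(attributeName, value);
    }

    /** Returns the stored value, or null if the object or attribute is unknown. */
    Object get(String objectId, String attributeName) {
        Map<String, Object> attributes = store.get(objectId);
        return attributes == null ? null : attributes.get(attributeName);
    }
}
```

Because the key is just the string identifier, the same store can hold node and edge attributes side by side, e.g. `set("A", "degree", 3)` and `set("A (pp) B", "weight", 0.5)`.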
Miscellaneous data structures
This section briefly describes data structures that are part of the Cytoscape application, but are not considered core concepts. This generally means that the typical developer does not need to know about them because they are useful mainly for internal Cytoscape use or are planned to be removed.
ExpressionData - stores data read from an expression data file. It can also be used to access this data, but the data is also copied to CyAttributes. It would be best if this data were stored only in CyAttributes and a utility class were available for dealing with expression-type data in a general way.
BioDataServer - stores gene annotation and synonym data
- Thesaurus - a hashmap of synonyms, used for node unique identifier equivalence mapping
- Ontology - a directed acyclic graph used to store ontologies like the Gene Ontology (GO)
- Annotation - a mapping from a string identifier to an ontology term in the Ontology object
Filters - TODO: describe
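The three BioDataServer structures described above (Thesaurus, Ontology, Annotation) can be pictured with the following Java sketch. It assumes a synonym-to-canonical-identifier map, an ontology stored as term-to-parents links, and one annotated term per identifier. SketchBioData, its methods, and the example names in the comments are hypothetical and do not reflect the actual BioDataServer code.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; this is NOT the real BioDataServer implementation.
final class SketchBioData {

    private final Map<String, String> thesaurus = new HashMap<>();            // synonym -> canonical id
    private final Map<String, Set<String>> ontologyParents = new HashMap<>(); // term -> parent terms (DAG)
    private final Map<String, String> annotation = new HashMap<>();           // canonical id -> term

    void addSynonym(String synonym, String canonicalId) {
        thesaurus.put(synonym, canonicalId);
    }

    /** Registers a term and its direct parents; a term may have several parents. */
    void addTerm(String term, String... parents) {
        ontologyParents.computeIfAbsent(term, key -> new LinkedHashSet<>())
                .addAll(Arrays.asList(parents));
    }

    void annotate(String canonicalId, String term) {
        annotation.put(canonicalId, term);
    }

    /** Maps a synonym to its canonical identifier; unknown names map to themselves. */
    String canonical(String name) {
        return thesaurus.getOrDefault(name, name);
    }

    /** The term annotated to a name, plus all of that term's ancestors in the DAG. */
    Set<String> termsFor(String name) {
        Set<String> seen = new LinkedHashSet<>();
        String start = annotation.get(canonical(name));
        if (start == null) {
            return seen;
        }
        Deque<String> todo = new ArrayDeque<>();
        todo.push(start);
        while (!todo.isEmpty()) {
            String term = todo.pop();
            if (seen.add(term)) { // visit each term once, even if reachable via several parents
                for (String parent : ontologyParents.getOrDefault(term, Collections.<String>emptySet())) {
                    todo.push(parent);
                }
            }
        }
        return seen;
    }
}
```

For example, after `addSynonym("gx", "geneX")` and `annotate("geneX", "transport")`, a lookup by either name resolves to the same term and its ancestors.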
Refactoring status
- Needs refactoring, community review process (as of Feb. 2006):
ExpressionData - Should eventually be replaced by CyAttributes. It should not be a core class, although utility methods for expression data (or, more generally, for any data of a similar shape) would still be useful. E.g. the expression data file import/export features may need to use some features from this class.
BioDataServer - Needs to be refactored or rewritten
- Has been refactored, passed community review process (as of Feb. 2006)
Cytoscape Core Subsystems
- Startup
- Initialization
- Plugin loading
- GUI
- Menu set up
- Node/edge selection
- Layout
- Undo/redo
- Visual mapper
- Help
- Annotation/ontology viewer
- Load/Save (to new .cys format)
- Import/Export
- SIF import/export
- GML import/export
- Expression data import (currently in the ExpressionData class, but should be a reader). A writer may also be needed.
- Node attributes import/export
- Edge attributes import/export
BioDataServer (needs a rewrite)
- Ontology reading
- Old manifest file format reader
- New OBO format reader
- Synonym reading
- Events
Refactoring Status
- It would be nice to provide an API for registering import/export plugins and migrate all core import/export code to core plugins using this API. This would help keep the core of Cytoscape small, thus reducing complexity.
- The event system needs to be revisited to evaluate consistency across the core
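One possible shape for the import/export registration API mentioned above is sketched below in Java. This is only an illustration of the idea, not a proposal that has been reviewed: NetworkImporter, ImporterRegistry and SimpleSifImporter are hypothetical names, and the SIF parsing is deliberately simplified (whitespace-delimited lines of the form "source type target [target ...]").

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical API sketch; none of these types exist in Cytoscape.
interface NetworkImporter {
    /** Parses file content and returns the edge identifiers it describes. */
    List<String> read(String content);
}

/** Core-side registry that import plugins would register themselves with. */
final class ImporterRegistry {
    private final Map<String, NetworkImporter> byExtension = new LinkedHashMap<>();

    void register(String extension, NetworkImporter importer) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), importer);
    }

    /** Returns the importer for the file's extension, or null if none is registered. */
    NetworkImporter forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        return byExtension.get(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}

/** Simplified SIF reader, standing in for a core import plugin. */
final class SimpleSifImporter implements NetworkImporter {
    @Override
    public List<String> read(String content) {
        List<String> edgeIds = new ArrayList<>();
        for (String line : content.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 3) {
                continue; // blank line or a node without edges
            }
            for (int i = 2; i < fields.length; i++) {
                // Identifier convention from this page: "node1 (edgeType) node2"
                edgeIds.add(fields[0] + " (" + fields[1] + ") " + fields[i]);
            }
        }
        return edgeIds;
    }
}
```

With a registry like this, the core would only need to know the NetworkImporter interface, and each format (SIF, GML, attributes, expression data) could live in its own core plugin that calls `register` at load time.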
Cytoscape Core Plugin Systems
- GUI
- Cytoscape node/edge attributes graphical browser (browser.jar)
- Filters (filter.jar)
- Network editor (CytoscapeEditor.jar)
- Align/distribute nodes (control.jar)
- Right-click menu on view (yeast-context.jar)
- Layout
Hierarchical layout (HierarchicalLayout.jar)
- yFiles layout (yLayouts.jar)
Many new systems are needed, as described in Future_Cytoscape_Features
Wiki page TODO
- General notes about design philosophy - when in doubt, keep it simple
- Add note about model vs. view
- Define Core as per recent discussion with Trey
- Add note about ideal package structure (more modular, logical groupings)
- Add note about ideal file structure (recent proposal by Mike Smoot about plugin reorganization)
- Where should optional libraries go?
- Encapsulate application-external data structures rather than extend
- No bio semantics in core
- Documentation: how to move a plugin from the beta/user directory into the core as a core plugin, including how that decision gets made.
- Document how plugins should add themselves to menus
- Create a glossary to normalize names and concepts
- Add section on testing - unit tests and swing unit tests must be part of the feature set that we work on. Build system needs to be more automated to include all core plugins.