RFC Name : Cytoscape Projects

Editor(s): ScooterMorris & SarahKillcoyne

Date: Oct 23, 2007

Status: Draft

Proposal

Extend the concept of sessions to a more flexible project based system where all of the data involved in a given project is transparent and accessible for altering or copying to another project. This would help biologists to work in the way they are used to, where one project may build off of another and use many of the same properties or data files. For example, a Project would "remember" that a network (saved as an XGMML file) was created by reading in a SIF file, and associating three node attribute files and two edge attribute files with it. The goal would be data provenance, not necessarily to be able to recreate the initial data sets. Moving forward, Projects could include logs (Cytoscape 3.0 logging), user created scripts/macros, workflows, etc. While the Projects panel might list all available projects, only one project would be opened at a time. As a result, a project could replace a Cytoscape Session as the main mechanism for saving state.

Background

Scientists are accustomed to thinking about their work in terms of projects where all of the various types of data and analysis they have done are stored together, but separately accessible for use in other work. Many pieces of open and proprietary software used by scientists use this concept now to help organize and share work (such as CPAS and SBEAMS). Cytoscape does nothing to help users track their data provenance. Where did these attributes come from? How was this network created? What steps (including what data sources used) were performed to expand the network? All of these are legitimate questions that Cytoscape provides no means of assisting users to remember.

Use Cases

Implementation Plan

There are two major parts to implementing a project system in Cytoscape.

Expand the definition of a session to include data provenance information

Both will be simplified through the proposed refactor of Cytoscape. In a relayered system the hooks into loaders in the IO layer can be added to track the data imported into a project from any source. Hooks added to the Application layer would allow for the association of scripts, macros or workflows with a particular project as well.

Visual representation of data provenance

This visual representation can be dealt with in the View and Application layers. The views on a network and information about the data in that network can be provided to the user in a format similar to that of the Eclipse IDE where projects contain trees of information that can include the type, date and path of any particular piece of data within the tree. This sort of system would make it easy to add new features to a project such as a log file that could function as a sort of laboratory notebook, recording the exectution and results of Cytoscape actions (including plugins) and notes recorded by the user.

Prior to relayering

Hooks into the current network and attribute loaders can be added. The current network view panel can be replaced by a Project panel that can initially list the network and attributes loaded and potentially the visual style currently applied. This panel can also start by listing the network, attributes and visual style. This particular work is not necessary to a 2.x release but could help prepare users for the switch.

Project Management

Project Timeline

This work will depend on the relayering of Cytoscape and can be started during the refactoring of packages (See Milestone 1).

Tasks and Milestones

Project Dependencies

Outline and projects that depend on this project, link to relevant RFC's and note at what point dependent projects could be started.

Issues

Comments

How to Comment

Edit the page and add your comments under the provided header. By adding your ideas to the Wiki directly, we can more easily organize everyone's ideas, and keep clear records. Be sure to include today's date and your name for each comment. Try to keep your comments as concrete and constructive as possible. For example, if you find a part of the RFC makes no sense, please say so, but don't stop there. Take the extra step and propose alternatives.

CytoscapeProject (last edited 2010-05-17 03:55:33 by GaryBader)

Funding for Cytoscape is provided by a federal grant from the U.S. National Institute of General Medical Sciences (NIGMS) of the Na tional Institutes of Health (NIH) under award number GM070743-01. Corporate funding is provided through a contract from Unilever PLC.

MoinMoin Appliance - Powered by TurnKey Linux