Use Case Name: Sub-Gene Data Visualization
For Feature: Group-API
Editors: Nathan Salomonis
Summary
We would like to be able to represent detailed sub-gene data in a second window after selecting a node from the parent graph. The use cases below apply to datasets where many different pieces of data (e.g. probesets or interaction partners) are available for a single gene, each of which would be represented as a distinct node. Examples include individual probes or replicate spots from a microarray dataset, data from distinct time-point comparisons, polymorphism data from whole-genome SNP experiments*, exon and exon-junction specific data*, ChIP-on-chip experiment data* and interaction partners for a node in the parent network*. Examples with an asterisk are illustrated below.
Step-by-Step User Action
Associating Data for Sub-Gene Views
- Viewing data with specialized sub-gene views (e.g. a SNP view with annotations) first requires a relational database containing, at a minimum, associations to genes or proteins. The GenMAPP group is currently designing a database that will support specific ID systems for microarray platforms that assay specific sub-gene entities. This database will also have to accommodate new relationships appended by the user for unsupported ID systems. For the generic case, only the relationships needed to connect individual array IDs (or other elements) to a gene are required. For more specialized cases, such as viewing SNP annotations associated with an array ID, these additional annotations must also be stored in the relational database.
- Once the user has determined whether the primary ID from their data is supported in the gene database and which sub-gene view to select, the user will:
- upload their data using a specialized interface
- specify filters for node coloring
- choose the sub-gene visualization methods
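The minimum associations described above could be sketched as follows. This is only an illustration using Python's built-in `sqlite3` module, not the database the GenMAPP group is designing; the table names, column names, and identifiers (`GENE_A`, `snp_1`, and so on) are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical sub-gene associations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Generic case: each array ID (or other element) maps to one gene.
        CREATE TABLE element_to_gene (
            element_id TEXT PRIMARY KEY,
            gene_id    TEXT NOT NULL
        );
        -- Specialized case: extra annotations stored per element.
        CREATE TABLE element_annotation (
            element_id TEXT NOT NULL REFERENCES element_to_gene(element_id),
            label      TEXT NOT NULL,
            value      TEXT NOT NULL
        );
        """
    )
    # Made-up example rows, for illustration only.
    conn.executemany(
        "INSERT INTO element_to_gene VALUES (?, ?)",
        [("snp_2", "GENE_A"), ("snp_1", "GENE_A"), ("probe_9", "GENE_B")],
    )
    conn.executemany(
        "INSERT INTO element_annotation VALUES (?, ?, ?)",
        [("snp_1", "location", "intron"), ("snp_2", "location", "exon")],
    )
    conn.commit()
    return conn


def sub_gene_elements(conn, gene_id):
    """Return the element IDs linked to a gene, sorted by name."""
    rows = conn.execute(
        "SELECT element_id FROM element_to_gene "
        "WHERE gene_id = ? ORDER BY element_id",
        (gene_id,),
    )
    return [row[0] for row in rows]


def annotations(conn, element_id):
    """Return the stored annotations of one element as a dict."""
    rows = conn.execute(
        "SELECT label, value FROM element_annotation WHERE element_id = ?",
        (element_id,),
    )
    return dict(rows.fetchall())
```

In this sketch, a generic view would only need `sub_gene_elements`, while a specialized view such as the SNP view would also call `annotations` for each child node.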
Selecting a Sub-Gene View
- Open a network
- From a context menu, select the gene database that was used when loading the raw data.
- From a context menu, select the user database containing the data and criteria.
- From a context menu, select the sub-gene view of interest. Once a view is selected, the right-click option will activate that sub-gene view by default.
Visualizing Sub-Gene Data
- Right-click on a node in the network. This will open a new window (child network) containing the sub-gene view.
- Select a node in the child network to view more detailed annotations provided from the gene database.
Visual Aids
Requirements for Cytoscape
In the simplest example, multiple array IDs associated with a gene, Cytoscape will need to sort the nodes in the child network (based on node name) and align them horizontally. When there are many nodes, multiple rows will need to be created.

In the more complex cases (see the illustrations above), annotations will need to be displayed as labels: for SNPs, where in the gene the SNP occurs; for ChIP-on-chip, which transcription factor binding matrices overlap with the probed region; for interaction partners, the source of the interaction. In the case of exon and exon-junction data, a graphical display could be shown above the exon-level probe data (each node represents a probeset on the array, annotated in the gene database according to which annotated exon it overlaps), where the graphic for each exon and intron is sized according to the layout of these nodes. The same layout method could be used to view exon-level data in the context of domains, by graphically displaying the domain regions and domain names, scaled to the corresponding nodes below.

According to this model, all data for the original identifiers loaded are shown in the sub-gene view, rather than being summarized in a way that is biased by existing annotations. This is important, since we do not want to make broad generalizations or draw conclusions for the user that may be incorrect.
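The simplest layout requirement (sort by node name, align horizontally, wrap into multiple rows) can be sketched in a few lines. This is an illustrative, stand-alone function, not Cytoscape code; the function name, the `max_per_row` limit, and the spacing values are assumptions chosen for the example.

```python
def grid_layout(node_names, max_per_row=10, x_spacing=60.0, y_spacing=40.0):
    """Assign (x, y) positions to nodes sorted by name, wrapping into rows.

    All parameters are hypothetical; a real layout would take them from
    the child network view.
    """
    if max_per_row < 1:
        raise ValueError("max_per_row must be at least 1")
    positions = {}
    for index, name in enumerate(sorted(node_names)):
        # Row and column of this node in the wrapped grid.
        row, col = divmod(index, max_per_row)
        positions[name] = (col * x_spacing, row * y_spacing)
    return positions
```

Note that `sorted` compares names as plain strings, so `probe_10` would be placed before `probe_2`; a real implementation may want a natural sort order for numbered identifiers.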
- The Group API should support expanding into a new sub-network
- All of these are derived directly from the gene database, not manually entered (e.g., exon-probesets).
- Perhaps outside the scope of the Group API, we will want to map attributes between child and parent nodes (e.g., assigning the average of a child attribute to the corresponding parent node attribute).
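As one possible reading of the child-to-parent attribute mapping, the sketch below averages each numeric attribute over the child nodes that carry it. The function name and the dictionary-based data structure are assumptions made for illustration and do not reflect how Cytoscape stores attributes.

```python
from statistics import mean


def average_child_attributes(children):
    """Average each numeric attribute across child nodes.

    `children` maps a child node name to a dict of attribute name -> number.
    Children that lack an attribute are simply left out of that average.
    """
    collected = {}
    for attrs in children.values():
        for name, value in attrs.items():
            collected.setdefault(name, []).append(value)
    return {name: mean(values) for name, values in collected.items()}
```

Averaging is only one choice; other summaries (median, maximum, or a count of children passing a filter) would fit the same structure, and each hides some of the sub-gene variation that the child view is meant to expose.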
Importance
This use case is necessary for any dataset with more than one piece of information linked to a single gene-level node. Specific sub-gene views provide biological context and annotations for the original sub-gene identifiers loaded.
Other Examples
Comments
AllanKuchinsky - 2006-11-27 08:11:45
Why the requirement that sub-gene information reside in a relational database? Shouldn't we be able to handle any data from a remote source that can be queried via a Web services interface and returns a result in a parsable XML format?