Cytoscape_3/UserManual/Attributes

Interaction networks are useful as stand-alone models. However, they are most powerful for answering scientific questions when integrated with additional information.

Cytoscape allows the user to add arbitrary node, edge and network data through a data table. This could include, for example, expression data of a gene or confidence values in a protein-protein interaction. In the data table, information is linked to nodes, edges or networks by mapping the columns to one of their identifiers. Through the Table Panel, the values can be further manipulated using column functions and equations.

A second type of data that can be associated with networks is ontology data: organized sets of controlled vocabulary terms. Because this type of data is mostly hierarchically organized, this requires a special importing facility, described in the second part of this section.

Data associated with the network elements can be visualized in a user-defined way by setting up a mapping from data in the columns to network properties (colors, shapes, and so on). The section on styles discusses this in greater detail.

Table data

Cytoscape supports several tabular formats. These include the regular text and table formats: .tsv, .tab, .csv, .txt (values separated by commas, tabs or any other delimiter) and .xls, .xlsx (Microsoft Excel file formats). Note that for Excel file formats, only the first sheet of a workbook is currently imported. The legacy native Cytoscape formats .attrs and .pvals (Cytoscape expression matrix) are also supported (for a more thorough discussion of these formats, consult the 2.x manual). For most users the regular data table importing functionality will be sufficient; if a file's format is not recognized, renaming the file to a .txt extension and experimenting with the delimiter and header settings in the import interface will work in most cases.

Importing data

Basic table file

The basic file format consists of a table containing at least one column with identifiers (unique names) for the nodes, edges or networks, and one or more columns with data you want to associate with these network elements.

Sample Data Table 1

Yeast Key	Degree
YER054C	85
YBL079W	7
YLR345W	1
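As an illustration of this layout, the following Python sketch reads Sample Data Table 1 from an in-memory string and builds a lookup from each identifier to its data values. It does not use any Cytoscape API; the helper name `read_data_table` is hypothetical, and a tab delimiter is assumed.

```python
import csv
import io

# Sample Data Table 1, written as tab-delimited text.
SAMPLE_TABLE = "Yeast Key\tDegree\nYER054C\t85\nYBL079W\t7\nYLR345W\t1\n"


def read_data_table(text, key_column, delimiter="\t"):
    """Return {identifier: {column: value}} for a delimited table (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    table = {}
    for row in reader:
        key = row[key_column]
        # Every column other than the key holds data for the element named by the key.
        table[key] = {name: value for name, value in row.items() if name != key_column}
    return table


degrees = read_data_table(SAMPLE_TABLE, key_column="Yeast Key")
print(degrees["YER054C"]["Degree"])  # prints 85
```

All values are read as strings here; a data type would still need to be chosen for each column.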

To import such a file:

Select File → Import → Table → File... (or URL... if your source data file is accessible through the internet).
Select a data file in the file chooser panel (or enter the URL in the displayed box). This file can be in any of the accepted formats mentioned above.
In the Import Columns from Table panel, select the Importing Type. Cytoscape can import table columns to node, edge, and network table columns.
(Optional) Choose whether to import the file for all of the available networks or only for selected networks using the check box in the expandable "Network Options" panel (this panel is collapsed by default). Select networks from the list.
(Optional) If the table is not properly delimited in the preview panel, change the delimiter in the Text File Import Options panel (the default delimiter is Tab).
By default, the first column is designated as the primary key. Change the key column if necessary; see below for an example.
The name of a data column can be changed by right-clicking on its header. A dialog appears that enables specification of the data type and column name.
Left-clicking a column selects or deselects it for importing.
Click the Import button.

The user interface of the "Import Columns from Table" window is similar to that of the "Import Network from Table" window.

If the data only relates to a specific network, select the 'Apply to selected networks only' box and select the specific network from the list (click 'Select Networks' to show this list).

Mapping data options

In Cytoscape 3.0, data columns with primitive data types (string, boolean, floating point, and integer) can be selected as the key column using the dropdown list provided. Complex data types such as lists are not supported as keys.
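The idea of linking imported rows to network elements through a key column can be sketched in plain Python. This is not Cytoscape's implementation: the function name `merge_by_key` is hypothetical, and exact, case-sensitive matching of key values is an assumption of the sketch.

```python
PRIMITIVE_TYPES = (str, bool, int, float)


def merge_by_key(nodes, rows, node_key, row_key):
    """Copy columns from each row onto every node whose key value matches.

    Returns the list of row keys that matched no node.
    """
    # Index the nodes by their key value so each row lookup is a dict access.
    index = {}
    for node in nodes:
        index.setdefault(node[node_key], []).append(node)

    unmatched = []
    for row in rows:
        key = row[row_key]
        if not isinstance(key, PRIMITIVE_TYPES):
            # Mirrors the rule that lists and other complex values cannot be keys.
            raise TypeError(f"key values must be primitive, got {type(key).__name__}")
        targets = index.get(key)
        if not targets:
            unmatched.append(key)
            continue
        for node in targets:
            for column, value in row.items():
                if column != row_key:
                    node[column] = value
    return unmatched


nodes = [{"name": "YER054C"}, {"name": "YBL079W"}]
rows = [
    {"Yeast Key": "YER054C", "Degree": 85},
    {"Yeast Key": "YLR345W", "Degree": 1},
]
unmatched = merge_by_key(nodes, rows, node_key="name", row_key="Yeast Key")
```

After the call, the first node carries a `Degree` value, the second node is unchanged, and `unmatched` reports the row whose key has no counterpart in the network.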

Text file import options

When the text file import options box is checked, several additional parameters can be selected to tune the way the data file will be imported. The first line can be used to specify column header names. Another data delimiter (default is tab) can be chosen, and the comment prefix (signifying lines to be ignored) can be defined.
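The effect of these three options (header line, delimiter, comment prefix) can be sketched as follows. The function `parse_text_table` and the fallback column names ("Column 1", "Column 2", ...) are assumptions made for illustration, not Cytoscape behavior.

```python
import csv


def parse_text_table(lines, delimiter="\t", comment_prefix=None, first_line_is_header=True):
    """Split text lines into (header, rows), honoring the three import options."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if comment_prefix and line.startswith(comment_prefix):
            continue  # ignore lines marked as comments
        kept.append(line)

    rows = list(csv.reader(kept, delimiter=delimiter))
    if not rows:
        return [], []
    if first_line_is_header:
        return rows[0], rows[1:]
    # Without a header line, generate placeholder names (an assumption of this sketch).
    header = [f"Column {i + 1}" for i in range(len(rows[0]))]
    return header, rows


lines = ["# exported table", "Yeast Key,Degree", "YER054C,85", "YBL079W,7"]
header, body = parse_text_table(lines, delimiter=",", comment_prefix="#")
```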

Import from Public Databases

You can import various kinds of ID sets from the BioMart (http://www.biomart.org/index.html) public database. The BioMart web service client is integrated into Cytoscape.

Select: File → Import → Table → Public Databases...
Select a data source. For ID mapping, select one of the Ensembl Genes data sets. You need to choose the correct species for your network.
Select Attribute. If you want to import new ID sets matching current node IDs, select shared name.
Select Data Type. This should be the type of ID set selected in Attribute list. For example, if you select shared name for Attribute and your network uses Entrez Gene ID for its node ID, you need to select EntrezGene ID(s) for Data Type.
Select new ID sets from the list. Because there is a size limitation on the results returned by the BioMart server, you can select only 3-5 attributes for each import.
Press Import.

Data in complete networks

When importing a network from a table, columns other than the node identifier can also be imported as data. For more detail on these options, please see the "Import Free-Format Table Files" section of the user manual in the Creating Networks chapter.

Table Panel

When Cytoscape is started, the Table Panel appears in the bottom CytoPanel. This panel can be hidden and restored using the View → Show/Hide Table Panel menu option. Like other CytoPanels, the panel can be undocked by pressing the little icon in the panel’s top right corner.
To swap between displaying node, edge, and network tables use the tabs on the bottom of the panel labeled "Node Table", "Edge Table", and "Network Table".
In Cytoscape 3.0 there are two display modes for the table: show selected nodes/edges only and show all rows. This configuration can be set using the leftmost button in the figure. The Table Panel displays data belonging to the currently selected network.
Using the three buttons (2nd to 4th from the left) in the figure, it is possible to make some or all columns visible and to hide others or all of them. Also, a new column can be created by pressing the 5th button from the left, and mutable columns can be deleted with the 6th button from the left. The f(x) button is for writing equations to manipulate the data, which is further explained in the section on attribute functions and equations.
Most data values can be edited by double-clicking on their table cell (only the SUID cannot be edited). Newline characters can be inserted into String attributes either by pressing Enter or by typing "\n". Once finished editing, click outside of the editing cell in the Table Panel or press Shift-Enter to save your edits. Pressing Esc while editing will undo any changes.

Table rows in the browser can be sorted by a specific column by clicking on a column heading. A new column can be created using the Create New Column button (left 5th), and must be one of four types – integer, string, real number (floating point), or boolean. Attributes can be deleted using the Delete Attributes button (left 6th, trash can icon). NOTE: Deleting attributes removes them from Cytoscape, not just the attribute browser! To remove attributes from the browser without deleting them, simply unselect the attribute using the Select Column button (left 3rd).
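The four column types can be pictured as converters applied to the text a user enters. The sketch below is an illustration only: the type labels, the `convert_column` helper, and the rule that only the text "true" (in any letter case) yields a true boolean are assumptions, not Cytoscape's documented parsing rules.

```python
# One converter per column type named in the text above.
CONVERTERS = {
    "integer": int,
    "real": float,
    "string": str,
    # Assumption: any text other than "true" (case-insensitive) is treated as False.
    "boolean": lambda text: text.strip().lower() == "true",
}


def convert_column(values, column_type):
    """Convert a list of text values to the given column type (hypothetical helper)."""
    convert = CONVERTERS[column_type]
    return [convert(value) for value in values]


degrees = convert_column(["85", "7", "1"], "integer")
flags = convert_column(["true", "False"], "boolean")
```

A value that does not fit the chosen type, such as "abc" in an integer column, raises `ValueError` in this sketch.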

The right-click menu on the Table Panel has several functions, such as exporting attribute information to spreadsheet applications. For example, use the right-click menu to select the data and copy for use in a spreadsheet application.

Ontologies

Another type of data that can be associated with network elements is ontology data. An ontology consists of an organized set of controlled vocabulary terms that annotate the objects. Most ontologies in science are organized in a hierarchical way. In biology, for example, using the Gene Ontology, the Saccharomyces cerevisiae CDC55 gene has a biological process described as “protein biosynthesis”, to which GO has assigned the number 6412 (a GO ID).

GO 8150 biological_process
  GO 7582 physiological processes
    GO 8152 metabolism
      GO 44238 primary metabolism
        GO 19538 protein metabolism
          GO 6412 protein biosynthesis

Graphical View of GO Term 6412: protein biosynthesis
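The hierarchy listed above can be represented as a child-to-parent lookup and walked from a term up to the root. This Python sketch covers only the single chain shown for GO 6412; in the full Gene Ontology a term may have several parents, which this simplified structure does not capture.

```python
# Term names and parent links copied from the hierarchy listed above.
NAMES = {
    8150: "biological_process",
    7582: "physiological processes",
    8152: "metabolism",
    44238: "primary metabolism",
    19538: "protein metabolism",
    6412: "protein biosynthesis",
}
PARENT = {6412: 19538, 19538: 44238, 44238: 8152, 8152: 7582, 7582: 8150}


def path_to_root(go_id):
    """Return the list of GO IDs from the given term up to the root term."""
    path = [go_id]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


lineage = [NAMES[go_id] for go_id in path_to_root(6412)]
```

Here `lineage` starts at "protein biosynthesis" and ends at "biological_process".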

Cytoscape can use this ontology DAG (Directed Acyclic Graph) to annotate objects in networks. The Ontology Server is a Cytoscape feature which allows you to load, navigate, and associate ontology terms to nodes and edges in a network. Cytoscape has a GUI for loading ontologies and associating them with network elements, enabling you to load both local and remote files.

Ontology and Association File Format

The standard file formats used in Cytoscape Ontology Server are OBO and Gene Association. The GO website details these file formats:
Ontologies and Definitions: http://www.geneontology.org/GO.downloads.shtml#ont
Current Associations: http://www.geneontology.org/GO.current.annotations.shtml

OBO File

An OBO file is the ontology DAG itself. This file defines the relationships between ontology terms. Cytoscape can load all ontology files written in OBO format. The full listing of ontology files is available from the Open Biomedical Ontologies (OBO) website:
OBO Ontology Browser: http://obo.sourceforge.net/browse.html
Sample OBO File - gene_ontology.obo: http://www.geneontology.org/ontology/gene_ontology_edit.obo
format-version: 1.2
date: 27:11:2006 17:12
saved-by: midori
auto-generated-by: OBO-Edit 1.002
subsetdef: goslim_generic "Generic GO slim"
subsetdef: goslim_goa "GOA and proteome slim"
subsetdef: goslim_plant "Plant GO slim"
subsetdef: goslim_yeast "Yeast GO slim"
subsetdef: gosubset_prok "Prokaryotic GO subset"
default-namespace: gene_ontology
remark: cvs version: $Revision: 5.49 $

[Term]
id: GO:0000001
name: mitochondrion inheritance
namespace: biological_process
def: "The distribution of mitochondria, including the mitochondrial genome, into daughter cells after mitosis or meiosis, mediated by interactions between mitochondria and the cytoskeleton." [GOC:mcc, PMID:10873824, PMID:11389764]
synonym: "mitochondrial inheritance" EXACT []
is_a: GO:0048308 ! organelle inheritance
is_a: GO:0048311 ! mitochondrion distribution

[Term]
id: GO:0000002
name: mitochondrial genome maintenance
namespace: biological_process
def: "The maintenance of the structure and integrity of the mitochondrial genome." [GOC:ai]
is_a: GO:0007005 ! mitochondrion organization and biogenesis
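To make the structure of the sample concrete, here is a minimal Python sketch that reads tag-value lines and [Term] stanzas like those in the sample file. It is not the parser Cytoscape uses and handles only a small part of the OBO format (no escape sequences, no trailing modifiers); the function name `parse_obo` is hypothetical.

```python
SAMPLE_OBO = """\
format-version: 1.2
default-namespace: gene_ontology

[Term]
id: GO:0000001
name: mitochondrion inheritance
namespace: biological_process
is_a: GO:0048308 ! organelle inheritance
is_a: GO:0048311 ! mitochondrion distribution

[Term]
id: GO:0000002
name: mitochondrial genome maintenance
namespace: biological_process
is_a: GO:0007005 ! mitochondrion organization and biogenesis
"""


def parse_obo(text):
    """Parse OBO text into (header, terms); every tag maps to a list of values."""
    header = {}
    terms = []
    target = header  # tag-value pairs before the first stanza belong to the header
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and whole-line comments
        if line.startswith("[") and line.endswith("]"):
            target = {}
            if line == "[Term]":
                terms.append(target)  # other stanza types are read but not kept
            continue
        tag, separator, value = line.partition(":")
        if not separator:
            continue  # not a tag-value pair
        value = value.strip()
        if tag == "is_a":
            # Drop the trailing "! term name" comment, keeping only the parent ID.
            value = value.split("!", 1)[0].strip()
        target.setdefault(tag, []).append(value)
    return header, terms


header, terms = parse_obo(SAMPLE_OBO)
```

Each parsed term keeps its `is_a` parents as a list, which is the information needed to rebuild the DAG.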

Default List of Ontologies

Cytoscape provides a list of ontologies available in OBO format. If an Internet connection is available, Cytoscape will import ontology and association files directly from the remote source. The table below lists the included ontologies.

Ontology Name	Description
Gene Ontology Full	This data source contains a full-size GO DAG, which contains all GO terms. This OBO file is written in version 1.2 format.
Generic GO slim	A subset of general GO terms, including higher-level terms only.
Yeast GO slim	A subset of GO terms for annotating yeast data sets, maintained by SGD.
Molecule role (INOH Protein name/family name ontology)	A structured controlled vocabulary of concrete and abstract (generic) protein names. This ontology is an INOH pathway annotation ontology, one of a set of ontologies intended to be used in pathway data annotation to ease data integration. This ontology is used to annotate protein names, protein family names, and generic/concrete protein names in the INOH pathway data. INOH is part of the BioPAX working group.
Event (INOH pathway ontology)	A structured controlled vocabulary of pathway-centric biological processes. This ontology is an INOH pathway annotation ontology, one of a set of ontologies intended to be used in pathway data annotation to ease data integration. This ontology is used to annotate biological processes, pathways, and sub-pathways in the INOH pathway data. INOH is part of the BioPAX working group.
Protein-protein interaction	A structured controlled vocabulary for the annotation of experiments concerned with protein-protein interactions.
Pathway Ontology	The Pathway Ontology is a controlled vocabulary for pathways that provides standard terms for the annotation of gene products.
PATO	PATO is an ontology of phenotypic qualities, intended for use in a number of applications, primarily phenotype annotation. For more information, please visit the PATO wiki (http://www.bioontology.org/wiki/index.php/PATO:Main_Page).
Mouse pathology	The Mouse Pathology Ontology (MPATH) is an ontology for mutant mouse pathology. This is Version 1.
Human disease	This ontology is a comprehensive hierarchical controlled vocabulary for human disease representation. For more information, please visit the Disease Ontology website (http://diseaseontology.sourceforge.net/).

Although Cytoscape can import all kinds of ontologies in OBO format, association files are associated with specific ontologies. Therefore, you need to provide the correct ontology-specific association file to annotate nodes/edges/networks in Cytoscape. For example, while you can annotate human network data using the GO Full ontology with human Gene Association files, you cannot use a combination of the human Disease Ontology file and human Gene Association files, because the Gene Association file is only compatible with GO.

Visualize and Browse Ontology DAG (for Advanced Users)

Relationships between ontology terms are usually represented as Directed Acyclic Graphs (DAGs). This is a special case of a network (or graph), where nodes are ontology terms and edges are relationships between terms. Ontology data is stored in the same data structure as normal networks. This enables users and App writers to visualize, browse and manipulate ontology DAGs just like other networks. The following is an example of visualization of an ontology DAG (Generic GO Slim):

Every ontology term and relationship can have attributes. In the OBO files, ontology terms have optional fields such as definition, synonyms, comments, or cross-references. These fields will be imported as node attributes. To browse those attributes, please use the attribute browser (see the example below):

Note 1: Some ontologies have a lot of terms. For example, the full Gene Ontology contains more than 20,000 terms. If you need to save memory, you can remove this ontology DAG from the Network Panel (right-click on the ontology name at the left-hand side of the screen and select Destroy Network).
Note 2: All ontology DAGs will be saved in the session file. To minimize the session file size, you can delete the ontology DAG before saving the session.
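The view of an ontology as a network, with terms as nodes and relationships as directed edges, can be sketched in Python as follows. The edges are the is_a relationships from the sample OBO file in this section; the helper names are hypothetical and the traversal is a generic breadth-first search, not Cytoscape code.

```python
from collections import deque

# (child, parent) pairs taken from the is_a lines of the sample OBO file.
IS_A_EDGES = [
    ("GO:0000001", "GO:0048308"),
    ("GO:0000001", "GO:0048311"),
    ("GO:0000002", "GO:0007005"),
]


def build_parent_index(edges):
    """Map each child term to the list of its direct parents."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, []).append(parent)
    return parents


def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # a DAG can reach the same ancestor along several paths
        seen.add(current)
        queue.extend(parents.get(current, []))
    return seen


parent_index = build_parent_index(IS_A_EDGES)
found = ancestors("GO:0000001", parent_index)
```

With only the sample edges, `found` contains the two direct parents of GO:0000001; with a full ontology loaded, the same traversal would continue up to the root terms.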

Gene Association File

The Gene Association (GA) file provides annotation only for the Gene Ontology. It is a species-specific association file for GO terms. Gene Association files will only work with Gene Ontologies and NOT others!

Sample Gene Association File (gene_association.sgd - association file for yeast):

SGD     S000003916      AAD10           GO:0006081      SGD_REF:S000042151|PMID:10572264        ISS             P       aryl-alcohol dehydrogenase (putative)        YJR155W gene    taxon:4932      20020902        SGD
SGD     S000005275      AAD14           GO:0008372      SGD_REF:S000069584      ND              C       aryl-alcohol dehydrogenase (putative)        YNL331C gene    taxon:4932      20010119        SGD

If you have a network file and an association file, they should have a common key to map attributes onto network data. If those two do not have a common key, you need to do an extra step to add a shared key by mapping the current key to a common key as described above (Node Name Mapping).
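A small Python sketch can show how such a file provides a key for mapping. It assumes the tab-delimited, 15-column Gene Association layout (with empty Qualifier and With fields in the two sample records); the column labels and helper names are chosen for this example and should be checked against the GO file format documentation.

```python
# Column order assumed for the 15-column Gene Association layout.
GA_COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence", "With", "Aspect", "DB_Object_Name",
    "Synonym", "DB_Object_Type", "Taxon", "Date", "Assigned_By",
]

# The two sample records from above, rebuilt as tab-delimited lines.
SAMPLE_LINES = [
    "\t".join(["SGD", "S000003916", "AAD10", "", "GO:0006081",
               "SGD_REF:S000042151|PMID:10572264", "ISS", "", "P",
               "aryl-alcohol dehydrogenase (putative)", "YJR155W", "gene",
               "taxon:4932", "20020902", "SGD"]),
    "\t".join(["SGD", "S000005275", "AAD14", "", "GO:0008372",
               "SGD_REF:S000069584", "ND", "", "C",
               "aryl-alcohol dehydrogenase (putative)", "YNL331C", "gene",
               "taxon:4932", "20010119", "SGD"]),
]


def parse_gene_association(lines):
    """Turn association lines into dicts keyed by column name."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        records.append(dict(zip(GA_COLUMNS, fields)))
    return records


def go_terms_by_key(records, key_column):
    """Group GO IDs by the chosen key column; '|' separates multiple values."""
    mapping = {}
    for record in records:
        for key in record[key_column].split("|"):
            if key:
                mapping.setdefault(key, set()).add(record["GO_ID"])
    return mapping


records = parse_gene_association(SAMPLE_LINES)
by_orf = go_terms_by_key(records, "Synonym")
by_symbol = go_terms_by_key(records, "DB_Object_Symbol")
```

If the network uses ORF names such as YJR155W as node identifiers, `by_orf` supplies the common key; if it uses gene symbols such as AAD10, `by_symbol` does.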

Import Gene Ontology and Gene Association Files

For convenience, Cytoscape has a list of URLs for commonly used ontology data and a complete set of Gene Association files. To import Gene Ontology and Gene Association files for the loaded networks, please follow these steps:

Important: All data sources in the preset list are remote URLs, meaning a network connection is required!

Step 1. Select an Annotation File

Select File → Import → Ontology and Annotation... to open the "Import Ontology and Annotation" window. From the Annotation dropdown list, select a gene association file for your network. For example, if you want to annotate the yeast network, select "Gene Association file for Saccharomyces cerevisiae".

Step 2. Select an Ontology File

Select an ontology data file (OBO file) from the Ontology dropdown list. If the file is not loaded yet, it will be shown in red. The first three files are Gene Ontology files. You can load other ontologies, but you need your own annotation file to annotate networks.

Step 3. Import the files

Once you click the Import button, Cytoscape will start loading the OBO and Gene Association files from the remote sources. If you choose GO Full, loading may take a while since it is a large data file.

Cytoscape_3/UserManual/Attributes (last edited 2013-12-11 00:20:38 by KristinaHanspers)
