(Under construction!!)

Introduction

Starting with version 2.6, Cytoscape can act as a web service client for public biological databases. In this tutorial, you will learn how to use Cytoscape as a data integration platform on top of those databases.

What is a Web Service?

A web service is a standardized mechanism for computers to exchange data. Many public biological databases are accessible over the Internet, and a growing number of them support web services, which makes them accessible from client programs. This means you can search for and retrieve interactions and annotations directly from a client program. Cytoscape has worked as a web service client since version 2.6, so you can access those databases directly from your Cytoscape Desktop.

Goal of this Tutorial

This tutorial covers the following Cytoscape functions:

How to import networks from public databases
How to import annotations and map IDs
How to merge networks from multiple data sources
How to map known pathways onto interaction networks

This is a fairly complicated tutorial that uses multiple plugins and multiple data sources, so I assume you already know the basics of Cytoscape. If not, please finish the basic tutorials first.

Tutorial: Integrate Known Information about PPAR-Gamma

Setup

To follow this tutorial, you need to install the following plugins:

Pathway Commons Plugin (installed by default)
NCBIClient Plugin
NCBIEntrezGeneUserInterface Plugin
IntActWSClient Plugin
BiomartClient Plugin
BiomartUserInterface Plugin
AgilentLiteratureSearch Plugin
MiMI Plugin
AdvancedNetworkMerge Plugin
RubyScriptingEngine Plugin
ScriptingEngineManager Plugin
Enhanced Search Plugin

Note: Out of Memory Problem

When you load many plugins at once, Cytoscape sometimes crashes even if your machine has plenty of memory. This happens because the JVM memory region called the permanent generation (PermGen) is full. To avoid this problem, you need to edit the following file:

# for Mac/Linux
cytoscape.sh

# for Windows
cytoscape.bat

In the file, you can see the following options:

 -Xss5M -Xmx1024M -XX:MaxPermSize=128m

In general, increasing -XX:MaxPermSize lets you load more plugins at once. The default size is 64M, so 128M is probably enough in most cases.
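If you prefer to script the change rather than edit the launcher by hand, the following is a minimal Ruby sketch of rewriting the option line. The helper name set_max_perm_size and the value 256m are illustrative assumptions, not part of Cytoscape; the sketch only transforms a string and does not touch cytoscape.sh or cytoscape.bat.

```ruby
# Hypothetical helper (not part of Cytoscape): returns the option line with
# -XX:MaxPermSize set to new_size, appending the flag if it is missing.
def set_max_perm_size(options_line, new_size)
  flag = "-XX:MaxPermSize=#{new_size}"
  pattern = /-XX:MaxPermSize=\S+/
  if options_line =~ pattern
    options_line.sub(pattern, flag)
  else
    "#{options_line.rstrip} #{flag}"
  end
end

# The option line as it appears in the launcher script.
line = " -Xss5M -Xmx1024M -XX:MaxPermSize=128m"

# 256m is only an example value for the illustration.
puts set_max_perm_size(line, "256m")
```

Back up the launcher script before applying a change like this to the real file.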

Tutorial 1: Search by Keyword

In this exercise, you are going to learn how to search interactions by keyword.

Import Interactions and Annotation from NCBI Entrez Gene

Select File-->Import-->Network from Web Services...
Set Data Source to NCBI Entrez EUtilities Web Service Client
In the Query window, type pparg AND human[ORGN]. This query searches the Entrez Gene database for PPARG in human.
Press the Search button. This process takes a while.
When the client finds matched entries, it pops up a confirmation dialog. Press Yes to proceed.
You will be asked for a network name. Type PPAR-Gamma from NCBI.
Select Layout-->yFiles-->Organic and apply the layout.
In the main desktop, you can see a network generated from the NCBI Entrez Gene data sets. Entrez Gene stores interaction data from three databases: BIND, BioGRID, and HPRD. Edge color represents the source of the interaction data. attachment:ncbi1.png attachment:ncbi2.png
Select File-->Import-->Import Attributes from NCBI Entrez Gene
Make sure Attribute is set to ID and check all attributes on the list. attachment:ncbi3.png
Press Import. This takes several minutes, depending on network speed.
Now you have a network annotated with data from the Entrez Gene database. attachment:ncbi35.png
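
For readers curious what a keyword query such as pparg AND human[ORGN] looks like outside Cytoscape, here is a minimal Ruby sketch that builds the corresponding NCBI E-utilities esearch URL. This is an illustration only: it is an assumption that the Cytoscape client issues a comparable request, and the helper name entrez_gene_search_url and the retmax default are hypothetical. The sketch builds the URL but does not send any request.

```ruby
require "uri"

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical helper: builds an esearch URL against the Entrez Gene database.
# retmax (maximum number of IDs returned) is an illustrative default.
def entrez_gene_search_url(term, retmax = 20)
  params = { "db" => "gene", "term" => term, "retmax" => retmax }
  "#{EUTILS_ESEARCH}?#{URI.encode_www_form(params)}"
end

# The same query string used in the tutorial step above.
url = entrez_gene_search_url("pparg AND human[ORGN]")
puts url
```

Opening the printed URL in a browser returns an XML list of matching Entrez Gene IDs.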

Import Known Pathways and Interactions from Pathway Commons

Use the Pathway Commons client to search for PPARG in human. attachment:ncbi4.png
Load all pathways and merge them into one big network.

Import Binary Interactions from IntAct

Select File-->Import-->Network from Web Services...
Set Data Source to IntAct Web Service Client
In the Query box, type PPARG AND species:human
Press Search
At this point, the IntAct client has imported only direct interactions with PPAR-Gamma.
On the new network view, select all nodes. Then right-click one of the selected nodes and select Use Web Services-->IntAct Web Service Client-->Get neighbours by ID(s) (see the screenshot below)
- attachment:intact1.png
Now the network includes nodes within two hops from PPAR-Gamma.
attachment:intact2.png
At this point, your workspace should look like the following (use View-->Arrange Network Windows-->Tiled to arrange network views):
- attachment:intact3.png
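
The difference between the first import (direct interactions only) and the network after Get neighbours by ID(s) (nodes within two hops) can be illustrated with a small breadth-first search. The sketch below is Ruby and uses toy data: the partner names A to E are placeholders, not real IntAct results, and nodes_within is a hypothetical helper, not a Cytoscape or IntAct function.

```ruby
# Toy interaction data for illustration only (NOT real IntAct results).
# Each key maps to its direct interaction partners.
INTERACTIONS = {
  "PPARG" => ["A", "B"],
  "A"     => ["PPARG", "C"],
  "B"     => ["PPARG", "D"],
  "C"     => ["A", "E"],
  "D"     => ["B"],
  "E"     => ["C"]
}

# Hypothetical helper: breadth-first search that returns a hash of
# node => hop distance for every node within max_hops of start.
def nodes_within(graph, start, max_hops)
  distance = { start => 0 }
  queue = [start]
  until queue.empty?
    node = queue.shift
    next if distance[node] == max_hops # do not expand beyond the limit
    graph.fetch(node, []).each do |neighbor|
      next if distance.key?(neighbor)  # already reached by a shorter path
      distance[neighbor] = distance[node] + 1
      queue << neighbor
    end
  end
  distance
end

one_hop  = nodes_within(INTERACTIONS, "PPARG", 1).keys
two_hops = nodes_within(INTERACTIONS, "PPARG", 2).keys

puts "Direct partners only: #{one_hop.join(', ')}"
puts "Within two hops:      #{two_hops.join(', ')}"
```

In this toy graph, node E is three hops away from PPARG, so it appears in neither list.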

Import KEGG Pathway using BioRuby

There are several options for importing KEGG pathways, but none of them is complete because of the complex pathway data structure. At this point, the following is the most complete solution for importing KEGG pathways.

Download the following [http://www.ruby-lang.org/en/ Ruby] script to your current working directory. attachment:kegg_relation_mapper_for_bioruby_console.rb

Select Plugins-->Scripting Language Consoles-->Open Ruby Console. This command initializes the [http://bioruby.org/ BioRuby] Console and takes a few moments.
Check the location of your script file. cd to the location if necessary. attachment:kegg1.png

Search the pathway related to PPAR-Gamma. From the console, type:
keggapi.bfind('pathway pparg human')

This command invokes BioRuby's KEGG API, which takes a while to initialize. The command searches the KEGG Pathway database using the keywords pparg and human. The result should look like the following:

bioruby> keggapi.bfind('pathway pparg human')
JRuby limited openssl loaded. gem install jruby-openssl for full support.
http://wiki.jruby.org/wiki/JRuby_Builtin_OpenSSL
  ==> "path:hsa03320 PPAR signaling pathway - Homo sapiens (human); Peroxisome proliferator-activated receptors (PPARs) are nuclear hormone receptors that are activated by fatty acids and their derivatives. PPAR has three subtypes (PPARalpha, beta/delta, and gamma) showing different expression patterns in vertebrates. Each of them is encoded in a separate gene and binds fatty acids and eicosanoids. PPARalpha plays a role in the clearance of circulating or cellular lipids via the regulation of gene expression involved in lipid metabolism in liver and skeletal muscle. PPARbeta/delta is involved in lipid oxidation and cell proliferation. PPARgamma promotes adipocyte differentiation to enhance blood glucose uptake.\n"
bioruby>

We have now found a KEGG pathway related to the PPAR-Gamma gene: path:hsa03320, the PPAR signaling pathway. Let's import this pathway into Cytoscape.
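
The pathway ID can also be extracted from the result string programmatically instead of being copied by hand. The following is a minimal Ruby sketch that assumes the result has the shape shown in the transcript above (one entry per line, each starting with the ID). The helper name pathway_ids is hypothetical, and the sample string is a shortened copy of the output above.

```ruby
# Shortened copy of the bfind result shown in the transcript above.
result = "path:hsa03320 PPAR signaling pathway - Homo sapiens (human); " \
         "Peroxisome proliferator-activated receptors (PPARs) are nuclear " \
         "hormone receptors that are activated by fatty acids and their derivatives.\n"

# Hypothetical helper: returns the pathway ID at the start of each result line.
# Lines that do not start with a "path:" ID are skipped.
def pathway_ids(bfind_result)
  bfind_result.split("\n").map { |line| line[/\Apath:[a-z]+\d+/] }.compact
end

ids = pathway_ids(result)
pathway_id = ids.first
puts pathway_id
```

In the console, you would pass the return value of keggapi.bfind to the helper instead of the sample string.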

Type:
pathway_id = 'path:hsa03320'
Then run the script:
source 'YOUR_SCRIPT_NAME'
If the script is not in your current working directory, you need to use the full path. After a few moments, you will see the following relation diagram of the genes on pathway path:hsa03320 (a custom VizMapper style is applied in the following screenshot). attachment:kegg2.png
This script creates two additional attributes, KEGG ID and Entrez Gene ID. You can use them as node labels to make the diagram more meaningful. attachment:kegg3.png

Tutorial 2: Start from List of Genes

Tutorial 3: Merge Multiple Networks

Tutorial 4:

Optional

How did I get the list of genes on a specific pathway?

This is a bit outside the focus of this protocol, but here is how I got the original list of genes. To do the following, you need to install the RubyScriptingEngine Plugin.
Here is how:
Open the BioRuby Console and get the list of genes by using the KEGG API:
# First, get list of genes for KEGG Pathway mmu03320 (PPAR Signaling Pathway)

  . . . B i o R u b y   i n   t h e   s h e l l . . .

  Version : BioRuby 1.2.1 / Ruby 1.8.6

bioruby> gene_list = keggapi.get_genes_by_pathway("path:mmu03320")
JRuby limited openssl loaded. gem install jruby-openssl for full support.
http://wiki.jruby.org/wiki/JRuby_Builtin_OpenSSL
  ==> ["mmu:103968", "mmu:104086", "mmu:108078", "mmu:11363", "mmu:11364", "mmu:113868", "mmu:11430", "mmu:11450", "mmu:11770", "mmu:11806", "mmu:11807", "mmu:11814", "mmu:11832", "mmu:12140", "mmu:12491", "mmu:12894", "mmu:12895", "mmu:12896", "mmu:13117", "mmu:13118", "mmu:13119", "mmu:13122", "mmu:13124", "mmu:13167", "mmu:14077", "mmu:14079", "mmu:14080", "mmu:14081", "mmu:14626", "mmu:14933", "mmu:15360", "mmu:16202", "mmu:16204", "mmu:16592", "mmu:16956", "mmu:17436", "mmu:18534", "mmu:18607", "mmu:18830", "mmu:19013", "mmu:19015", "mmu:19016", "mmu:20181", "mmu:20182", "mmu:20183", "mmu:20249", "mmu:20250", "mmu:20280", "mmu:20411", "mmu:216739", "mmu:22190", "mmu:22227", "mmu:22259", "mmu:225579", "mmu:235674", "mmu:26457", "mmu:26458", "mmu:26459", "mmu:26569", "mmu:30049", "mmu:433256", "mmu:50790", "mmu:56473", "mmu:57875", "mmu:622384", "mmu:66113", "mmu:669888", "mmu:74147", "mmu:74205", "mmu:74551", "mmu:78070", "mmu:80911", "mmu:83995", "mmu:83996", "mmu:93732"]
bioruby> query = gene_list.join(" ").gsub(/mmu:/, "")
  ==> "103968 104086 108078 11363 11364 113868 11430 11450 11770 11806 11807 11814 11832 12140 12491 12894 12895 12896 13117 13118 13119 13122 13124 13167 14077 14079 14080 14081 14626 14933 15360 16202 16204 16592 16956 17436 18534 18607 18830 19013 19015 19016 20181 20182 20183 20249 20250 20280 20411 216739 22190 22227 22259 225579 235674 26457 26458 26459 26569 30049 433256 50790 56473 57875 622384 66113 669888 74147 74205 74551 78070 80911 83995 83996 93732"
bioruby>
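
The gsub call above removes only the mouse prefix mmu:. As a small variation, the following Ruby sketch strips whatever organism prefix precedes the colon, so the same step would work for human (hsa:) or other organisms. The helper name strip_organism_prefix is hypothetical, and the sample IDs are the first three entries of the list above.

```ruby
# Hypothetical helper: drops the organism prefix ("mmu:", "hsa:", ...) from
# each KEGG gene ID and keeps the part after the first colon.
def strip_organism_prefix(kegg_ids)
  kegg_ids.map { |id| id.split(":", 2).last }
end

# First three IDs from the gene list shown above.
sample = ["mmu:103968", "mmu:104086", "mmu:108078"]

# Space-separated query string, in the same format as the transcript above.
query = strip_organism_prefix(sample).join(" ")
puts query
```

IDs that contain no colon are passed through unchanged, so the helper is safe to apply to an already cleaned list.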
By learning a few simple BioRuby commands, you gain access to many functions for querying KEGG and other databases. For more information, please visit:

Presentations/08_Web_Services