Differences between revisions 33 and 34

(Under construction!!)

Integrate Networks and Annotation by Web Services

Introduction

Starting with version 2.6, Cytoscape can act as a web service client for public biological databases. In this tutorial, you will learn how to use Cytoscape as a data integration platform that draws on those databases.

What is a Web Service?

A web service is a standardized mechanism for computers to exchange data. Many public biological databases are now accessible over the Internet, and a growing number of them support web services. This means you can search and retrieve interactions and annotations directly from client programs. Since version 2.6, Cytoscape works as a web service client, so you can access those databases directly from the Cytoscape Desktop.

Goal of this Tutorial

This tutorial covers the following Cytoscape functions:

How to import networks from public databases
How to import annotations and map IDs
How to merge networks from multiple data sources
How to map known pathways onto interaction networks

This is a fairly complicated tutorial that uses multiple plugins and multiple data sources, so I assume you already know the basics of Cytoscape. If not, please finish the basic tutorials first.

Tutorial: Integrate Known Information about PPAR-Gamma

Setup

To do this tutorial, you need to install the following plugins:

Data Merge

AdvancedNetworkMerge Plugin

Scripting

RubyScriptingEngine Plugin
ScriptingEngineManager Plugin

Search

Enhanced Search Plugin

Note: Out of Memory Problem

When you load many plugins at once, Cytoscape sometimes crashes even if your machine has plenty of memory. This happens because the JVM memory region called the permanent generation is full. To avoid this problem, edit the following file:

# for Mac/Linux
cytoscape.sh

# for Windows
cytoscape.bat

In the file, you can see the following options:

 -Xss5M -Xmx1024M -XX:MaxPermSize=128m

In general, increasing -XX:MaxPermSize lets you load more plugins at once. The default size is 64M, so 128M is probably enough in most cases.
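As a rough illustration of the edit described above, the following Ruby sketch rewrites the JVM option string so that -XX:MaxPermSize is raised. The helper name raise_max_perm_size and the 256m value are assumptions made for this example only; in practice you simply change the value in cytoscape.sh or cytoscape.bat with a text editor.

```ruby
# Hypothetical helper (not part of Cytoscape): replace the value of
# -XX:MaxPermSize in a JVM option string, or append the flag if it is missing.
def raise_max_perm_size(options, new_size)
  flag = "-XX:MaxPermSize=#{new_size}"
  if options =~ /-XX:MaxPermSize=\S+/
    options.sub(/-XX:MaxPermSize=\S+/, flag)
  else
    "#{options.strip} #{flag}"
  end
end

# The option string shown above; 256m is an arbitrary example value.
jvm_options = "-Xss5M -Xmx1024M -XX:MaxPermSize=128m"
puts raise_max_perm_size(jvm_options, "256m")
```

The other options (-Xss, -Xmx) are left untouched, which mirrors what you would do by hand.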

Tutorial 1: Search by Keyword

In this exercise, you are going to learn how to search interactions by keyword.

Import Interactions and Annotation from NCBI Entrez Gene

Select File-->Import-->Network from Web Services...
Set Data Source to NCBI Entrez EUtilities Web Service Client
In the Query window, type pparg AND human[ORGN]. This query searches the Entrez Gene database for PPARG, restricted to human.
Press the Search button. This process takes a while.
When the client finds matching entries, it pops up a confirmation dialog. Press Yes to proceed.
You will be asked to type a network name. Type PPAR-Gamma from NCBI.
Select Layout-->yFiles-->Organic and apply the layout.
In the main desktop, you can see a network generated from the NCBI Entrez Gene data sets. Entrez Gene stores interaction data from three databases: BIND, BioGRID, and HPRD. Edge color represents the source of the interaction data. attachment:ncbi1.png attachment:ncbi2.png
Select File-->Import-->Import Attributes from NCBI Entrez Gene
Make sure Attribute is set to ID and check all attributes on the list attachment:ncbi3.png
Press Import. This takes several minutes, depending on network speed.
Now you have a network annotated with data from the Entrez Gene database attachment:ncbi35.png
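For readers curious about what the keyword query looks like outside Cytoscape, here is a minimal sketch that encodes the same search term into an NCBI E-utilities esearch URL. It only builds the URL and does not contact NCBI. Using the plain esearch endpoint with db=gene is my assumption for illustration; the Cytoscape client may build its requests differently, and the helper name entrez_gene_search_url is hypothetical.

```ruby
require "uri"

# Assumption: the plain NCBI E-utilities "esearch" endpoint is used here only
# to show how the query text is encoded; the Cytoscape client may differ.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical helper: build a search URL against the Entrez Gene database.
def entrez_gene_search_url(term)
  params = URI.encode_www_form("db" => "gene", "term" => term)
  "#{ESEARCH_URL}?#{params}"
end

# Same query text as typed into the Query window above.
puts entrez_gene_search_url("pparg AND human[ORGN]")
```

The [ORGN] field tag is what restricts the search to one organism; spaces and brackets are percent-encoded when the term is placed in a URL.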

Import Known Pathways and Interactions from Pathway Commons

Use Pathway Commons and search for PPARG in human attachment:ncbi4.png
Load all pathways and merge them into one big network

Import Binary Interactions from IntAct

Select File-->Import-->Network from Web Services...
Set Data Source to IntAct Web Service Client
In the Query box, type PPARG AND species:human
Press Search
At this point, the IntAct client has imported only direct interactions with PPAR-Gamma.
On the new network view, select all nodes. Then right-click one of the selected nodes and select Use Web Services-->IntAct Web Service Client-->Get neighbours by ID(s) (see the screenshot below)
- attachment:intact1.png
Now the network includes nodes within two hops from PPAR-Gamma.
attachment:intact2.png
At this point, your workspace should look like the following (use View-->Arrange Network Windows-->Tiled to arrange network views):
- attachment:intact3.png
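The neighbour expansion used in the IntAct steps above can be pictured with a small sketch: start from PPAR-Gamma, add its direct partners (one hop), then add the partners of those nodes (two hops). The edge list below is a made-up toy example, not real IntAct data, and the helper names are hypothetical.

```ruby
require "set"

# Toy undirected interaction list; the partner names are placeholders,
# not real IntAct results.
EDGES = [
  ["PPARG", "A"], ["PPARG", "B"], ["A", "C"], ["B", "D"], ["D", "E"]
]

# Build an adjacency map: node => Set of direct partners.
def adjacency(edges)
  adj = Hash.new { |h, k| h[k] = Set.new }
  edges.each do |a, b|
    adj[a] << b
    adj[b] << a
  end
  adj
end

# Return the given nodes plus every node directly connected to one of them.
def expand_by_neighbours(adj, nodes)
  nodes.reduce(Set.new(nodes)) { |acc, n| acc | adj.fetch(n, Set.new) }
end

adj = adjacency(EDGES)
one_hop  = expand_by_neighbours(adj, ["PPARG"])   # the initial import
two_hops = expand_by_neighbours(adj, one_hop)     # after "Get neighbours by ID(s)"
puts one_hop.to_a.sort.join(" ")
puts two_hops.to_a.sort.join(" ")
```

In this toy network, node E is three hops away from PPARG, so it does not appear after the second expansion.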

Import KEGG Pathway using BioRuby

There are several options to import KEGG pathways, but none of them are complete due to the complex pathway data structure. At this point, the following is the most complete solution for importing KEGG Pathways.

Download the following [http://www.ruby-lang.org/en/ Ruby] script to your current working directory. attachment:kegg_relation_mapper_for_bioruby_console.rb

Select Plugins-->Scripting Language Consoles-->Open Ruby Console. This command initializes the [http://bioruby.org/ BioRuby] Console and takes a few moments.
Check the location of your script file. cd to the location if necessary. attachment:kegg1.png

Search the pathway related to PPAR-Gamma. From the console, type:
keggapi.bfind('pathway pparg human')

This command invokes BioRuby's KEGG API, which takes a while to initialize. It searches the KEGG Pathway database using the keywords pparg and human. The result should look like the following:

bioruby> keggapi.bfind('pathway pparg human')
JRuby limited openssl loaded. gem install jruby-openssl for full support.
http://wiki.jruby.org/wiki/JRuby_Builtin_OpenSSL
  ==> "path:hsa03320 PPAR signaling pathway - Homo sapiens (human); Peroxisome proliferator-activated receptors (PPARs) are nuclear hormone receptors that are activated by fatty acids and their derivatives. PPAR has three subtypes (PPARalpha, beta/delta, and gamma) showing different expression patterns in vertebrates. Each of them is encoded in a separate gene and binds fatty acids and eicosanoids. PPARalpha plays a role in the clearance of circulating or cellular lipids via the regulation of gene expression involved in lipid metabolism in liver and skeletal muscle. PPARbeta/delta is involved in lipid oxidation and cell proliferation. PPARgamma promotes adipocyte differentiation to enhance blood glucose uptake.\n"
bioruby>

We have now found a KEGG pathway related to the PPAR-Gamma gene: path:hsa03320 PPAR signaling pathway. Let's import this pathway into Cytoscape.
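If a search returns several entries, you may prefer to pull the identifiers out of the result string rather than read them by eye. The following is a minimal sketch in plain Ruby, assuming the result has the shape shown in the bfind output above (one entry per line, with the identifier first). The sample string is an abbreviated copy of that output.

```ruby
# Abbreviated copy of the bfind output shown above; one entry per line.
result = "path:hsa03320 PPAR signaling pathway - Homo sapiens (human); " \
         "Peroxisome proliferator-activated receptors (PPARs) are nuclear hormone receptors.\n"

# Each line starts with the identifier, followed by a space and the description.
pathway_ids = result.each_line.map { |line| line.split(" ", 2).first }
puts pathway_ids.inspect
```

The first element of pathway_ids is the value to assign to pathway_id in the next step.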

Type:
pathway_id = 'path:hsa03320'
Then run the script:
source 'YOUR_SCRIPT_NAME'
If the script is not in your current working directory, you need to use the full path. After a few moments, you can see the following relation diagram of genes on pathway path:hsa03320 (a custom VizMapper style is applied in the following screenshot). attachment:kegg2.png
This script creates two additional attributes, KEGG ID and Entrez Gene ID. You can use them for node labels to make the diagram more meaningful. attachment:kegg3.png

Import Pathways from WikiPathways

[http://www.wikipathways.org/index.php/WikiPathways WikiPathways] is a database of curated pathways with a wiki-style interface. Pathway data files are available in GPML format (the standard data file format for [http://www.genmapp.org/default.html GenMAPP]), and they are readable in Cytoscape. In addition, the GPML plugin supports direct pathway import from WikiPathways using Cytoscape's web service client framework.

Install [http://www.pathvisio.org/Cytoscape_plugin GPML Plugin]. You can install it from Plugin Manager.
Select File-->Import-->Network from Web Services...
Set Data Source to WikiPathways Web Service Client

Extract Interactions from Publications by Agilent Literature Search

Please try the [:Presentations/05_Literature:Literature Searching] tutorial first. In this section, you will learn how to add extra annotations to the network generated by Agilent Literature Search.

Import Interactions from MiMI Database

Tutorial 2: Start from List of Genes

Suppose you have a list of genes and you want to see known interactions of those genes in Cytoscape. In this section, you will learn how to

Import Known Interaction of Genes from Entrez Gene Database

Prepare a list of genes. In this example, we are going to use the following:
10062 10580 10998 10999 11001 116519 126129 1374 1375 1376 1579 1581 1582 1593 1622 1962 2167 2168 2169 2170 2171 2172 2173 2180 2181 2182 23305 2710 2712 284541 28965 30 3158 33 335 336 34 345 3611 364 376497 4023 4199 4312 4973 51 5105 5106 51129 5170 51703 5346 5360 5465 5467 5468 6256 6257 6258 6319 6342 642956 7316 7350 8309 8310 9370 9415 948
In this case, the ID set is Entrez Gene ID. You can use other ID sets as a query, but using Entrez Gene IDs minimizes the search time.
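Before pasting a long ID list into the Query box, it can help to clean it up. Here is a small sketch that keeps only purely numeric tokens (the shape of an Entrez Gene ID) and removes duplicates. The sample input uses a few IDs from the list above plus a deliberately added duplicate and a gene symbol, both of which are made up for the example.

```ruby
# A few IDs from the list above, with a deliberate duplicate and a
# non-numeric token added to show the clean-up (both additions are made up).
raw = "10062 10580 10998 10062 PPARG 5468"

tokens   = raw.split
ids      = tokens.select { |t| t =~ /\A\d+\z/ }.uniq   # numeric tokens only, no repeats
rejected = tokens.reject { |t| t =~ /\A\d+\z/ }        # anything that is not an Entrez Gene ID

# Space-separated string, ready to paste into the Query box.
query = ids.join(" ")
puts query
puts "rejected: #{rejected.join(' ')}" unless rejected.empty?
```

Rejected tokens are reported rather than silently dropped, so you can decide whether to map them to Entrez Gene IDs first.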

Select File-->Import-->Network from Web Services...
Set Data Source to NCBI Entrez EUtilities Web Service Client
Paste the gene ID list into the Query box
Press Search. Again, this process takes several minutes, depending on network status.
Name the network and press OK.
After applying Organic layout, the network looks like the following:
- attachment:ncbi_gene_list1.png
Next, import annotations for them, following the same protocol as described in the first section.
When the import is done, the annotations for the genes look like the following:
attachment:ncbi_gene_list2.png
You can check the location of the genes you entered as the query by using the Enhanced Search plugin. Paste the list of gene IDs into the ESP search box on the toolbar and press Enter. The genes in the original query will be selected.
attachment:ncbi_gene_list3.png attachment:ncbi_gene_list4-1.png

Mark Original Nodes

In some cases, it is useful to remember those genes as the origin of this interaction network. This is especially useful when you merge multiple networks. You can do it by using Attribute Browser's functions.

Assume nodes in the original query are already selected.

In the Node Attribute Browser window, you can see the icon to create a new attribute. Press the icon.
Select String Attribute
Name the new attribute.
On the right side of the Browser, you can see an icon called Batch Attribute Editor. Press the icon.
In the Operation tab, select Set and then select the name of the attribute you created from the combo box.
Type the value. In this example, I use query1 as the attribute value.
- attachment:ncbi_gene_list4.png
Press Go. The new attribute is set on the selected nodes. Close the window.
Now you can use it in a Visual Style to make the nodes from the original list stand out. The following example uses the new attribute to control node size, shape, and color.
attachment:ncbi_gene_list5.png attachment:ncbi_gene_list6.png
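Conceptually, the batch edit above assigns one string value to every selected node and leaves the others empty. The sketch below models that with plain Ruby hashes. It does not use the Cytoscape API; the attribute name origin, the node neighbour_1, and the helper set_attribute are all hypothetical.

```ruby
# Hypothetical in-memory stand-in for node attributes: node ID => attribute hash.
# "neighbour_1" is a placeholder for a node that was not in the original query.
nodes = {
  "5468" => {}, "6256" => {}, "neighbour_1" => {}
}

# Set one attribute to the same value on every selected node that exists.
def set_attribute(nodes, selected_ids, name, value)
  selected_ids.each do |id|
    nodes[id][name] = value if nodes.key?(id)
  end
  nodes
end

selected = ["5468", "6256"]                 # nodes from the original query
set_attribute(nodes, selected, "origin", "query1")

nodes.each do |id, attrs|
  puts "#{id}\t#{attrs.fetch('origin', '')}"
end
```

A visual style can then branch on whether the attribute equals query1, which is what the screenshots above show for size, shape, and color.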

Tutorial 3: Merge Multiple Networks

attachment:merge_final.png

Example visualization of the integrated network. All networks are merged, and the big red nodes are genes on the PPAR signaling pathway (path:hsa03320) in KEGG. PPAR-Gamma is selected.
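To give a feel for what merging means here, the following sketch performs a simple union merge of two toy edge lists, assuming nodes from different sources are matched by a shared identifier such as Entrez Gene ID. This is only an illustration of the idea, not the algorithm used by the AdvancedNetworkMerge Plugin, and the partner IDs x1 and x2 are placeholders.

```ruby
# Toy edge lists keyed by data source; x1 and x2 are placeholder node IDs.
networks = {
  "NCBI"   => [["5468", "6256"], ["5468", "x1"]],
  "IntAct" => [["6256", "5468"], ["5468", "x2"]]
}

# Union merge: an undirected edge is identified by its sorted pair of node IDs,
# and every source that reports it is recorded once.
merged = Hash.new { |h, k| h[k] = [] }
networks.each do |source, edges|
  edges.each do |a, b|
    key = [a, b].sort
    merged[key] << source unless merged[key].include?(source)
  end
end

merged.each { |(a, b), sources| puts "#{a} - #{b}\t#{sources.join(',')}" }
```

Keeping the list of sources per edge preserves the provenance information that the edge colors convey in the individual networks.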

Optional

How to get list of genes on a specific pathway

This is slightly outside the scope of this protocol, but here is how I got the original list of genes. To do the following, you need to install the RubyScriptingEngine Plugin.
Here is how:
Open the BioRuby Console and get the list of genes by using the KEGG API
. . . B i o R u b y   i n   t h e   s h e l l . . .
  Version : BioRuby 1.2.1 / Ruby 1.8.6

bioruby>
Get list of genes for KEGG Pathway mmu03320 (PPAR Signaling Pathway)
bioruby> gene_list = keggapi.get_genes_by_pathway("path:mmu03320")
JRuby limited openssl loaded. gem install jruby-openssl for full support.
http://wiki.jruby.org/wiki/JRuby_Builtin_OpenSSL
  ==> ["mmu:103968", "mmu:104086", "mmu:108078", "mmu:11363", "mmu:11364", "mmu:113868", "mmu:11430", "mmu:11450", "mmu:11770", "mmu:11806", "mmu:11807", "mmu:11814", "mmu:11832", "mmu:12140", "mmu:12491", "mmu:12894", "mmu:12895", "mmu:12896", "mmu:13117", "mmu:13118", "mmu:13119", "mmu:13122", "mmu:13124", "mmu:13167", "mmu:14077", "mmu:14079", "mmu:14080", "mmu:14081", "mmu:14626", "mmu:14933", "mmu:15360", "mmu:16202", "mmu:16204", "mmu:16592", "mmu:16956", "mmu:17436", "mmu:18534", "mmu:18607", "mmu:18830", "mmu:19013", "mmu:19015", "mmu:19016", "mmu:20181", "mmu:20182", "mmu:20183", "mmu:20249", "mmu:20250", "mmu:20280", "mmu:20411", "mmu:216739", "mmu:22190", "mmu:22227", "mmu:22259", "mmu:225579", "mmu:235674", "mmu:26457", "mmu:26458", "mmu:26459", "mmu:26569", "mmu:30049", "mmu:433256", "mmu:50790", "mmu:56473", "mmu:57875", "mmu:622384", "mmu:66113", "mmu:669888", "mmu:74147", "mmu:74205", "mmu:74551", "mmu:78070", "mmu:80911", "mmu:83995", "mmu:83996", "mmu:93732"]
Remove the prefix and join the IDs into a single query string. Since KEGG uses the Entrez Gene ID as part of its identifier, you can copy and paste the result as a list of Entrez Gene IDs.
bioruby> query = gene_list.join(" ").gsub(/mmu:/, "")
  ==> "103968 104086 108078 11363 11364 113868 11430 11450 11770 11806 11807 11814 11832 12140 12491 12894 12895 12896 13117 13118 13119 13122 13124 13167 14077 14079 14080 14081 14626 14933 15360 16202 16204 16592 16956 17436 18534 18607 18830 19013 19015 19016 20181 20182 20183 20249 20250 20280 20411 216739 22190 22227 22259 225579 235674 26457 26458 26459 26569 30049 433256 50790 56473 57875 622384 66113 669888 74147 74205 74551 78070 80911 83995 83996 93732"
bioruby>
By learning a few simple BioRuby commands, you gain access to many functions for querying KEGG and other databases. For more information, please visit the [http://bioruby.org/ BioRuby Web Site].

Presentations/08_Web_Services (last edited 2009-04-08 17:07:33 by KeiichiroOno)

-  ← Revision 33 as of 2008-10-14 00:12:04 →
  Size: 14011
  Editor: KeiichiroOno
  Comment:
+  ← Revision 34 as of 2008-10-14 00:32:56 →
  Size: 14804
  Editor: KeiichiroOno
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 27:
+=== Network Import Clients ===
-Line 34:
+Line 35:
-. MiMI plugin
+. MiMI Plugin

 1. GPML Plugin



=== Data Merge ===
-Line 36:
+Line 40:
+=== Scripting ===
-Line 38:
+Line 44:
+=== Search ===
-Line 167:
+Line 175:
+[http://www.wikipathways.org/index.php/WikiPathways WikiPathways] is a database of curated pathways using Wiki-style interface.  Pathway data files are available as '''''GPML''''' format (standard data file format for [http://www.genmapp.org/default.html GenMAPP]) and they are readable in Cytoscape.  In addition, GPML plugin supports direct pathway import from WikiPathways using Cytoscape's web service client framework.



 1. Install [http://www.pathvisio.org/Cytoscape_plugin GPML Plugin].  You can install it from Plugin Manager.

 1. Select '''''File-->Import-->Network from Web Services...'''''

 1. Set ''Data Source'' to '''''WikiPathways Web Service Client'''''

 1.

Diff for "Presentations/08_Web_Services"