PAGE Tutorial

From Icbwiki

Revision as of 21:08, 29 May 2012 (Ole2001)


  • Get PAGE from svn

From our subversion repository:

svn co --username=guest --password=email@email.com https://pbtech-vc.med.cornell.edu/public/svn/elementolab/PAGE/trunk PAGE/

(Elemento lab members should use their PBtech identifiers instead of the guest ones)


  • Or download it (less up-to-date version)
http://physiology.med.cornell.edu/faculty/elemento/lab/files/PAGE.zip
unzip PAGE.zip
  • To install
cd PAGE
make clean
make
export PAGEDIR=`pwd`
export PATH=$PATH:$PAGEDIR

(If export does not work, you are probably using csh or tcsh; use setenv instead:)

setenv PAGEDIR `pwd`
setenv PATH ${PATH}:${PAGEDIR}
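
To confirm that the installation step took effect, a quick check along these lines may help. This is only a sketch: the helper name on_path is hypothetical, and it assumes an sh-compatible shell (it will not work as written in csh or tcsh).

```shell
# Report whether a command can be found on the current PATH.
# NOTE: on_path is a hypothetical helper name, not part of PAGE.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

if on_path page.pl; then
    echo "page.pl found: $(command -v page.pl)"
else
    echo "page.pl not found; check PAGEDIR and PATH" >&2
fi

# Show the value that was exported during installation (if any).
echo "PAGEDIR=${PAGEDIR:-<unset>}"
```

Both variables are set only for the current shell session, so the export lines need to be repeated (or added to a shell startup file) in each new session.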


  • Running
page.pl --expfile=FILE --pathways=STR --exptype=[discrete|continuous] [ --cattypes=P --minr=0.0 ]

where

--expfile=FILE 

is an expression/profile file in the same format as in FIRE, i.e. two columns: one for the gene name and one for the expression measure. Here is an example of a [discrete expression profile], and here is a [continuous one].
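
As an illustration of the two-column layout, the sketch below writes a toy discrete profile and checks that every line has exactly two fields. Several things here are assumptions rather than facts from this page: the tab separator, the file name, the gene-to-cluster assignments, and the absence of a header line (check the linked examples to see whether one is expected).

```shell
# Write a toy discrete profile: gene name, then a cluster index.
# ASSUMPTIONS: tab-separated columns, no header line, hypothetical file name
# and made-up cluster assignments.
expfile=toy_discrete.txt
printf '%s\t%s\n' \
    TP53  0 \
    MYC   1 \
    BRCA1 0 \
    EGFR  1 > "$expfile"

# Sanity check: count lines that do not have exactly two tab-separated fields.
bad=$(awk -F '\t' 'NF != 2 { n++ } END { print n + 0 }' "$expfile")
echo "malformed lines: $bad"
```

A continuous profile would have the same shape, with a numeric expression measure in the second column instead of a cluster index.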

--pathways=STR 

is species+annotation, most likely human_go_orf (uses gene names like TP53 and curated GO categories, with no electronic annotation).

Additional sources of annotations:

--pathways=human_go_orf (all GO categories, including electronic annotation)
--pathways=biocarta
--pathways=kegg
--pathways=HPRD_interactions
--pathways=staudt_genesets (curated B-cell-related gene sets from Lou Staudt's lab)

--exptype=STR

describes whether the expression profile is discrete or continuous.

--cattypes=STR 

specifies which parts of GO to use: F = molecular function, C = cellular component, P = biological process. The default is P only; to use all three, specify "F,C,P".

--minr=FLOAT 

specifies how independent the reported categories should be. It is set to 0 by default, meaning that all informative categories/pathways are reported; --minr=5 removes many redundant categories.


  • Results

PAGE creates an expfile_PAGE directory and puts the results in it. The pdf file is the main result file; pvmatrix.txt contains the data used to draw the pdf.


  • Additional options

To restrict the evaluated pathways to those listed in a file (one pathway per line):

--pathwaylist=FILE

For continuous expression values, the number of bins defaults to #genes/100 (i.e. 100 genes per bin). This can be changed using

--ebins=INT
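
The arithmetic behind the default can be sketched as follows. The function names are hypothetical, and integer division is an assumption: this page does not say how PAGE rounds when the gene count is not a multiple of 100.

```shell
# Default number of bins as described above: #genes / 100.
# ASSUMPTION: integer division; PAGE's actual rounding is not documented here.
default_ebins() {
    echo $(( $1 / 100 ))
}

# Approximate genes per bin for a given gene count and --ebins value.
genes_per_bin() {
    echo $(( $1 / $2 ))
}

echo "default bins for 5000 genes: $(default_ebins 5000)"
echo "genes per bin with --ebins=10: $(genes_per_bin 5000 10)"
```

With 5000 genes, the default therefore gives 50 bins of about 100 genes each, while --ebins=10 would give about 500 genes per bin.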

To estimate the number of false positive pathways in your list, use

--randomize=INT

where INT specifies how many randomized runs to execute (use at least 3).

By default PAGE overwrites everything in $expfile_PAGE/. However, you may want to run PAGE for different annotation categories (e.g. KEGG, BioCarta). To change the name of your output file, use the -suffix option. So you can do something like:

page.pl --expfile=FILE --pathways=human_go_orf -suffix=GO
page.pl --expfile=FILE --pathways=kegg -suffix=KEGG
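
When several annotation sources are involved, the same pattern can be written as a loop. The sketch below only prints the commands (a dry run); remove the echo to execute them. The function name, the expression file name, and the BIOCARTA suffix are hypothetical; the options are the ones shown above.

```shell
# Print one page.pl command per annotation source, each with its own suffix
# so that successive runs do not overwrite each other's output.
# ASSUMPTIONS: page_commands, the file name and the BIOCARTA suffix are
# hypothetical; add other options (e.g. --exptype) as needed.
page_commands() {
    for pair in human_go_orf:GO kegg:KEGG biocarta:BIOCARTA; do
        pathways=${pair%%:*}   # text before the colon
        suffix=${pair##*:}     # text after the colon
        echo page.pl --expfile="$1" --pathways="$pathways" -suffix="$suffix"
    done
}

page_commands yourexpfile.txt
```

Keeping each pathway name next to its suffix in a single list makes it harder to mislabel an output directory when sources are added or removed.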

  • Get the list of genes in a bin/cluster that belong to a given pathway (after PAGE analysis)
find_genes_in_bin_and_pathway.pl --expfile=FILE --bin=INT --pathway=STR --species=STR

# example:
find_genes_in_bin_and_pathway.pl --expfile=yourexpfile.txt --bin=1 --pathway=hsa04662 --species=kegg