PAGE Tutorial
From Icbwiki
Revision as of 21:08, 29 May 2012 by Ole2001
- Get PAGE from svn
From our subversion repository:
svn co --username=guest --password=email@email.com https://pbtech-vc.med.cornell.edu/public/svn/elementolab/PAGE/trunk PAGE/
(Elemento lab members should use their PBtech identifiers instead of the guest ones)
- Or download it (less up-to-date version)
http://physiology.med.cornell.edu/faculty/elemento/lab/files/PAGE.zip
unzip PAGE.zip
- To install
cd PAGE
make clean
make
export PAGEDIR=`pwd` (if that does not work try: setenv PAGEDIR `pwd`)
export PATH=$PATH:$PAGEDIR (here too, try setenv PATH $PATH:$PAGEDIR if export failed)
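After installing, it can help to confirm that the shell actually finds page.pl. The following is a minimal sketch, not part of PAGE: the function name check_page_on_path is made up for illustration, and it only assumes that page.pl sits in $PAGEDIR as the PATH line above implies.

```shell
# Hypothetical helper (not shipped with PAGE): report whether page.pl
# can be found through PATH after the export/setenv steps.
check_page_on_path() {
    if command -v page.pl >/dev/null 2>&1; then
        echo "page.pl found at: $(command -v page.pl)"
    else
        echo "page.pl not found on PATH (PAGEDIR=${PAGEDIR:-unset})"
        return 1
    fi
}

# Run the check; '|| true' keeps a failed lookup from aborting a script.
check_page_on_path || true
```

If the check fails, re-run the export (or setenv) lines from inside the PAGE directory.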
- Running
page.pl --expfile=FILE --pathways=STR --exptype=[discrete|continuous] [ --cattypes=P --minr=0.0 ]
where
--expfile=FILE
is an expression/profile file in the same format as in FIRE, i.e. two columns: one for the gene name, one for the expression measure. Example files: a [discrete expression profile] and a [continuous one].
--pathways=STR
is species+annotation, most likely human_go_orf (uses gene names like TP53, and curated GO categories, no electronic annotation).
Additional sources of annotations:
--pathways=human_go_orf (all GO categories, including electronic annotation)
--pathways=biocarta
--pathways=kegg
--pathways=HPRD_interactions
--pathways=staudt_genesets (curated B-cell-related gene sets from Lou Staudt's lab)
--exptype=STR
describes whether the expression profile is discrete or continuous.
--cattypes=STR
specifies which part of the GO to use, i.e. F = molecular function, C = cellular component, P = biological process. The default is P only; all three can be selected with "F,C,P".
--minr=FLOAT
specifies how independent the reported categories should be. It is set to 0 by default, meaning all informative categories/pathways will be reported; --minr=5 will remove a lot of redundant categories.
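To make the two-column --expfile format concrete, here is a small sketch that writes a toy discrete profile and checks its column count. The tab separator, the header line, the file name and the cluster values are all assumptions for illustration; the linked example files remain the reference for the exact layout.

```shell
# Write a toy discrete profile: gene name, then cluster index.
# Assumptions: tab-separated columns and a one-line header.
workdir=$(mktemp -d)
expfile="$workdir/example_discrete.txt"
printf 'gene\tcluster\n' > "$expfile"
printf 'TP53\t0\nBRCA1\t0\nMYC\t1\nEGFR\t1\n' >> "$expfile"

# Sanity check: count lines that do not have exactly two fields.
bad_lines=$(awk -F'\t' 'NF != 2 {n++} END {print n + 0}' "$expfile")
echo "wrote $expfile; lines with a wrong column count: $bad_lines"
```

A file like this would then be passed as --expfile=FILE together with --exptype=discrete.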
- Results
PAGE creates an expfile_PAGE directory and puts the results in it. The pdf file is the main result file. pvmatrix.txt contains the data used to draw the pdf.
- Additional options
To restrict the evaluated pathways to those listed in a file (one pathway per line)
--pathwaylist=FILE
The number of bins for continuous expression values is set to #genes/100 (100 genes per bin) by default. This can be changed using
--ebins=INT
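As a rough illustration of the #genes/100 rule, the sketch below counts the genes in an expression file and derives the bin count. The helper name default_ebins is hypothetical, and the header line, the integer division and the minimum of one bin are assumptions; the source does not say how PAGE rounds.

```shell
# Hypothetical helper: estimate the default bin count (#genes / 100).
# Assumptions: first line is a header, integer division, at least 1 bin.
default_ebins() {
    awk 'NR > 1 && NF > 0 {n++}
         END {b = int(n / 100); if (b < 1) b = 1; print b}' "$1"
}

# Demo on a generated 250-gene file with made-up values.
demo=$(mktemp)
{
    printf 'gene\tvalue\n'
    awk 'BEGIN {for (i = 1; i <= 250; i++) printf "g%d\t%d\n", i, i}'
} > "$demo"
echo "estimated default bins: $(default_ebins "$demo")"
```

Knowing this number makes it easier to choose a different --ebins value, for example half of it for twice as many genes per bin.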
To estimate the number of false positive pathways in your list
--randomize=INT
where INT specifies how many random runs to execute (run at least 3)
By default PAGE overwrites everything in $expfile_PAGE/. However, you may want to run PAGE for different annotation categories (e.g. KEGG, BioCarta). To change the name of your output file, use the -suffix option. So you can do something like:
page.pl --expfile=FILE --pathways=human_go_orf -suffix=GO
page.pl --expfile=FILE --pathways=kegg -suffix=KEGG
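When several annotation sets are involved, the per-set commands can be generated in a loop. This is a dry-run sketch that only prints the commands: the function name print_page_commands, the BIOCARTA suffix and the choice of --exptype=discrete are assumptions for illustration, while the option spellings are copied from the commands above.

```shell
# Hypothetical helper: print one page.pl command per annotation set
# (dry run; nothing is executed). $1 is the expression file.
print_page_commands() {
    for pair in human_go_orf:GO kegg:KEGG biocarta:BIOCARTA; do
        pathways=${pair%%:*}    # text before the colon
        suffix=${pair##*:}      # text after the colon
        printf 'page.pl --expfile=%s --pathways=%s --exptype=%s -suffix=%s\n' \
            "$1" "$pathways" discrete "$suffix"
    done
}

print_page_commands yourexpfile.txt
```

Once the printed commands look right, they can be pasted into the shell or piped to sh.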
- Get the list of genes in a bin/cluster that belong to a given pathway (after PAGE analysis)
find_genes_in_bin_and_pathway.pl --expfile=FILE --bin=INT --pathway=STR --species=STR
# example: find_genes_in_bin_and_pathway.pl --expfile=yourexpfile.txt --bin=1 --pathway=hsa04662 --species=kegg
