Elementolab/RRBSseeqer Tutorial

From Icbwiki

Revision as of 23:39, 22 July 2012; view current revision
←Older revision | Newer revision→
Jump to: navigation, search

Elementolab/ Elementolab/RRBSseeqer/


Contents

Installation

This tutorial assumes that you are in the RRBseeqer main directory (cd RRBS/). An even better way to proceed is to create an environment variable called RRBSEEQERDIR

export RRBSEEQERDIR=/path/to/RRBS  # REPLACE /path/to/RBBS by the REAL path

and then add RRBSEEQERDIR to your path:

export PATH=RRBSEEQERDIR:$PATH

Ideally, add the above to your .bashrc file so you don't have to retype it everytime you log in.

Transform epicore methycall files into RRBSeeqer input files (.calls)

perl epicore2calls.pl methylCall_Sample_YYY.txt > methylCall_Sample_YYY.txt.calls

Optionally you can gzip the call files as most other RRBSseeqer programs can read from gzipped files):

perl epicore2calls.pl methylCall_Sample_YYY.txt | gzip > methylCall_Sample_YYY.txt.calls.gz

Detect differentially methylated CGs

./RRBSseeqer_CG -rrbs1 sample.calls.gz -rrbs2 sample_control.calls.gz | sort_column.pl 1 | sort_column_alpnum.pl 0 > cgs.txt

Options:

-rrbs1		FILE (condition 1)
-rrbs2		FILE (condition 2)
-test		[fisher|chi] (Statistical test to be performed)
-minfold	FLOAT (Def: 1.25)
-fdr		FLOAT (Def: 0.2)

Detect differentially methylated regions (DMR)

perl RRBSidentifyUpDownDMR.pl --metfile=cgs.txt --dmax=250 --minmetdx=0.1 --minnumcg=5 -outfile=cgs_DMR.txt

Input must come from RRBSeeqer_CG. Parameters indicate maximum distance between diff methylated CGs (--dmax), minimum number of differentially methylated CGs for a region to be reported (--minumcg) and min total methylation dx for a region (--minmetdx), with 10% used default.

To create a bed file with all DMRs

perl DMR2bed.pl cgs_DMR.txt > cgs_DMR.bed


To create a bed file that can be uploaded to the Genome Browser, with two different tracks for hyper and hypo DMRs

perl DMR2bed2.pl cgs_DMR.txt

Annotate CGs

With CpG islands

./RRBSannotate -peakfile1 cgs.txt -peakfile2 DATA/hg19/CpGislands_annotation -showpeakdesc 1 -hasid1 0 -outfile cgs_CGI.txt

With RRBSannotate, you can just add annotation columns. Here's how to add shore annotation

./RRBSannotate -peakfile1 cgs_CGI.txt -peakfile2 DATA/hg19/CpGislands_annotation -shores 1 -ext2 1000  -hasid2 0  -showpeakdesc 1 -hasid1 0 -outfile cgs_CGI_shores.txt


With genes

./RRBSannotateGenes -rrbsfile cgs.txt -annotation DATA/hg19/refSeq -chrdata DATA/hg19.chrdata 

(you need to download gene annotation files first (e.g., hg18.tar.gz, hg19.tar.gz) from --> http://physiology.med.cornell.edu/faculty/elemento/lab/CS_files/)

Draw methylation across chromosomes

perl DrawRRBSmap2.pl --chrdata=DATA/hg19.chrdata --rrbsdata=cgs.txt

Make BigWig DIFFERENTIAL methylation track

Let's assume ben1_pca1.txt contains a differential methylation analysis generated by RRBSseeqer_CG above. To generate a BigWig track, make sure wigToBigWig is installed, and type something like this

RRBSmakeMethylDiffTrack -methfile ben1_pca1.txt -desc ben1_pca1 -type diff -bigwig 1 -chrdata hg19.chrdata -chromsizes hg19.chroms.sizes -s 25

This command will create a .bw track (binary format), with window size 50bp (2x25) that you can upload to your web server, then make visible to the UCSC browser using:

track type=bigWig name="ben1_pca1" description="ben1_pca1" bigDataUrl=http://physiology.med.cornell.edu/faculty/elemento/lab/files/ben1_pca1.wig.bw \ 
  visibility="full" maxHeightPixels="64:64:11" smoothingWindow="10" viewLimits="0:100" autoScale="off"

A sometimes faster way to visualize a BigWig track is to load it into IGV http://www.broadinstitute.org/igv/

Methylation levels for annotated elements

getMethylationLevelsForAnnotatedElements.pl --callfile=CpG_context_Sample_ln_cap_r.fastq.filtered.noadapt.gz_bismark.txt.calls_NM --headerhasid=0 \
--promlist=refSeq.new.u2000_d2000.PROMOTERS --minnumcg=2 --col=5

Methylation levels from scratch

 ./RRBSfindMethylationLevelForIntervals.pl --callfile=eRRBS_CB012511_50ng_myCpG.txt.calls.gz \ 
--regions=merged016_trim_CB_rep1_rep2_inter_100k.wig --hasid=1

(use hasid=1 if your interval file has identifiers eg for promoters, 0 otherwise eg for CpF islands)

It also works for differential analysis (output of RRBseeqer_CG)

./RRBSfindMethylationLevelForIntervals.pl --regions=refGene.txt.17Jan2012.oneperTSS.u2000_d2000 \
--callfile=empty_vs_sh2_21d.txt.RGYW --hasid=1 --diff=1 > empty_vs_sh2_21d.txt.RGYW.prom.diffmet
Personal tools