Elementolab/ChIPseeqerModel

From Icbwiki

Revision as of 20:52, 26 October 2010
Eug2002 (Talk | contribs)

Revision as of 17:38, 27 October 2010
Eug2002 (Talk | contribs)


Requires: ChIPseeqer (with PCA/GSL), Heatmap Perl module, R.

1. Run ChIPseeqer (CS) to get peak files

ChIPseeqer.bin -chipdir H3K9Ac/CHIP/ -inputdir H3K9Ac/INPUT/ -outfile h3k9Ac_peaks.txt
ChIPseeqer.bin -chipdir MTA3/CHIP/ -inputdir MTA3/INPUT/ -outfile MTA3_peaks.txt
...

2. Get promoter binding scores

perl $CHIPSEEQERDIR/SCRIPTS/CalcExtendedPeakScores.pl --peakfile=h3k9Ac_peaks.txt --d0=5000 --label=H3K9Ac > h3k9Ac_peaks.txt.PB
perl $CHIPSEEQERDIR/SCRIPTS/CalcExtendedPeakScores.pl --peakfile=MTA3_peaks.txt --d0=5000 --label=MTA3 > MTA3_peaks.txt.PB
...

3. Combine multiple promoter scores

expression_concatenate_matrices.pl */*.PB > MatPromBind.txt
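
As a rough illustration of what this step produces, the sketch below merges per-factor score tables into one matrix keyed by transcript ID. It assumes each .PB file is tab-delimited with a header line and the ID in the first column; that layout, the toy IDs and values, and the function name combine_score_tables are assumptions made for illustration, not the documented behavior of expression_concatenate_matrices.pl.

```python
import csv
import io

def combine_score_tables(tables):
    """Merge tab-delimited tables (ID + score columns) on the ID column.

    `tables` is a list of file-like objects. Only IDs present in every
    table are kept; that is a choice made for this sketch.
    """
    header = ["ID"]
    merged = None  # maps ID -> list of score strings
    for handle in tables:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
        header.extend(rows[0][1:])
        current = {row[0]: row[1:] for row in rows[1:]}
        if merged is None:
            merged = current
        else:
            merged = {k: merged[k] + current[k] for k in merged if k in current}
    return header, merged

# Toy inputs standing in for h3k9Ac_peaks.txt.PB and MTA3_peaks.txt.PB
a = io.StringIO("ID\tH3K9Ac\nNM_1\t2.5\nNM_2\t0.0\n")
b = io.StringIO("ID\tMTA3\nNM_1\t1.0\nNM_2\t3.2\n")
header, merged = combine_score_tables([a, b])
print("\t".join(header))
for key in sorted(merged):
    print("\t".join([key] + merged[key]))
```

Each input contributes its score columns, so the result has one row per transcript and one column per ChIP-seq track, which is the shape a PCA step would expect.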

4. PCA

ChIP_PCA -chipdata MatPromBind.txt -center 0 -scale 1

(Notes: the GSL library is required; ChIP_PCA is in BCELLS.)

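This creates three files, ending in .evec (eigenvectors), .eval (eigenvalues) and .proj (presumably the projections of the data onto the principal components).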
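
To make the three kinds of output concrete, here is a minimal standard-library sketch of textbook PCA for the first component only: it mean-centers the columns, builds the covariance matrix, and finds the leading eigenvector by power iteration. It does not reproduce ChIP_PCA or the meaning of its -center and -scale flags, which are not described here; the function name leading_component and the toy matrix are hypothetical.

```python
import math

def leading_component(matrix, iterations=200):
    """First principal component of `matrix` (rows = promoters,
    columns = ChIP tracks). Returns (eigenvalue, eigenvector, projections)."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]
    # Sample covariance matrix (m x m)
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    vec = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the unit-length eigenvector
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(m))
                     for i in range(m))
    # Projection of each centered row onto the component
    projections = [sum(r[j] * vec[j] for j in range(m)) for r in centered]
    return eigenvalue, vec, projections

# Toy matrix: the second column is exactly twice the first
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
eigenvalue, vec, projections = leading_component(toy)
print(eigenvalue, vec, projections)
```

In this picture the eigenvector corresponds to one row of an .evec-style file, the eigenvalue to one entry of .eval, and the per-promoter projections to one column of .proj.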

5. Visualize eigenvectors and eigenvalues

draw_expression_heatmap.pl --matrix=MatPromBind.txt.evec --minmax=0.25 --clustcols=1 \
  --distance=euclidean --cmap=/home/ole2001/PROGRAMS/HEATMAP/colormaps/cmap2.txt
plotEval.pl MatPromBind.txt.eval 

6. Generate RNA-seq RPKM

Simple tab-delimited format: Transcript_ID followed by one RPKM value per sample

ID	LY1	LY7	CB	NB
NM_021648	13.6	14.4	14.9	10.5
NM_020351	0.0	0.0	0.0	0.0
NM_001112734	1.4	1.6	3.6	3.4
NM_002697	21.9	28.1	12.7	7.0
NM_002656	0.1	0.0	0.7	0.8
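
A short sketch of reading this table and applying a log(x+1) transform to one sample column, the same form as the predictor used in the regression step. The rows are copied from the example above; the helper name read_rpkm is hypothetical, and the use of the natural logarithm is an assumption.

```python
import csv
import io
import math

# First rows of the example table above, tab-delimited
rpkm_text = (
    "ID\tLY1\tLY7\tCB\tNB\n"
    "NM_021648\t13.6\t14.4\t14.9\t10.5\n"
    "NM_020351\t0.0\t0.0\t0.0\t0.0\n"
    "NM_001112734\t1.4\t1.6\t3.6\t3.4\n"
)

def read_rpkm(handle):
    """Return {transcript_id: {sample: rpkm}} from a tab-delimited table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["ID"]: {k: float(v) for k, v in row.items() if k != "ID"}
            for row in reader}

table = read_rpkm(io.StringIO(rpkm_text))
# Adding 1 before the log keeps zero-RPKM transcripts (e.g. NM_020351) finite
log_ly1 = {tid: math.log(vals["LY1"] + 1) for tid, vals in table.items()}
for tid, value in log_ly1.items():
    print(tid, round(value, 3))
```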

7. Linear regression

perl ~/PROGRAMS/BCELLS/CorrelatePCADataAndExpression.pl --evec=PromBindMatAllNOH_DNAMETHnomissing.txt.proj --expfile=/panda_scratch_miro/ole2001/MELNICKRNASEQ/LY1_LY7_CB_NB_RPKM.txt --pred="log(LY1+1)"
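
The model fitted by CorrelatePCADataAndExpression.pl is not described here, so the following is only a sketch of the general idea: an ordinary least-squares fit of log(RPKM+1) against a single principal-component projection, with the Pearson correlation reported alongside. The function name, the PC1 values, and their pairing with RPKM values are hypothetical.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit y ~ intercept + slope * x.

    Returns (intercept, slope, pearson_r).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Hypothetical PC1 projections for four promoters and illustrative RPKM values
pc1 = [-1.5, -0.5, 0.5, 1.5]
rpkm = [0.0, 1.4, 13.6, 21.9]
y = [math.log(v + 1) for v in rpkm]  # same form as --pred="log(LY1+1)"
intercept, slope, r = simple_linear_regression(pc1, y)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A regression over several components at once would need a multiple-regression solver (for example lm in R, which is listed among the requirements).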