Elementolab/ChIPseeqerModel
Requires: ChIPseeqer (with PCA/GSL), the Heatmap Perl module, and R.
1. Apply ChIPseeqer (CS) to get peak files
ChIPseeqer.bin -chipdir H3K9Ac/CHIP/ -inputdir H3K9Ac/INPUT/ -outfile h3k9Ac_peaks.txt
ChIPseeqer.bin -chipdir MTA3/CHIP/ -inputdir MTA3/INPUT/ -outfile MTA3_peaks.txt
...
2. Get promoter binding scores
perl $CHIPSEEQERDIR/SCRIPTS/CalcExtendedPeakScores.pl --peakfile=h3k9Ac_peaks.txt --d0=5000 --label=H3K9Ac > h3k9Ac_peaks.txt.PB
perl $CHIPSEEQERDIR/SCRIPTS/CalcExtendedPeakScores.pl --peakfile=MTA3_peaks.txt --d0=5000 --label=MTA3 > MTA3_peaks.txt.PB
...
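Steps 1 and 2 repeat the same two commands for every mark or factor, so the command lines can be generated rather than typed. The following is a minimal sketch, not part of ChIPseeqer: the helper name `build_commands` is hypothetical, it only uses the flags shown above, and it assumes each dataset lives in `<mark>/CHIP/` and `<mark>/INPUT/` and that output files are named `<mark>_peaks.txt` (the examples above use a lowercase `h3k9Ac` prefix instead). It prints the commands and does not run them.

```python
import os


def build_commands(marks, d0=5000, chipseeqer_dir=None):
    """Return one (peak_cmd, score_cmd, score_output) tuple per mark.

    The directory layout and output names are assumptions of this sketch.
    """
    # Fall back to the literal variable name so the printed command stays valid shell.
    chipseeqer_dir = chipseeqer_dir or os.environ.get("CHIPSEEQERDIR", "$CHIPSEEQERDIR")
    script = f"{chipseeqer_dir}/SCRIPTS/CalcExtendedPeakScores.pl"
    jobs = []
    for mark in marks:
        peaks = f"{mark}_peaks.txt"
        # Step 1: peak detection.
        peak_cmd = ["ChIPseeqer.bin", "-chipdir", f"{mark}/CHIP/",
                    "-inputdir", f"{mark}/INPUT/", "-outfile", peaks]
        # Step 2: promoter binding scores, redirected to <peaks>.PB.
        score_cmd = ["perl", script, f"--peakfile={peaks}",
                     f"--d0={d0}", f"--label={mark}"]
        jobs.append((peak_cmd, score_cmd, f"{peaks}.PB"))
    return jobs


for peak_cmd, score_cmd, out in build_commands(["H3K9Ac", "MTA3"]):
    print(" ".join(peak_cmd))
    print(" ".join(score_cmd), ">", out)
```

The printed lines can be pasted into a shell script once the paths have been checked.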
3. Combine multiple promoter scores
expression_concatenate_matrices.pl */*.PB > MatPromBind.txt
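To illustrate what combining the score files amounts to, here is a small sketch of joining several per-mark score tables on a shared ID column. It is not the implementation of expression_concatenate_matrices.pl: the function name `combine_scores` is hypothetical, the scores are made up, and keeping only IDs present in every table is an assumption (the actual script may treat missing IDs differently).

```python
def combine_scores(tables):
    """Join per-mark score tables into one matrix.

    tables: dict mapping a label (e.g. "MTA3") to a dict {id: score}.
    Only IDs present in every table are kept (an assumption of this sketch).
    """
    labels = list(tables)
    if not labels:
        return ["ID"], []
    shared = set(tables[labels[0]])
    for label in labels[1:]:
        shared &= set(tables[label])
    header = ["ID"] + labels
    rows = [[gid] + [tables[label][gid] for label in labels]
            for gid in sorted(shared)]
    return header, rows


# Made-up scores, for illustration only.
demo = {
    "H3K9Ac": {"NM_021648": 2.0, "NM_020351": 0.5},
    "MTA3": {"NM_021648": 1.0, "NM_002697": 3.0},
}
header, rows = combine_scores(demo)
print("\t".join(header))
for row in rows:
    print("\t".join(str(value) for value in row))
```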
4. PCA
ChIP_PCA -chipdata MatPromBind.txt -center 0 -scale 1
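Notes: the GSL library is required, and ChIP_PCA is located in BCELLS.
This creates three files, ending in .evec, .eval, and .proj. The .evec and .eval files hold the eigenvectors and eigenvalues visualized in step 5; the .proj file is the input to the regression in step 7.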
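For readers who want to see what a PCA of this kind computes, the following NumPy sketch produces eigenvectors, eigenvalues, and projections from a gene-by-mark matrix. It is not the ChIP_PCA code. In particular, it reads `-center 0 -scale 1` as "do not subtract column means, but divide each column by its standard deviation", which is an assumption about those flags, and the function name `pca` is hypothetical.

```python
import numpy as np


def pca(matrix, center=False, scale=True):
    """Return (eigenvectors, eigenvalues, projections) for a rows-by-columns matrix."""
    x = np.asarray(matrix, dtype=float)
    if center:
        x = x - x.mean(axis=0)
    if scale:
        sd = x.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0  # leave constant columns unchanged
        x = x / sd
    # Symmetric matrix of column cross-products (the covariance matrix if centered).
    c = x.T @ x / (x.shape[0] - 1)
    evals, evecs = np.linalg.eigh(c)
    # eigh returns ascending eigenvalues; reorder so the largest comes first.
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Coordinates of each row along each eigenvector.
    proj = x @ evecs
    return evecs, evals, proj


# Random data standing in for MatPromBind.txt (200 promoters, 4 marks).
rng = np.random.default_rng(0)
evecs, evals, proj = pca(rng.normal(size=(200, 4)))
print(evecs.shape, evals.shape, proj.shape)
```

The three returned arrays correspond conceptually to the .evec, .eval, and .proj outputs, although the exact file layout written by ChIP_PCA is not described here.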
5. Visualize eigenvectors and eigenvalues
draw_expression_heatmap.pl --matrix=MatPromBind.txt.evec --minmax=0.25 --clustcols=1 \
  --distance=euclidean --cmap=/home/ole2001/PROGRAMS/HEATMAP/colormaps/cmap2.txt
plotEval.pl MatPromBind.txt.eval
6. Generate RNA-seq RPKM
Simple format: a header row, then one row per transcript with the Transcript_ID followed by one RPKM value per sample.
ID            LY1   LY7   CB    NB
NM_021648     13.6  14.4  14.9  10.5
NM_020351     0.0   0.0   0.0   0.0
NM_001112734  1.4   1.6   3.6   3.4
NM_002697     21.9  28.1  12.7  7.0
NM_002656     0.1   0.0   0.7   0.8
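A file in this format can be read with a few lines of Python. The sketch below assumes whitespace-delimited columns with the header row shown above; the function name `read_rpkm` is hypothetical. It also applies the log(x+1) transform used in step 7 to one sample.

```python
import io
import math


def read_rpkm(handle):
    """Parse an RPKM table into (samples, {transcript_id: {sample: rpkm}})."""
    header = handle.readline().split()
    samples = header[1:]  # first column is the transcript ID
    table = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values = [float(v) for v in fields[1:]]
        table[fields[0]] = dict(zip(samples, values))
    return samples, table


# The example rows from this page.
text = """ID LY1 LY7 CB NB
NM_021648 13.6 14.4 14.9 10.5
NM_020351 0.0 0.0 0.0 0.0
NM_001112734 1.4 1.6 3.6 3.4
NM_002697 21.9 28.1 12.7 7.0
NM_002656 0.1 0.0 0.7 0.8
"""
samples, table = read_rpkm(io.StringIO(text))
# log(LY1 + 1) per transcript; adding 1 keeps zero-RPKM transcripts finite.
log_ly1 = {tid: math.log(row["LY1"] + 1.0) for tid, row in table.items()}
print(samples, len(table))
```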
7. Linear regression
perl ~/PROGRAMS/BCELLS/CorrelatePCADataAndExpression.pl \
  --evec=PromBindMatAllNOH_DNAMETHnomissing.txt.proj \
  --expfile=/panda_scratch_miro/ole2001/MELNICKRNASEQ/LY1_LY7_CB_NB_RPKM.txt \
  --pred="log(LY1+1)"
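The regression idea can be sketched as an ordinary least-squares fit of log-transformed expression on the PCA projections. This is not the code of CorrelatePCADataAndExpression.pl: it assumes that `--pred="log(LY1+1)"` names the quantity being predicted, that rows of the projection matrix and the expression vector refer to the same transcripts in the same order, and the function name `fit_expression` is hypothetical. The data in the usage lines are synthetic.

```python
import numpy as np


def fit_expression(proj, rpkm):
    """Fit log(rpkm + 1) = b0 + b1*PC1 + b2*PC2 + ... by least squares.

    Returns (coefficients including the intercept, R squared).
    """
    y = np.log(np.asarray(rpkm, dtype=float) + 1.0)
    pcs = np.asarray(proj, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), pcs])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return coef, r2


# Synthetic example: expression driven by two components plus noise.
rng = np.random.default_rng(0)
proj = rng.normal(size=(300, 2))
log_expr = 1.0 + 0.8 * proj[:, 0] - 0.3 * proj[:, 1] + rng.normal(scale=0.1, size=300)
rpkm = np.exp(log_expr) - 1.0
coef, r2 = fit_expression(proj, rpkm)
print(coef, r2)
```

Each coefficient indicates how strongly the corresponding principal component is associated with expression, and R squared summarizes how much of the variance in log expression the components explain together.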
