Elementolab/ChIPseeqerDensityMatrix

From Icbwiki

Revision as of 18:40, 21 June 2012 by Ole2001

Back to Elementolab/ChIPseeqer_Tutorial

ChIPseeqerDensityMatrix

In this analysis you can estimate the average read density profiles:

  • for the regions around the TSS or the TES of the genes, OR
  • for the regions around the summit of the peaks.

To run the tools directly from any folder, add $CHIPSEEQERDIR and $CHIPSEEQERDIR/SCRIPTS to your $PATH variable. See How to set the CHIPSEEQERDIR variable.
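
As a minimal sketch, the two folders can be added to the search path in a Bourne-style shell as shown here. The install location /opt/ChIPseeqer is only a placeholder (an assumption, not taken from this page); replace it with the folder where ChIPseeqer is installed.

```shell
# Placeholder install location (assumption); an already-set CHIPSEEQERDIR is kept.
export CHIPSEEQERDIR="${CHIPSEEQERDIR:-/opt/ChIPseeqer}"

# Prepend the main folder and its SCRIPTS subfolder to the search path.
export PATH="$CHIPSEEQERDIR:$CHIPSEEQERDIR/SCRIPTS:$PATH"
```

Putting these lines in your shell startup file (for example ~/.bashrc) keeps the setting across sessions.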

1. Type the command:

ChIPseeqerDensityMatrix --targets=TF_targets.txt --chipdir=CHIP

For example:

ChIPseeqerDensityMatrix --genome=mm9 --chipdir=CHIP/ --generegion=TSS --db=refSeq --format=sam --targets=peaks.txt

The following options are available:

--targets=FILE    file containing genomic regions (or "all", supported only in the latest svn version)
--chipdir=DIR     folder containing the reads
--norm=INT        set to 1 to normalize the matrix
--prefix=STR      prefix for output files
--format=STR      the format of the reads (e.g., eland, bed)
--fraglen=INT     0 = no read extension; otherwise extend reads to the specified length
--uniquereads=INT 0/1; if 1, collapse clonal reads. 1 is recommended for TF and histone modification, not for nucleosome positioning
--ws=INT          the window size (e.g., 10, 100). Default is 10.
--outepsmap=FILE  the name of the output eps file with the 2D density plot
--xlabel=STR      the label for the x axis of the 2D plot
--ylabel=STR      the label for the y axis of the 2D plot
  • To estimate the read density profiles for the regions around the TSS or the TES of the genes, use the following options:
--generegion=STR  can be TSS (transcription start site) or TES (transcription end site)
--lenu=INT        length upstream of the genomic region (TSS or TES). Default is 2000bp.
--lend=INT        length downstream of the genomic region (TSS or TES). Default is 2000bp.
--genome=STR      can be hg18 (human),
                  mm9 (mouse),
                  dm3 (Drosophila), or
                  sacser (Saccharomyces cerevisiae)
--db=STR          can be refSeq (available for hg18, mm9, dm3),
                  AceView (for hg18, mm9),
                  Ensembl (for hg18, mm9, dm3), or
                  UCSCGenes (for hg18, mm9).
                  Default is refSeq.
  • To estimate the read density profiles for the regions around the summit of the peaks, use the following options:
--w=INT           the window size around the peak summit. Default is 2000bp.
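
The two modes described above can be sketched as two command lines built only from the options listed on this page. This is a dry run that prints the commands rather than executing them. The file and folder names are placeholders, and it is an assumption (not confirmed by this page) that leaving out --generegion is what selects the peak-summit mode.

```shell
# Options shared by both modes; file and folder names are placeholders.
common="--targets=TF_targets.txt --chipdir=CHIP --format=sam"

# Mode 1: profiles around the TSS, 2000 bp on each side (the documented defaults).
tss_cmd="ChIPseeqerDensityMatrix $common --genome=mm9 --db=refSeq --generegion=TSS --lenu=2000 --lend=2000"

# Mode 2: profiles around the peak summits, with --w giving the window around each summit.
# Assumption: omitting --generegion selects this mode.
summit_cmd="ChIPseeqerDensityMatrix $common --w=2000"

# Dry run: print the commands for review instead of running them.
echo "$tss_cmd"
echo "$summit_cmd"
```

Once the printed commands look right, they can be pasted into the terminal or run with eval.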

This analysis was also described here.


IMPORTANT: The file passed to the --targets option must be the output file produced by ChIPseeqer.

The output of this process is a .density file. When --generegion is TSS or TES, each line corresponds to a RefSeq transcript. With the default settings for --ws, --lenu and --lend, the average number of reads is computed for each window of 10 nucleotides in the region from 2000 bp upstream to 2000 bp downstream. Thus, the .density file will look like this:

NM_007125	3.8	4.0	4.0	3.8	3.0	3.0	2.6	2.0	2.0	2.0	2.0	2.0	2.0	2.0	2.0	2.0	2.0
NM_004202	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.6	1.0	1.0	1.0	1.0	1.0
NM_001005852	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	3.0	2.6	1.0   
NM_001146706	4.0	4.0	4.0	4.0	4.0	3.8	3.0	3.0	3.0	3.0	3.0	4.0	4.0	4.0	4.0	2.0	2.0	
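
A file in this layout can be reduced to a single average profile by taking the mean of each window column over all lines. The sketch below does this with awk on a tiny sample made from the first three values of two of the rows above. It assumes tab-separated fields, the same number of fields on every line, and no trailing whitespace; the temporary file stands in for a real .density file.

```shell
# Write a tiny tab-separated sample (first three values of two example rows).
density_file="$(mktemp)"
printf 'NM_007125\t3.8\t4.0\t4.0\n' >  "$density_file"
printf 'NM_004202\t0.0\t0.0\t0.0\n' >> "$density_file"

# Average each window (field 2 onward) over all lines; field 1 is the identifier.
mean_profile="$(awk -F'\t' '
    { for (i = 2; i <= NF; i++) sum[i] += $i; if (NF > maxnf) maxnf = NF }
    END {
        for (i = 2; i <= maxnf; i++)
            printf "%s%.2f", (i > 2 ? "\t" : ""), sum[i] / NR
        printf "\n"
    }' "$density_file")"

echo "$mean_profile"
rm -f "$density_file"
```

For a real run, point density_file at the .density output instead of creating the sample.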

The number of columns per line will be (lend + lenu)/10 + 1; with the default lengths, that is (2000 + 2000)/10 + 1 = 401.
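
As a quick sanity check, the formula can be evaluated in the shell and compared with the field counts actually found in a file. The function names expected_columns and field_counts are hypothetical helpers, not part of ChIPseeqer. The page does not say whether the identifier column is included in the count, so it is worth comparing both values by eye.

```shell
# Evaluate the formula (lend + lenu) / 10 + 1 with integer arithmetic.
# Arguments: lenu, then lend.
expected_columns() {
    echo $(( ($2 + $1) / 10 + 1 ))
}

# List the distinct field counts of tab-separated lines read from stdin.
field_counts() {
    awk -F'\t' '{ print NF }' | sort -un
}

expected_columns 2000 2000   # default --lenu and --lend
```

Usage on a real file would look like `field_counts < output.density`, where output.density is a placeholder name; a single number in the output means every line has the same number of fields.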
