Elementolab/ChIPseeqerReadCountMatrix

From Icbwiki

Revision as of 16:23, 29 January 2011
Eug2002 (Talk | contribs)


ChIPseeqerGetReadCountInPeaksMatrix

To run the tools directly from any folder, add $CHIPSEEQERDIR and $CHIPSEEQERDIR/SCRIPTS to your $PATH variable. See "How to set the CHIPSEEQERDIR variable".
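
As a minimal sketch for a bash-like shell, the two directories can be added to $PATH as follows. The install location $HOME/ChIPseeqer is only an assumption for illustration; replace it with the folder where ChIPseeqer actually lives on your system.

```shell
# Hypothetical install location; an already-set CHIPSEEQERDIR is left untouched.
export CHIPSEEQERDIR="${CHIPSEEQERDIR:-$HOME/ChIPseeqer}"

# Prepend the main directory and its SCRIPTS subdirectory to the search path.
export PATH="$CHIPSEEQERDIR:$CHIPSEEQERDIR/SCRIPTS:$PATH"
```

Putting these lines in your shell start-up file (for example ~/.bashrc) makes the setting persistent across sessions.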

This analysis computes the average or maximum read count for each peak across multiple ChIP-seq datasets.

The inputs to this script are:

  • a file with peaks (option: peakfile), i.e., the typical output of ChIPseeqer
  • a file (option: chipdirfile) that lists, for each ChIP-seq dataset, a label and the full path to the directory containing the split reads (i.e., the output of ChIPseeqerSplitReadFiles)

Example of a chipdirfile (one dataset per line: label, then directory path):

BCL6	/DATASETS/BCL6/CHIP
Pol2_2P	/DATASETS/POL2_2P/CHIP/
Pol2_5P	/DATASETS/POL2_5P/CHIP
BCOR	/DATASETS/BCOR/CHIP/
PAX5	/DATASETS/PAX5/CHIP/
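
Because every line of the chipdirfile points to a directory, it can be worth checking the file before starting a run. The following is a minimal sketch, not part of ChIPseeqer: check_chipdirfile is a hypothetical helper that assumes each line holds exactly two whitespace-separated fields (label and directory), so paths containing spaces would be reported as malformed.

```shell
# check_chipdirfile is a hypothetical helper, not a ChIPseeqer tool.
# It reports malformed lines and missing directories on stderr and
# returns non-zero if any problem was found.
check_chipdirfile() {
    status=0
    # read splits each line on whitespace: label, directory, anything left over.
    while read -r label dir extra || [ -n "$label" ]; do
        [ -z "$label" ] && continue                  # skip blank lines
        if [ -z "$dir" ] || [ -n "$extra" ]; then
            echo "malformed line for label '$label'" >&2
            status=1
        elif [ ! -d "$dir" ]; then
            echo "missing directory for $label: $dir" >&2
            status=1
        fi
    done < "$1"
    return "$status"
}
```

For example, `check_chipdirfile chipdirfile.txt && echo "chipdirfile looks fine"` prints the confirmation only when every listed directory exists.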

1. To run the script, type the command:

ChIPseeqerGetReadCountInPeaksMatrix --peakfile=peaks.txt --chipdirfile=chipdirfile.txt --chrdata=DATA/hg18.chrdata --outfile=matrix.txt

Available options are:

--peakfile FILE     file containing the peaks
--chipdirfile FILE  file containing the paths to the ChIP reads folders
--chrdata STR       to run for organisms other than human (default is hg18), point to one of the files:
                    DATA/mm9.chrdata for mouse,
                    DATA/dm3.chrdata for Drosophila, or
                    DATA/sacser.chrdata for Saccharomyces cerevisiae
--outfile STR       output file (provide the full path)
--format STR        format of the read files: sam, mit, bed, eland. Default is eland.
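
To illustrate how the options combine, here is a sketch of a run on mouse data with reads in SAM format. The file names are placeholders, and passing --format in the same --option=value style as the other options is an assumption based on the command shown above. The command is printed first and executed only if the tool is found on $PATH.

```shell
# Placeholder file names; replace them with your own.
peaks=peaks.txt
chipdirs=chipdirfile.txt
out="$PWD/matrix_mm9.txt"      # --outfile asks for a full path

# Compose the command line from the options documented above.
set -- ChIPseeqerGetReadCountInPeaksMatrix \
    --peakfile="$peaks" \
    --chipdirfile="$chipdirs" \
    --chrdata=DATA/mm9.chrdata \
    --format=sam \
    --outfile="$out"

echo "$@"                      # show what will be run

# Execute only if the tool is actually on the PATH.
if command -v "$1" >/dev/null 2>&1; then
    "$@"
fi
```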

2. See the results.

The result is a peak-based matrix: each row corresponds to a peak, and each column contains the maximum read count for the corresponding ChIP-seq dataset (indicated by the column name).

For example,

	Pol2_2P	Pol2_5P	BCL6	BCOR	PAX5
chrY-57398943-57399486	2.406	1.857	8.112	1.832	1.119
chr1-2211998-2212341	0.344	0.309	23.178	0.234	0.000
chrY-57412321-57413102	1.375	0.928	5.215	1.234	0.000
chrX-13074532-13074581	0.344	0.619	18.543	0.198	0.000
chrY-57415332-57415537	0.344	0.309	3.477	3.546	0.000
chrX-12159949-12159898	0.000	0.309	4.056	4.098	1.789
chrX-12105587-12105784	0.000	0.309	6.954	1.987	4.327
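
A matrix like this can be summarized with standard command-line tools. The sketch below is not part of ChIPseeqer: top_dataset_per_peak is a hypothetical helper that reports, for each peak, the dataset with the highest value. It assumes the matrix is tab-delimited and that the header row starts with an empty cell, as in the example above.

```shell
# top_dataset_per_peak is a hypothetical helper, not a ChIPseeqer tool.
# Output: peak, name of the dataset with the highest value, that value.
top_dataset_per_peak() {
    awk -F'\t' '
        # Header row: remember the dataset label of each column.
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; next }
        # Data rows: scan the columns and keep the first maximum on ties.
        NF > 1 {
            best = 2
            for (i = 3; i <= NF; i++)
                if ($i + 0 > $best + 0) best = i
            printf "%s\t%s\t%s\n", $1, name[best], $best
        }
    ' "$1"
}
```

Run as `top_dataset_per_peak matrix.txt`. With the example matrix, the peak chr1-2211998-2212341 is assigned to BCL6 (23.178).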