Elementolab/ChIPseeqer QC

Back to Elementolab/ChIPseeqer_Tutorial

QC Analysis

This step provides a fast QC analysis of the ChIP-seq reads in terms of:

  1. Coverage
  2. Clonal reads
  3. Reads that were uniquely mapped to the genome
  4. GC-rich content

Coverage analysis

This analysis computes the genome-wide distribution of read coverage. It shows whether a ChIP-seq experiment contains high-coverage regions and how much of the genome they span.

ChIPseeqerReadCountDistribution -chipdir ChIP/ -fraglen 0 -chrdata $CHIPSEEQERDIR/DATA/hg18.chrdata -uniquereads 1 -normalize 0 -format sam

Output:

COVERAGE	READS	%READS	NUCLEOTIDES
0X	2620784396	85.08%	94348238256
1X	390650661	12.68%	14063423796
2X	53345778	1.73%	1920448008
3X	9240729	0.30%	332666244
4X	2779890	0.09%	100076040
5X	1255976	0.04%	45215136
6X	693433	0.02%	24963588
7X	426743	0.01%	15362748
8X	284987	0.01%	10259532
9X	199419	0.01%	7179084
10X	147840	0.00%	5322240
>10X	626199	0.02%	22543164
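
The table above can be summarized with a few lines of Python. The following is a minimal sketch, not part of ChIPseeqer: it assumes the output has been captured as tab- or space-separated text with the header shown above, and the helper names `parse_coverage_table` and `fraction_covered` are hypothetical. It reads the second column for each coverage level and reports the fraction that falls at 1X or higher. For the full table above, the %READS column gives 100% - 85.08%, or about 14.9%.

```python
def parse_coverage_table(text):
    """Map each coverage label (e.g. "0X", ">10X") to the count in the second column."""
    counts = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header row and any blank or malformed lines.
        if len(fields) < 2 or fields[0] == "COVERAGE":
            continue
        counts[fields[0]] = int(fields[1])
    return counts


def fraction_covered(counts):
    """Fraction of all counted entries with coverage of at least 1X."""
    total = sum(counts.values())
    return 1.0 - counts["0X"] / total


if __name__ == "__main__":
    # First three data rows of the output above, for illustration only.
    sample = (
        "COVERAGE\tREADS\t%READS\tNUCLEOTIDES\n"
        "0X\t2620784396\t85.08%\t94348238256\n"
        "1X\t390650661\t12.68%\t14063423796\n"
        "2X\t53345778\t1.73%\t1920448008\n"
    )
    counts = parse_coverage_table(sample)
    print(f"{fraction_covered(counts):.2%} at 1X or more (first three rows only)")
```

Parsing the counts rather than the printed percentages avoids the rounding visible in the %READS column, where the 10X row is shown as 0.00%.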

Clonal reads

This analysis computes the percentage of clonal (duplicate) reads.

ChIPseeqerGetNumClonalReads -chipdir ChIP/ -format sam -chrdata $CHIPSEEQERDIR/DATA/hg18.chrdata

Output:

clonal fraction removed =  8.9% (1537250/17257489)
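
The reported percentage is the number of clonal reads divided by the total number of reads, as given in the parentheses. The following Python sketch recomputes it from the output line. It assumes the line always has the form shown above, and the names `parse_clonal_line` and `clonal_percentage` are hypothetical helpers, not part of ChIPseeqer.

```python
import re

# Matches lines such as: clonal fraction removed =  8.9% (1537250/17257489)
CLONAL_LINE = re.compile(
    r"clonal fraction removed\s*=\s*([\d.]+)%\s*\((\d+)/(\d+)\)"
)


def parse_clonal_line(line):
    """Return (clonal_reads, total_reads) from a 'clonal fraction removed' line."""
    match = CLONAL_LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return int(match.group(2)), int(match.group(3))


def clonal_percentage(clonal, total):
    """Percentage of reads flagged as clonal."""
    return 100.0 * clonal / total


if __name__ == "__main__":
    clonal, total = parse_clonal_line(
        "clonal fraction removed =  8.9% (1537250/17257489)"
    )
    print(f"{clonal_percentage(clonal, total):.1f}% clonal ({clonal} of {total} reads)")
```

Keeping the raw counts makes it easy to compare samples of different sequencing depth, since the percentage alone hides the total number of reads.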

Unique reads and GC content

1. Go to the ChIPseeqer directory.

cd ChIPseeqer

2. Type the command:

./ChIPseeqerQC --files="*.gz" [or --datafolder=FOLDER] --format=sam --qcType=all

The following options are available:

--format=STR	can be sam, eland, or exteland
--files=FILES	specifies the files to process, e.g. --files="*.gz"
--datafolder=DIR	points to a directory containing the files to process
--qcType=STR	can be all, showUnique, or showGCRich
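
When the QC step is scripted, it can help to check the option values before launching the tool. The following Python sketch builds the argument list using only the options listed above. The function name `build_qc_command` is hypothetical, and the rule that exactly one of `--files` or `--datafolder` must be given is an assumption based on the "or" in the usage line.

```python
import shlex

# Values documented in the options list above.
FORMATS = {"sam", "eland", "exteland"}
QC_TYPES = {"all", "showUnique", "showGCRich"}


def build_qc_command(fmt, qc_type, files=None, datafolder=None):
    """Return the ChIPseeqerQC argument list after validating the option values."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported --format: {fmt}")
    if qc_type not in QC_TYPES:
        raise ValueError(f"unsupported --qcType: {qc_type}")
    # Assumption: the tool takes either --files or --datafolder, not both.
    if (files is None) == (datafolder is None):
        raise ValueError("give exactly one of files or datafolder")
    if files is not None:
        source = f"--files={files}"
    else:
        source = f"--datafolder={datafolder}"
    return ["./ChIPseeqerQC", source, f"--format={fmt}", f"--qcType={qc_type}"]


if __name__ == "__main__":
    argv = build_qc_command("sam", "all", files="*.gz")
    # shlex.join quotes the glob so that the shell would not expand it.
    print(shlex.join(argv))
```

The list can be passed to `subprocess.run(argv, check=True)` from inside the ChIPseeqer directory. Because no shell is involved, the `*.gz` pattern reaches the tool unexpanded, which is the same effect as the quotes in the command above.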