Elementolab/ChIPseeqer Species

SUPPORTED SPECIES

The species currently supported in ChIPseeqer are:

  • Human (assembly hg18) - Default in all ChIPseeqer programs
  • Mus musculus (assembly mm9)
  • Drosophila melanogaster (assembly dm3)
  • Saccharomyces cerevisiae

If your favorite organism is not in the list and you would like us to add it, please contact us.

To run ChIPseeqer on a species other than human, make sure to use the option:

-chrdata STR     to run for organisms other than human (default is hg18), point to one of the files:
                 DATA/mm9.chrdata for mouse, 
                 DATA/dm3.chrdata for drosophila or 
                 DATA/sacser.chrdata for Saccharomyces cerevisiae
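
As an illustration, the mapping between species and .chrdata files described above can be captured in a small helper. This is a minimal Python sketch, not part of ChIPseeqer: the function name chrdata_args and the species labels used as dictionary keys are hypothetical; only the -chrdata option and the file paths come from the option description.

```python
# Species label -> .chrdata path, copied from the option description.
# The labels ("mouse", "drosophila", "yeast") are hypothetical keys.
CHRDATA_FILES = {
    "mouse": "DATA/mm9.chrdata",
    "drosophila": "DATA/dm3.chrdata",
    "yeast": "DATA/sacser.chrdata",
}


def chrdata_args(species):
    """Return the extra command-line arguments needed for a species."""
    if species == "human":
        return []  # hg18 is the default, so no option is needed
    if species not in CHRDATA_FILES:
        raise ValueError(f"no .chrdata file known for species: {species}")
    return ["-chrdata", CHRDATA_FILES[species]]


print(chrdata_args("mouse"))
```

The returned list can be appended to whatever ChIPseeqer command line you are building.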

Some other programs in the ChIPseeqer framework also require you to specify

* the genome (e.g., --genome=hg18), or
* the database (e.g., --db=AceView), or
* the gene annotation files (e.g., annotation=DATA/mm9/refGene.txt.mm9.20APR2010).

These programs are:

ChIPseeqerAnnotate
ChIPseeqerSummaryPromoters
ChIPseeqerFindClosestGenes
ChIPseeqerFindDistalPeaks
ChIPseeqerDensityMatrix
ChIPseeqerPlotAverageReadDensityInGenes
ChIPseeqerPlotAveragePeaksNumberInGenes
ChIPseeqerCreateRandomRegions

HOW TO ADD A NEW SPECIES

In order to add a new species we need to:

1. Estimate the mappability of the genome and produce a .chrdata file for the new organism.

  • The .chrdata file for the new organism will look like this:
chr10	129993255	0.840
chr11	121843856	0.866
chr12	121257530	0.797
chr13	120284312	0.816
chr14	125194864	0.781
chr15	103494974	0.842
chr16	98319150	0.838
chr17	95272651	0.812
chr18	90772031	0.837
chr19	61342430	0.830

The tab-separated columns are: chromosome name, chromosome size (in bp), and chromosome mappability.
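
To make the three-column layout concrete, here is a minimal Python sketch that reads and sanity-checks .chrdata content. It is not part of ChIPseeqer: the function name read_chrdata is hypothetical, and the checks (positive size, mappability between 0 and 1) are assumptions based on the example rows. The size-weighted sum at the end is only an illustration of how the columns relate, not a statement of how ChIPseeqer uses them.

```python
import io


def read_chrdata(handle):
    """Parse .chrdata lines into (name, size, mappability) tuples."""
    rows = []
    for line_no, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(fields)}")
        name, size, mappability = fields[0], int(fields[1]), float(fields[2])
        # Assumed constraints: positive size, mappability as a fraction in [0, 1].
        if size <= 0 or not 0.0 <= mappability <= 1.0:
            raise ValueError(f"line {line_no}: implausible values")
        rows.append((name, size, mappability))
    return rows


# Two rows taken from the example file above.
example = io.StringIO("chr18\t90772031\t0.837\nchr19\t61342430\t0.830\n")
rows = read_chrdata(example)

# Illustration only: number of bases weighted by mappability.
mappable_bp = sum(size * mappability for _, size, mappability in rows)
print(rows)
print(round(mappable_bp))
```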

  • To create this file we follow the steps here.

IMPORTANT: If the name of the species is x, make sure that you name the file x.chrdata

IMPORTANT: Put the x.chrdata file in $CHIPSEEQERDIR/DATA
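
The two naming rules above (file name x.chrdata, location $CHIPSEEQERDIR/DATA) can be sketched as follows. This is a hedged Python example, not an official tool: the function names chrdata_path and write_chrdata are hypothetical, and a temporary directory stands in for $CHIPSEEQERDIR so the snippet runs anywhere.

```python
import tempfile
from pathlib import Path


def chrdata_path(species, chipseeqer_dir):
    """Return the expected location of the .chrdata file for a species."""
    return Path(chipseeqer_dir) / "DATA" / f"{species}.chrdata"


def write_chrdata(rows, species, chipseeqer_dir):
    """Write (name, size, mappability) rows as tab-separated lines."""
    path = chrdata_path(species, chipseeqer_dir)
    with open(path, "w") as out:
        for name, size, mappability in rows:
            out.write(f"{name}\t{size}\t{mappability:.3f}\n")
    return path


# Demo in a temporary directory standing in for $CHIPSEEQERDIR.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "DATA").mkdir()
    written = write_chrdata([("chr19", 61342430, 0.830)], "x", tmp)
    file_name = written.name
    content = written.read_text()

print(file_name)
```

In a real setup you would pass os.environ["CHIPSEEQERDIR"] instead of the temporary directory.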

2. Download the gene annotation files

  • Add a new folder for the new species x in $CHIPSEEQERDIR/DATA
cd $CHIPSEEQERDIR/DATA
mkdir x
  • Download the gene annotation file (e.g., from the UCSC Table Browser) into the new folder
  • Run the script:
make_gene_annotation_files.pl --annotation=$CHIPSEEQERDIR/DATA/x/file

All the files needed by the ChIPseeqer framework will be created.
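
The folder and command for step 2 can be assembled programmatically. The Python sketch below only builds the strings and does not execute anything. The function name annotation_command, the install path /opt/ChIPseeqer, and the file name refGene.txt are hypothetical placeholders; the script name and the --annotation option are taken from the text above.

```python
from pathlib import PurePosixPath


def annotation_command(chipseeqer_dir, species, annotation_file):
    """Build the species folder path and the script command line."""
    species_dir = PurePosixPath(chipseeqer_dir) / "DATA" / species
    command = [
        "make_gene_annotation_files.pl",
        f"--annotation={species_dir / annotation_file}",
    ]
    return species_dir, command


# Hypothetical install location and annotation file name.
species_dir, command = annotation_command("/opt/ChIPseeqer", "x", "refGene.txt")
print("mkdir -p", species_dir)
print(" ".join(command))
```

To actually run the script, the command list could be passed to subprocess.run, assuming make_gene_annotation_files.pl is on your PATH.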

IMPORTANT: If the database is refSeq, Ensembl, AceView, or UCSCGenes, make sure to rename all the created annotation files so that they look like this (shown for refSeq):

refSeq
refSeq.EXONS
refSeq.GENEPARTS
refSeq.INTRONS
refSeq.new
refSeq.NM2ORF
refSeq.oneperTSS
refSeq.TSS_TES
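
Renaming eight files by hand is error-prone, so a small helper may be useful. The Python sketch below rests on an assumption that is not confirmed by this page: that the generated files all share the name of the downloaded annotation file as a prefix (here the hypothetical refGene.txt), followed by suffixes such as .EXONS. Check the actual names in your folder first. The function names plan_renames and apply_renames are hypothetical.

```python
import tempfile
from pathlib import Path


def plan_renames(folder, old_prefix, db):
    """List (source, target) pairs replacing old_prefix with the database name."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        name = path.name
        # Match the bare prefix or the prefix followed by a dotted suffix.
        if name == old_prefix or name.startswith(old_prefix + "."):
            suffix = name[len(old_prefix):]
            plan.append((path, path.with_name(db + suffix)))
    return plan


def apply_renames(plan):
    """Perform the renames computed by plan_renames."""
    for src, dst in plan:
        src.rename(dst)


# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["refGene.txt", "refGene.txt.EXONS", "refGene.txt.TSS_TES"]:
        (Path(tmp) / name).touch()
    apply_renames(plan_renames(tmp, "refGene.txt", "refSeq"))
    renamed = sorted(p.name for p in Path(tmp).iterdir())

print(renamed)
```

Building the plan before renaming makes it easy to print and review the pairs before any file is touched.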
  • The ChIPseeqer scripts will automatically use the new species files, provided that you specify:
--genome=x
--db=refSeq (or AceView, Ensembl, UCSCGenes)
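
Before running with --genome=x and --db=refSeq, it can help to confirm that the expected files are in place. The following Python sketch checks $CHIPSEEQERDIR/DATA/x for the file names listed above. It is not part of ChIPseeqer: the function name missing_annotation_files is hypothetical, and applying the same suffix list to the other databases (AceView, Ensembl, UCSCGenes) is an assumption.

```python
import tempfile
from pathlib import Path

# Suffixes taken from the refSeq file list above; "" is the bare file.
ANNOTATION_SUFFIXES = [
    "", ".EXONS", ".GENEPARTS", ".INTRONS",
    ".new", ".NM2ORF", ".oneperTSS", ".TSS_TES",
]


def missing_annotation_files(chipseeqer_dir, genome, db):
    """Return the expected annotation file names that are not present."""
    folder = Path(chipseeqer_dir) / "DATA" / genome
    return [
        db + suffix
        for suffix in ANNOTATION_SUFFIXES
        if not (folder / (db + suffix)).is_file()
    ]


# Demo: a temporary directory stands in for $CHIPSEEQERDIR.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "DATA" / "x"
    folder.mkdir(parents=True)
    for name in ["refSeq", "refSeq.EXONS"]:
        (folder / name).touch()
    missing = missing_annotation_files(tmp, "x", "refSeq")

print(missing)
```

An empty list means every expected file was found.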