Elementolab/ChIPseeqer Species
SUPPORTED SPECIES
The species currently supported in ChIPseeqer are:
- Human (assembly hg18) - Default in all ChIPseeqer programs
- Mus musculus (assembly mm9)
- Drosophila melanogaster (assembly dm3)
- Saccharomyces cerevisiae
If your favorite organism is not in the list and you would like us to add it, please contact us.
To run ChIPseeqer on a species other than human (the default is hg18), use the option -chrdata STR and point it to the matching file:
DATA/mm9.chrdata for mouse,
DATA/dm3.chrdata for Drosophila, or
DATA/sacser.chrdata for Saccharomyces cerevisiae.
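The mapping from species to .chrdata file can be kept in one place with a small shell helper. This is only a sketch: the function name chrdata_for and the species keys (human, mouse, drosophila, yeast) are hypothetical labels chosen for illustration, and the paths are the ones listed on this page, relative to the ChIPseeqer directory.

```shell
# Hypothetical helper (not part of ChIPseeqer): print the .chrdata path
# for a species label. The labels are assumptions; the paths come from this page.
chrdata_for() {
  case "$1" in
    human)      echo "DATA/hg18.chrdata" ;;
    mouse)      echo "DATA/mm9.chrdata" ;;
    drosophila) echo "DATA/dm3.chrdata" ;;
    yeast)      echo "DATA/sacser.chrdata" ;;
    *)          echo "unsupported species: $1" >&2; return 1 ;;
  esac
}
```

The result can then be passed to the option described above, for example -chrdata "$(chrdata_for mouse)".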
Other programs in the ChIPseeqer framework need you to specify:
- the genome (e.g., --genome=hg18), or
- the database (e.g., --db=AceView), or
- the gene annotation files (e.g., annotation=DATA/mm9/refGene.txt.mm9.20APR2010)
These programs are:
- ChIPseeqerAnnotate
- ChIPseeqerSummaryPromoters
- ChIPseeqerFindClosestGenes
- ChIPseeqerFindDistalPeaks
- ChIPseeqerDensityMatrix
- ChIPseeqerPlotAverageReadDensityInGenes
- ChIPseeqerPlotAveragePeaksNumberInGenes
- ChIPseeqerCreateRandomRegions
HOW TO ADD A NEW SPECIES
In order to add a new species we need to:
1. Estimate the mappability of the genome and produce a .chrdata file for the new organism.
- The .chrdata file for the new organism will look like this:
chr10 129993255 0.840
chr11 121843856 0.866
chr12 121257530 0.797
chr13 120284312 0.816
chr14 125194864 0.781
chr15 103494974 0.842
chr16 98319150 0.838
chr17 95272651 0.812
chr18 90772031 0.837
chr19 61342430 0.830
The columns indicate (chromosome_name chromosome_size chromosome_mappability)
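Before installing a new .chrdata file, it may help to sanity-check it against the three-column layout described above. The sketch below assumes whitespace-separated columns, a positive integer size, and a mappability between 0 and 1; the function name check_chrdata is hypothetical and not part of ChIPseeqer.

```shell
# Hypothetical helper: sanity-check a .chrdata file.
# Assumes columns: chromosome_name chromosome_size chromosome_mappability
check_chrdata() {
  awk '
    NF == 0 { next }   # ignore blank lines
    NF != 3 {
      printf "line %d: expected 3 columns, got %d\n", NR, NF
      bad = 1; next
    }
    $2 !~ /^[0-9]+$/ || $2 + 0 == 0 {
      printf "line %d: size is not a positive integer: %s\n", NR, $2
      bad = 1
    }
    $3 !~ /^[0-9]*\.?[0-9]+$/ || $3 + 0 > 1 {
      printf "line %d: mappability is not in [0,1]: %s\n", NR, $3
      bad = 1
    }
    END { exit (bad ? 1 : 0) }   # non-zero exit status if any line failed
  ' "$1"
}
```

Called as check_chrdata DATA/mm9.chrdata, it prints one message per offending line and returns a non-zero status if any line fails.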
- To create this file we follow the steps here.
- Once we create the .chrdata file,
- we add it to the ChIPseeqer/DATA/ folder
- we update the ChIPseeqer/CommonPaths.pm file
our $HG18_CHRDATA = "$ENV{CHIPSEEQERDIR}/DATA/hg18.chrdata";
our $MM9_CHRDATA = "$ENV{CHIPSEEQERDIR}/DATA/mm9.chrdata";
our $DM3_CHRDATA = "$ENV{CHIPSEEQERDIR}/DATA/dm3.chrdata";
our $SACCER_CHRDATA = "$ENV{CHIPSEEQERDIR}/DATA/sacser.chrdata";
our $NEW_CHRDATA = "$ENV{CHIPSEEQERDIR}/DATA/new_species.chrdata"; # placeholder name for the new species' file
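After editing CommonPaths.pm, one way to catch a path that points to a missing file is to compare the declared .chrdata paths with what is on disk. This is a rough sketch: the function name missing_chrdata is hypothetical, and it assumes every declaration follows the exact pattern shown above (our $NAME_CHRDATA = "$ENV{CHIPSEEQERDIR}/...";).

```shell
# Hypothetical helper: print every .chrdata path declared in CommonPaths.pm
# that does not exist under the ChIPseeqer directory.
#   $1 = path to CommonPaths.pm
#   $2 = ChIPseeqer installation directory (the value of CHIPSEEQERDIR)
missing_chrdata() {
  # Extract the part of each path that follows $ENV{CHIPSEEQERDIR}/
  sed -n 's/^our \$[A-Za-z0-9_]*_CHRDATA *= *"\$ENV{CHIPSEEQERDIR}\/\(.*\)";.*$/\1/p' "$1" |
  while read -r rel; do
    [ -f "$2/$rel" ] || echo "$rel"
  done
}
```

For example, missing_chrdata "$CHIPSEEQERDIR/CommonPaths.pm" "$CHIPSEEQERDIR" prints nothing when every declared file is present.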
2. Download the gene annotation files
- Add a new folder for the new species in $CHIPSEEQERDIR/DATA
cd $CHIPSEEQERDIR/DATA
mkdir new_species
- Download the gene annotation file (e.g., from the UCSC Table Browser) into the new folder
- Run the script:
make_gene_annotation_files.pl --annotation=$CHIPSEEQERDIR/DATA/new_species/file
All the files needed by the ChIPseeqer framework will be created.
- We need to update the ChIPseeqer/CommonPaths.pm file and all ChIPseeqer scripts that need to use the annotation files:
- ChIPseeqerAnnotate
- ChIPseeqerSummaryPromoters
- ChIPseeqerFindClosestGenes
- ChIPseeqerFindDistalPeaks
- ChIPseeqerDensityMatrix
- ChIPseeqerPlotAverageReadDensityInGenes
- ChIPseeqerPlotAveragePeaksNumberInGenes
- ChIPseeqerCreateRandomRegions
