Elementolab/ChIPseeqer FAQ

ChIPseeqer Frequently Asked Questions

  1. How can I install the software?
    See Installation Instructions page.
  2. I am having problems installing some of the required packages from MacPorts.
    See here for known issues on Mac OS X installation.
  3. How long will it take to run the analysis?
    Check out this page
  4. I have a text file with genomic coordinates in the format "chromosome start end". When I use it in any of the ChIPseeqer tools, I get no results. What is the problem?
    You can use any file with genomic coordinates in ChIPseeqer, even if the file comes from another peak detection program.
    However, you are likely to get no results (or errors) if you have processed the file in Excel.
    Excel may end each line with a carriage return character (\r), which Unix-based systems do not recognize as a line ending.
    Simply replace each carriage return (\r) with the newline character (\n) and run ChIPseeqer on the converted file.
    Here's how you can do it in Perl (the pattern also handles Windows-style \r\n endings without creating blank lines): perl -pi -e 's/\r\n?/\n/g' file.txt
    For other ways to do it, see: How do I convert between Unix and Windows text files?
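    The same conversion can also be sketched in Python. This is a minimal illustration, not part of ChIPseeqer; the function names normalize_line_endings and convert_file are made up for this example.

```python
from pathlib import Path


def normalize_line_endings(data: bytes) -> bytes:
    """Convert Windows (CRLF) and classic Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so that a CRLF pair does not become two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with Unix line endings."""
    p = Path(path)
    p.write_bytes(normalize_line_endings(p.read_bytes()))
```

    Calling convert_file("file.txt") rewrites the file in place, so keep a copy of the original if you still need it.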
  5. When I run a ChIPseeqer tool, I get the error "No such file or directory", but I am sure the file exists. Why do I get this error?
    The path you used to specify the file or folder in the command may contain spaces.
    For example: ChIPseeqer.bin -chipdir MY CHIP FOLDER/ -inputdir MY INPUT FOLDER/ -t 15 -fold_t 2 -format eland -outfile TF_targets.txt
    Remove the spaces from the file and folder names (e.g., replace them with "_") and run the command again.
    For example: ChIPseeqer.bin -chipdir MY_CHIP_FOLDER/ -inputdir MY_INPUT_FOLDER/ -t 15 -fold_t 2 -format eland -outfile TF_targets.txt
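    If you have many folders to rename, the renaming can be scripted. The following Python sketch is an illustration only; the helper name rename_without_spaces is hypothetical, and it renames just the last component of the path (a parent folder with spaces would need the same treatment).

```python
import os


def rename_without_spaces(path: str) -> str:
    """Rename the last component of `path`, replacing spaces with underscores.

    Returns the new path. Nothing is renamed if the name has no spaces.
    """
    old_path = path.rstrip("/")
    parent, name = os.path.split(old_path)
    new_path = os.path.join(parent, name.replace(" ", "_"))
    if new_path != old_path:
        os.rename(old_path, new_path)
    return new_path
```

    For instance, rename_without_spaces("MY CHIP FOLDER/") renames the folder to MY_CHIP_FOLDER and returns the new name, which can then be passed to -chipdir.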
  6. I run the ChIPseeqer.bin program for peak detection and, although I get no errors, I get "0 peaks detected so far". What am I doing wrong?
    You may not have selected the correct format for your ChIP/input reads. Specify it with the -format option.
    For example: ChIPseeqer.bin -chipdir MYCHIPFOLDER/ -inputdir MYINPUTFOLDER/ -t 15 -fold_t 2 -format sam -outfile TF_targets.txt
    If you omit the -format option, the expected format is eland.
    See the supported formats here. If the format you want to use is not supported yet, contact us.
  7. Should I change the default parameters of ChIPseeqer in order to detect broad domain histone modifications?
    Yes. To capture peaks of broad domain histone modifications (e.g., H3K36me3, H3K79me2), we suggest tuning the following ChIPseeqer parameters:
    -t [significance negative log p-value [ratio] threshold for peaks. Thus, 15 means 10^-15. Default is 15.]
    -mindist [minimum distance between peaks (merge subpeaks otherwise). Default is 100bp.]
    Using a lower -t threshold (such as -t 5) includes peaks that are not very "sharp".
    Increasing -mindist, the minimum distance between peaks (to values such as 1000 or 10000), merges contiguous enriched regions into one large peak.
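    To see why a larger minimum distance yields broader peaks, consider the following Python sketch. It is a conceptual illustration, not ChIPseeqer's actual implementation: the function merge_close_peaks is hypothetical, and it assumes that two peaks on the same chromosome are merged when the gap between them is smaller than mindist.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end)


def merge_close_peaks(peaks: List[Peak], mindist: int) -> List[Peak]:
    """Merge peaks on the same chromosome separated by less than `mindist` bp."""
    merged: List[Peak] = []
    # Sorting groups peaks by chromosome and orders them by start position.
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < mindist:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


peaks = [("chr1", 100, 200), ("chr1", 250, 400), ("chr1", 2000, 2100)]
narrow = merge_close_peaks(peaks, mindist=100)    # only the first two peaks merge
broad = merge_close_peaks(peaks, mindist=10000)   # all three merge into one region
```

    With the small threshold the distant third peak stays separate, while the large threshold joins all three into a single broad region.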
  8. I don't understand the columns of the ChIPseeqer peak detection output.
    The output columns are explained here.
  9. How do I use ChIPseeqer (for peak finding) on non-human data?
    Simply use the -chrdata option and point it to one of these files:
    $CHIPSEEQERDIR/DATA/mm9.chrdata # mouse (make sure that the reads were mapped to mm9)
    $CHIPSEEQERDIR/DATA/dm3.chrdata # Drosophila melanogaster
    $CHIPSEEQERDIR/DATA/sacser.chrdata # yeast
    In other programs, such as ChIPseeqerAnnotate, use the --genome option to specify non-hg18 species, e.g., --genome=mm9
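    If you launch ChIPseeqer from a script, the choice of chrdata file can be made explicit. The Python sketch below is only an illustration: the helper names chrdata_path and peak_command and the example install directory are hypothetical, while the file names and the -chipdir, -inputdir, -chrdata and -outfile options are the ones mentioned in this FAQ.

```python
from typing import List

# chrdata file names listed in the answer above, keyed by a short genome label.
CHRDATA_FILES = {
    "mm9": "mm9.chrdata",        # mouse
    "dm3": "dm3.chrdata",        # Drosophila melanogaster
    "sacser": "sacser.chrdata",  # yeast
}


def chrdata_path(genome: str, chipseeqer_dir: str) -> str:
    """Return the chrdata file for `genome` under the ChIPseeqer DATA folder."""
    if genome not in CHRDATA_FILES:
        raise ValueError(f"no chrdata file known for genome {genome!r}")
    return f"{chipseeqer_dir.rstrip('/')}/DATA/{CHRDATA_FILES[genome]}"


def peak_command(chipdir: str, inputdir: str, outfile: str,
                 genome: str, chipseeqer_dir: str) -> List[str]:
    """Build the argument list for a peak detection run on a non-human genome."""
    return [
        "ChIPseeqer.bin",
        "-chipdir", chipdir,
        "-inputdir", inputdir,
        "-chrdata", chrdata_path(genome, chipseeqer_dir),
        "-outfile", outfile,
    ]


# Example with a hypothetical install directory; in practice, pass the value
# of your CHIPSEEQERDIR environment variable.
cmd = peak_command("MYCHIPFOLDER/", "MYINPUTFOLDER/", "TF_targets.txt",
                   "mm9", "/opt/ChIPseeqer")
```

    The resulting list can be passed to subprocess.run(cmd) once ChIPseeqer.bin is on your PATH.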
  10. How is the statistical significance of peak overlap assessed?
    To assess the statistical significance of the observed overlap x between two peak files, several background peak files are first created (by default 1,000). Each has the same size and the same genomic distribution (i.e., the percentage of peaks in promoters, exons, intergenic regions, etc.) as the first peak file of the comparison.
    Then, each background file is compared with the second peak file to compute its overlap.
    Finally, a z-score is computed, showing how far x lies from the mean of the overlaps obtained from the background files.
    This process is performed in ChIPseeqerCompareIntervals.
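    The final step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used by ChIPseeqerCompareIntervals: the function overlap_zscore is hypothetical, the z-score is taken to be the usual (x - mean) / standard deviation, and the sample standard deviation is an arbitrary choice.

```python
from statistics import mean, stdev
from typing import Sequence


def overlap_zscore(observed: float, background: Sequence[float]) -> float:
    """Z-score of the observed overlap relative to the background overlaps."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("background overlaps have zero variance")
    return (observed - mu) / sigma


# Toy numbers for illustration only; a real run would use one overlap value
# per background file (1,000 by default).
background_overlaps = [10, 12, 14, 16, 18]
z = overlap_zscore(24, background_overlaps)
```

    A large positive z-score indicates that the two peak files overlap more than expected from background files with the same size and genomic distribution.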
  11. How can I set an environment variable?
    See instructions here.
  12. How can I cite ChIPseeqer?
    If you use ChIPseeqer in your research, please cite the following paper:
    An integrated ChIP-seq analysis platform with customizable workflows, Eugenia G Giannopoulou, Olivier Elemento, BMC Bioinformatics 2011, 12:277
  13. Whom should I contact with a question about ChIPseeqer?
    You can contact:
    Jenny Giannopoulou or
    Olivier Elemento