Elementolab/ChIPseeqer TODO

TODO

  • CpG/Repeats etc. for other species (only hg18 is supported now)
  • Support hg19 in GUI
  • Fix binomial probabilities in CSAnnotate
  • Make prioritized gene annotation files for dm3
  • CS2FIRE/J-FIRE: need to make J-FIRE available ASAP; CS2FIRE could have a useJFIRE=1 option
  • remove all requirements to use R anywhere in the framework

DONE

  • CS2FIRE: the --genome option is inadequate; many people will use --genome=mm9, and it would be treated as the Drosophila genome
  • create default distal category in CSA
  • clean up QC, add clonal read detection program (or keep separate but remove the Perl code)
  • move QC to main dir and rename to ChIPseeqerQC
  • CSMotifMatch: takes a peak file as input, runs CSAnnotate, combines peaks-genes in gene parts (P,I,E), and looks for pathways using the GO name (GO:000977) or any name included in a pathway (e.g., apoptosis, regulation)
  • Print summary of parameters used in ChIPseeqer
  • Kohonen support with visualization - and merge with CSCluster
  • QC: need to rewrite clonal read analysis in C (Perl is way too slow); more generally, add more QC tools like those in HOMER
  • need to add BAM support
  • show actual output in http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqerGetReadCountInPeakRegions
  • A tool like CSMotifMatch but for pathways - no enrichment involved (CSPathwayMatch)
  • CSPAGE: improve transition and doc between CSAnnotate and CSPAGE. Should have a tool to merge columns of the main CSA output matrix
  • make tool similar to http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqerPlotAverageReadDensityInGenes that shows where peaks (not reads) fall in promoters, exons, etc
  • add support for ENSEMBL/UCSCGenes annotation
  • create documentation for CSgetIntervalReadCounts (rename too) ... let's show in the documentation what the input is like
  • add statistics (avg peak length, etc) to CS
  • add illustrative fig to http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqerGetReadDensityProfiles
  • fix (prog names etc) http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqer_Evaluating_Conservation_Of_Distal_Peaks
  • the profile option of CSCon is not well documented right now at http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqer_Evaluating_Conservation_Of_Distal_Peaks; integrate it better
  • why does CSCon need the human genome as an option?
  • add a tutorial page to explain "To run the tools directly from any folder, you need to add the $CHIPSEEQERDIR and $CHIPSEEQERDIR/SCRIPTS to your $PATH variable. " ...
  • remove '$' signs from the command lines in tutorial ... some people think they have to type $
  • CS: remove warnings when -inputdir not specified (keep otherwise)
  • CSTUT: users don't have to be in CS directory to use the program; we need to tell them to either use $CHIPSEEQERDIR or to add $CHIPSEEQERDIR and $CHIPSEEQERDIR/SCRIPTS to their $PATH
  • RENAME all scripts
  • CSANNOTATE: split Repeats/Cpg/Dups to another script
  • CSTUT: let's make a separate page for Supplementary tools instead of having them at http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqer_Tutorial
  • CS2FIRE: users should be able to set the RNG seed value; right now they can only choose between seed = 1234 and time()
  • CS2FIRE: instead of generating random sequences from the genome, generate random sequences based on a first-order Markov model learned on peak sequences (needs testing)
  • CS2FIRE: start using genome indexing via the SAMtools library, as in SNVseeqer, to dramatically speed up peak sequence extraction
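
Note on the CS2FIRE random-sequence items above: the following is a minimal Python sketch of the general idea, not the CS2FIRE implementation. It learns a first-order Markov model (base-to-base transition probabilities) from peak sequences and generates background sequences with a user-settable RNG seed. The function names, the pseudocount, and the choice to draw the first base from overall base frequencies are all assumptions made for illustration.

```python
import random

ALPHABET = ["A", "C", "G", "T"]


def learn_first_order_model(sequences, pseudocount=1.0):
    """Estimate base frequencies and base-to-base transition probabilities.

    Hypothetical helper: the pseudocount keeps every transition possible,
    and characters outside ACGT (e.g. N) break the chain.
    """
    init = {b: pseudocount for b in ALPHABET}
    trans = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        prev = None
        for base in seq.upper():
            if base not in trans:
                prev = None  # do not count transitions across N or gaps
                continue
            init[base] += 1
            if prev is not None:
                trans[prev][base] += 1
            prev = base
    # Normalize counts to probabilities.
    total = sum(init.values())
    init = {b: c / total for b, c in init.items()}
    for a in ALPHABET:
        row_total = sum(trans[a].values())
        trans[a] = {b: c / row_total for b, c in trans[a].items()}
    return init, trans


def generate_sequence(init, trans, length, rng):
    """Generate one random sequence of the given length from the model."""
    if length <= 0:
        return ""
    # First base: drawn from overall base frequencies (an assumption).
    bases = rng.choices(ALPHABET, weights=[init[b] for b in ALPHABET])
    while len(bases) < length:
        prev = bases[-1]
        weights = [trans[prev][b] for b in ALPHABET]
        bases.append(rng.choices(ALPHABET, weights=weights)[0])
    return "".join(bases)


if __name__ == "__main__":
    peaks = ["ACGTACGTAC", "ACGNNACGT"]  # toy peak sequences
    init, trans = learn_first_order_model(peaks)
    rng = random.Random(1234)  # user-chosen seed gives reproducible output
    print(generate_sequence(init, trans, 20, rng))
```

Passing an explicit random.Random(seed) object, rather than relying on global state, is one way to let users pick any seed while keeping runs reproducible.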