Elementolab/ChIPseeqer TODO
TODO
- CS2FIRE/J-FIRE: make J-FIRE available ASAP; CS2FIRE could have a useJFIRE=1 option
- remove all requirements to use R anywhere in the framework
DONE
- CS2FIRE: the --genome option is inadequate: many people will use --genome=mm9, and it would be treated as the Drosophila genome
- create default distal category in CSA
- clean up QC, add clonal read detection program (or keep separate but remove the Perl code)
- move QC to main dir and rename to ChIPseeqerQC
- CSMotifMatch: takes a peak file as input, runs CSAnnotate, combines peak-gene pairs by gene part (P, I, E), and looks for pathways using the GO name (GO:000977) or any name included in a pathway (e.g., apoptosis, regulation)
- Print summary of parameters used in ChIPseeqer
- Kohonen support with visualization - and merge with CSCluster
- QC: rewrite clonal read analysis in C (the Perl version is far too slow); more generally, add more QC tools, as in HOMER
- need to add BAM support
- show actual output in http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqerGetReadCountInPeakRegions
- A tool like CSMotifMatch but for pathways - no enrichment involved (CSPathwayMatch)
- CSPAGE: improve the transition and documentation between CSAnnotate and CSPAGE; there should be a tool to merge columns of the main CSA output matrix
- make a tool similar to http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqerPlotAverageReadDensityInGenes that shows where peaks (not reads) fall in promoters, exons, etc.
- add support for ENSEMBL/UCSCGenes annotation
- create documentation for CSgetIntervalReadCounts (and rename it); show in the documentation what the input looks like
- add statistics (avg peak length, etc) to CS
- add illustrative fig to http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqerGetReadDensityProfiles
- fix (prog names etc) http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqer_Evaluating_Conservation_Of_Distal_Peaks
- the profile option of CSCon is not well documented right now at http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqer_Evaluating_Conservation_Of_Distal_Peaks; integrate it better
- why does CSCon need the human genome as an option?
- add a tutorial page to explain "To run the tools directly from any folder, you need to add the $CHIPSEEQERDIR and $CHIPSEEQERDIR/SCRIPTS to your $PATH variable."
- remove '$' signs from the command lines in the tutorial; some people think they have to type the $
- CS: remove warnings when -inputdir not specified (keep otherwise)
- CSTUT: users don't have to be in CS directory to use the program; we need to tell them to either use $CHIPSEEQERDIR or to add $CHIPSEEQERDIR and $CHIPSEEQERDIR/SCRIPTS to their $PATH
- RENAME all scripts
- CSANNOTATE: split Repeats/Cpg/Dups to another script
- CSTUT: let's make a separate page for Supplementary tools instead of having them at http://icb.med.cornell.edu/wiki/index.php/Elementolab/ChIPseeqer_Tutorial
- CS2FIRE: users should be able to set the RNG seed value; right now they can only choose between seed = 1234 and time()
- CS2FIRE: instead of generating random sequences from the genome, generate random sequences based on a first-order Markov model learned on peak sequences (needs testing)
- CS2FIRE: start using genome indexing via the SAMtools library, as in SNVseeqer, to speed up (dramatically) peak sequence extraction
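
For reference, the two CS2FIRE items above on a user-settable RNG seed and first-order Markov background sequences can be sketched roughly as follows. This is a minimal illustration in Python, not the CS2FIRE code: the function names (learn_first_order_model, generate_sequence), the pseudocount, and the choice to restart the chain at non-ACGT bases are all assumptions made for the example.

```python
import random

ALPHABET = "ACGT"

def learn_first_order_model(sequences, pseudocount=1.0):
    """Count start bases and base-to-base transitions in the peak sequences.

    Hypothetical helper: the pseudocount keeps unseen transitions possible.
    """
    init = {b: pseudocount for b in ALPHABET}
    trans = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        prev = None
        for ch in seq.upper():
            if ch not in trans:
                prev = None  # assumption: restart the chain at N or other symbols
                continue
            if prev is None:
                init[ch] += 1
            else:
                trans[prev][ch] += 1
            prev = ch
    return init, trans

def generate_sequence(init, trans, length, rng):
    """Sample one sequence of the given length from the learned model."""
    if length <= 0:
        return ""
    bases = list(ALPHABET)
    # First base from the start counts, then each base conditioned on the previous one.
    seq = rng.choices(bases, weights=[init[b] for b in bases])
    while len(seq) < length:
        row = trans[seq[-1]]
        seq.append(rng.choices(bases, weights=[row[b] for b in bases])[0])
    return "".join(seq)

if __name__ == "__main__":
    peaks = ["ACGTACGTAC", "ACACACAC"]  # toy peak sequences
    init, trans = learn_first_order_model(peaks)
    # Any integer seed gives a reproducible run; random.Random(None) seeds from the OS or clock.
    rng = random.Random(2024)
    print(generate_sequence(init, trans, 30, rng))
```

Passing the seed in as a plain integer, rather than hard-coding 1234 or falling back to time(), lets a user reproduce a given set of background sequences exactly.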
