The SNVseeqer package identifies single nucleotide variants (SNVs) from deep-sequencing data and provides comprehensive annotation of the findings (e.g., which amino acid substitutions they give rise to, which genes or RNAs they fall into, and their effects on transcription factor binding). The programs are applicable to, but not restricted to, RNA-seq, DNA-seq, and Exon-seq data.
SNVseeqer is still under development, but we can make it available to you if you want to try it. Send us an email at email@example.com.
SNVseeqer requires a modern C/C++ compiler and works in Linux and MacOS environments.
Elemento lab members and collaborators can get the latest source code from our SVN server:
svn co https://pbtech-vc.med.cornell.edu/public/svn/elementolab/SNPseeqer/trunk SNPseeqer/
svn co https://pbtech-vc.med.cornell.edu/public/svn/elementolab/BIO-C/trunk BIO-C/
svn co https://pbtech-vc.med.cornell.edu/public/svn/elementolab/PERL_MODULES/trunk PERL_MODULES/
Also, install the Perl Compatible Regular Expressions (PCRE) library from http://www.pcre.org/. Follow the instructions there to compile and install the library.
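Before compiling, it can help to confirm that the PCRE installation is visible from your shell. The sketch below assumes the classic PCRE (8.x) distribution, which installs a pcre-config script; whether SNVseeqer links against that series rather than PCRE2 is an assumption here, and report_pcre is a hypothetical helper name.

```shell
# report_pcre: hypothetical helper, not part of SNVseeqer.
# Prints the installed PCRE version and linker flags if pcre-config is on PATH.
report_pcre() {
    if command -v pcre-config >/dev/null 2>&1; then
        echo "PCRE version: $(pcre-config --version)"
        echo "Linker flags: $(pcre-config --libs)"
    else
        # Either PCRE is not installed or its bin directory is not on PATH
        echo "pcre-config not found on PATH"
        return 1
    fi
}

# Example:
#   report_pcre
```

If the command is not found after installing into a non-standard prefix, adding that prefix's bin directory to PATH is usually enough.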
Compilation and installation
Once you have obtained the code (either as a zip file or via SVN), you will need to type the following commands:
cd SNPseeqer/
make
# platform-dependent alternatives:
# make -f Makefile.mac
# make -f Makefile.linux
echo export SNVSEEQERDIR=`pwd` >> ~/.bashrc # add environment variable to startup script; note the backticks!
source ~/.bashrc
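Running the echo line more than once appends a duplicate export to ~/.bashrc each time, and an unquoted path containing spaces would not be exported correctly. A minimal sketch of a more defensive variant follows; add_export_once is a hypothetical helper, not something shipped with SNVseeqer.

```shell
# add_export_once: hypothetical helper, not part of SNVseeqer.
# Appends 'export NAME="VALUE"' to a startup file unless NAME is already exported there.
# Usage: add_export_once NAME VALUE [FILE]   (FILE defaults to ~/.bashrc)
add_export_once() {
    aeo_name=$1
    aeo_value=$2
    aeo_file=${3:-$HOME/.bashrc}
    # Create the file if it does not exist so grep has something to read
    touch "$aeo_file" || return 1
    if grep -q "^export ${aeo_name}=" "$aeo_file"; then
        echo "$aeo_name is already exported in $aeo_file; leaving it unchanged"
    else
        # Quote the value so that paths containing spaces survive
        printf 'export %s="%s"\n' "$aeo_name" "$aeo_value" >> "$aeo_file"
    fi
}

# Example (run from inside the SNPseeqer/ directory):
#   add_export_once SNVSEEQERDIR "$(pwd)"
#   source ~/.bashrc
```

The same helper could be used for the data directory variables set in the next section.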
Genomic data files
To install the full SNVseeqer framework, you will need to download the following files:
- Human genome (release 18)
cd REFDATA
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip
unzip chromFa.zip
cat chr*.fa > wg.fa
echo export HG18DIR=`pwd` >> ~/.bashrc
cd ..
- dbSNP release 130
cd REFDATA
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/snp130.txt.gz
gunzip snp130.txt.gz
perl ../SCRIPTS/split_dbsnp_file.pl snp130.txt
echo export DBSNPDIR=`pwd` >> ~/.bashrc
cd ..
The steps above added the environment variables to ~/.bashrc, so you do not have to tell SNVseeqer where the program and data are every time you run an analysis. For example, your ~/.bashrc should contain something like:
export SNPSEEQERDIR=path_to_your_SNPseeqer_directory
export HG18DIR=path_to_your_SNPseeqer_directory/REFDATA
export DBSNPDIR=path_to_your_SNPseeqer_directory/REFDATA
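After opening a new shell (or sourcing ~/.bashrc), a quick sanity check can confirm that the data variables are set and that the concatenated genome file is in place. This is a minimal sketch; check_dir_var and count_fasta_records are hypothetical helpers, not SNVseeqer commands.

```shell
# check_dir_var and count_fasta_records are hypothetical helpers, not part of SNVseeqer.

# Succeeds if the named variable is set and points to an existing directory.
check_dir_var() {
    cdv_name=$1
    # Indirect lookup of the variable whose name was passed in
    eval "cdv_value=\${$cdv_name:-}"
    if [ -z "$cdv_value" ]; then
        echo "$cdv_name is not set"
        return 1
    fi
    if [ ! -d "$cdv_value" ]; then
        echo "$cdv_name points to a missing directory: $cdv_value"
        return 1
    fi
    echo "$cdv_name=$cdv_value"
}

# Prints the number of sequences (header lines starting with ">") in a FASTA file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example:
#   check_dir_var HG18DIR && count_fasta_records "$HG18DIR/wg.fa"
#   check_dir_var DBSNPDIR
```

Because wg.fa is built with cat chr*.fa, its sequence count includes every chr*.fa file unpacked from chromFa.zip, not only the canonical chromosomes.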
- Optional: base-level conservation scores from PhyloP.
These scores are available from http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phyloP44way/placentalMammals/
mkdir CONSERVATION
cd CONSERVATION
The easiest way to download the conservation scores is to use lftp:
lftp http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phyloP44way/placentalMammals/
lftp> mget *
lftp> exit
gzip -d *.gz
You are now ready to use SNVseeqer. Please take a look at the tutorial for more information on how to use the program.
If you need to add support for the transcriptome of a new genome build or a new species, please visit SNVseeqer/Adding a new annotation database.