Elementolab/SNVseeqer Install

Introduction

The SNVseeqer package identifies single nucleotide variants (SNVs) from deep-sequencing data and provides comprehensive annotation of the findings (e.g., determining which amino acid substitutions they give rise to, which genes or RNAs they fall into, their effects on transcription factor binding, etc.). The programs are applicable to, but not restricted to, RNA-seq, DNA-seq, and Exon-seq data.

SNVseeqer is still under development, but we can make it available to you if you want to try it. Send us an email at ole2001@med.cornell.edu.

System Requirements

SNVseeqer requires a modern C/C++ compiler and works in Linux and MacOS environments.
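
Before building, it can help to confirm that the tools used on this page are on your PATH. The following is only a sketch: the helper name missing_tools is hypothetical, and the list of tools is an assumption drawn from the commands shown on this page, not an official requirements list.

```shell
#!/bin/sh
# Print the name of every tool given as an argument that is not found on PATH.
# missing_tools is a hypothetical helper, not part of SNVseeqer.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools used by the commands on this page (an assumed, not authoritative, list).
missing=$(missing_tools gcc g++ make svn perl wget unzip)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed tools were found."
fi
```

Any tool reported as missing can usually be installed through your system's package manager.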

Getting SNVseeqer

Elemento lab members and collaborators can get the latest source code from our SVN server:

svn co https://pbtech-vc.med.cornell.edu/public/svn/elementolab/SNPseeqer/trunk SNPseeqer/ 
svn co https://pbtech-vc.med.cornell.edu/public/svn/elementolab/BIO-C/trunk BIO-C/
svn co https://pbtech-vc.med.cornell.edu/public/svn/elementolab/PERL_MODULES/trunk PERL_MODULES/ 

You will also need the Perl Compatible Regular Expressions (PCRE) library, available from http://www.pcre.org/. Follow the instructions there to compile and install the library.

Compilation and installation

Once you have obtained the code (either as a zip file or via SVN), run the following commands:

cd SNPseeqer/
make
# platform-dependent alternatives: 
#  make -f Makefile.mac
#  make -f Makefile.linux 
echo export SNVSEEQERDIR=`pwd` >> ~/.bashrc      # add environment variable to startup script; note the backticks around pwd
source ~/.bashrc

Genomic data files

To install the full SNVseeqer framework, you will need to download the following files. The commands below are run from inside the SNPseeqer/ directory:

  • Human genome (release 18)
cd REFDATA
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip
unzip chromFa.zip
cat chr*.fa > wg.fa
echo export HG18DIR=`pwd` >> ~/.bashrc
cd ..
  • dbSNP release 130
cd REFDATA
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/snp130.txt.gz
gunzip snp130.txt.gz
perl ../SCRIPTS/split_dbsnp_file.pl snp130.txt
echo export DBSNPDIR=`pwd` >> ~/.bashrc
cd ..
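
As a quick sanity check on the concatenated genome file, you can compare the number of FASTA records in wg.fa with the number of chromosome files it was built from. This is only a sketch: count_fasta_records is a hypothetical helper, and the expectation that the two counts match assumes each chr*.fa file holds exactly one sequence.

```shell
#!/bin/sh
# Count FASTA records (header lines starting with '>') in the file $1.
# count_fasta_records is a hypothetical helper, not part of SNVseeqer.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Run from inside REFDATA after building wg.fa; does nothing if wg.fa is absent.
if [ -f wg.fa ]; then
    n_files=0
    for f in chr*.fa; do
        if [ -f "$f" ]; then
            n_files=$((n_files + 1))
        fi
    done
    n_records=$(count_fasta_records wg.fa)
    echo "chromosome files: $n_files, records in wg.fa: $n_records"
fi
```

If the two numbers differ, rebuilding wg.fa with the cat command above is a reasonable first step.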

The steps above append the environment variables to ~/.bashrc, so you do not have to tell SNVseeqer where the program and data are every time you run an analysis. Run source ~/.bashrc (or open a new shell) for the new variables to take effect. For example, your ~/.bashrc should contain something like:

export SNVSEEQERDIR=path_to_your_SNPseeqer_directory
export HG18DIR=path_to_your_SNPseeqer_directory/REFDATA
export DBSNPDIR=path_to_your_SNPseeqer_directory/REFDATA
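
To confirm that the variables are visible in your current shell, a small check along the following lines may help. It is a sketch only: check_dir_var is a hypothetical helper, and the variable names are the ones written by the echo commands earlier on this page.

```shell
#!/bin/sh
# Report whether the environment variable named by $1 is set and points to an
# existing directory. check_dir_var is a hypothetical helper for illustration.
check_dir_var() {
    name=$1
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name=$value"
}

# Variable names as written by the echo commands on this page.
for var in SNVSEEQERDIR HG18DIR DBSNPDIR; do
    check_dir_var "$var" || true
done
```

A variable reported as not set usually means ~/.bashrc has not been re-read since the echo command was run.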


  • Optional: base-level conservation scores from PhyloP.

These scores are available from http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phyloP44way/placentalMammals/

mkdir CONSERVATION
cd CONSERVATION

The easiest way to download the conservation scores is to use lftp:

 lftp http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phyloP44way/placentalMammals/
 lftp> mget *
 lftp> exit

Then decompress the downloaded files:

 gzip -d *.gz

Next...

You are now ready to use SNVseeqer. Please take a look at the tutorial for more information on how to use the program.

If you need to add support for the transcriptome of a new genome build or a new species, please visit SNVseeqer/Adding a new annotation database.
