Elementolab/SNVseeqer ExomeCaptureSeqAnalysis

From Icbwiki

Revision as of 18:09, 29 January 2011

Before you start:

  • make sure that you have the latest version of SNVseeqer and BIO-C
  • align your reads with BWA
  • determine the read length (e.g. 40 bp)
  • find the BED file that contains all your captured regions. Here we assume it is called allexons.txt
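If the read length is not known in advance, one way to check it is to tally the sequence lengths found in the aligned SAM file. The following is a minimal sketch, not part of SNVseeqer: the function name `read_length_hist` is made up for illustration, and it assumes a standard tab-delimited SAM file in which column 10 holds the read sequence.

```shell
# Tally read lengths in a SAM file.
# Assumes standard SAM layout: header lines start with "@",
# and column 10 of each alignment line is the read sequence.
# read_length_hist is a hypothetical helper name, not an SNVseeqer tool.
read_length_hist() {
    awk -F'\t' '!/^@/ && $10 != "*" { n[length($10)]++ }
        END { for (len in n) print len "\t" n[len] }' "$1" | sort -n
}

# Usage, with the SAM file name used on this page:
#   read_length_hist s_1_sequence.txt.sam
```

Each output line is a read length followed by the number of reads of that length. A single line means that all reads have the same length.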

Assume that the .sam file obtained from BWA is s_1_sequence.txt.sam.

Step 1: split the reads into chrom files

mkdir s_1_sequence.txt.sam_SPLIT
$SNVSEEQERDIR/split_samfile -samfile s_1_sequence.txt.sam -outdir s_1_sequence.txt.sam_SPLIT
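When there are several lanes to process, the two commands above can be wrapped in a loop. This is only a sketch: `split_all_lanes` is a hypothetical helper name, it uses nothing beyond the `-samfile` and `-outdir` options shown above, and it assumes the same `<samfile>_SPLIT` directory naming.

```shell
# Run split_samfile on each SAM file given as an argument.
# split_all_lanes is a hypothetical wrapper; it only uses the
# -samfile and -outdir options shown on this page.
split_all_lanes() {
    for sam in "$@"; do
        if [ ! -f "$sam" ]; then
            echo "missing SAM file: $sam" >&2
            continue
        fi
        # -p avoids an error if the directory already exists
        mkdir -p "${sam}_SPLIT"
        "$SNVSEEQERDIR/split_samfile" -samfile "$sam" -outdir "${sam}_SPLIT"
    done
}

# Usage:
#   split_all_lanes s_*.sam
```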

Step 2: Basic QC (note: this is now done as part of the main script using --docov=1)

$SNVSEEQERDIR/AnalyzeReseqGenomicData -readdir s_1_sequence.txt.sam_SPLIT -intervals allexons.txt -format sam s_1_sequence.txt.sam.cov

By default, this program also reports nucleotide-level coverage, i.e. the number of positions (nt) in the captured regions at each coverage level.
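To make the nucleotide-level statistic concrete, here is a small sketch of how such a histogram can be computed. It does not read the output of AnalyzeReseqGenomicData, whose format is not described here; the input is a hypothetical table with one line per captured position (chromosome, position, depth), and `coverage_hist` is a made-up name.

```shell
# Count how many positions have each coverage depth.
# Hypothetical input: one line per captured position,
#   chrom<TAB>pos<TAB>depth
# Output: depth<TAB>number_of_positions, sorted by depth.
coverage_hist() {
    awk -F'\t' '{ n[$3]++ }
        END { for (d in n) print d "\t" n[d] }' "$1" | sort -n
}

# Usage (positions.txt is a placeholder file name):
#   coverage_hist positions.txt
```

A large count at depth 0 would indicate captured positions that received no reads at all.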

Step 3: Run the full SNVseeqer analysis on the panda cluster

$SNVSEEQERDIR/PBS_SNPseeqerPB_Genome --files="s_*.sam" --detectreadlen=1 --uniquereads=1 --memory=48g --uselocalsplit=1 \
 --captured=allexons.txt --docov=1 --doclonal=1 --submit=1

I usually start with --submit=0 to check that the generated commands look OK, then switch to --submit=1 to actually submit the jobs.
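The dry-run-then-submit habit can be scripted so that both runs are guaranteed to use identical options. The sketch below is one possible way to do it: `build_snvseeqer_cmd` is a hypothetical helper, and it only assembles the options already shown in Step 3, with the `--submit` value and the captured-regions file as parameters.

```shell
# Print the Step 3 command line for a given --submit value.
#   $1: 0 for a dry run (default), 1 to submit
#   $2: captured-regions file (default: allexons.txt, as assumed on this page)
# build_snvseeqer_cmd is a hypothetical helper; all options come from Step 3.
build_snvseeqer_cmd() {
    submit="${1:-0}"
    captured="${2:-allexons.txt}"
    printf '%s\n' "$SNVSEEQERDIR/PBS_SNPseeqerPB_Genome --files=\"s_*.sam\" --detectreadlen=1 --uniquereads=1 --memory=48g --uselocalsplit=1 --captured=$captured --docov=1 --doclonal=1 --submit=$submit"
}

# Inspect the dry-run command first:
#   build_snvseeqer_cmd 0
# Run it as a dry run, then submit for real once the output looks right:
#   eval "$(build_snvseeqer_cmd 0)"
#   eval "$(build_snvseeqer_cmd 1)"
```

Building the command in one place means the only difference between the two runs is the `--submit` value.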
