Elementolab/SNVseeqer ExomeCaptureSeqAnalysis
Before you start:
- make sure that you have the latest version of SNVseeqer and BIO-C
- align your reads with BWA
- determine read length (40bp?)
- find the file that contains all your captured regions (bed file). Assume here it is allexons.txt
Assume that the .sam file obtained from BWA is s_1_sequence.txt.sam.
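If the read length is not known in advance, one way to check it is to tabulate the length of the sequence column (field 10) of the .sam file. The following is a minimal sketch, assuming a standard tab-separated SAM file; the function name read_length_histogram is made up for illustration and is not part of SNVseeqer.

```shell
# Hypothetical helper (not part of SNVseeqer): print "length<TAB>count" for
# every read length found in a SAM file, sorted by length.
# Header lines (starting with '@') and reads without a stored sequence ('*')
# are skipped.
read_length_histogram() {
    awk -F'\t' '!/^@/ && $10 != "*" { n[length($10)]++ }
                END { for (len in n) print len "\t" n[len] }' "$1" | sort -n
}
```

Running read_length_histogram s_1_sequence.txt.sam should print a single line if all reads have the same length; several lines suggest the reads were trimmed before alignment.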
Step 1: split the reads into chrom files
mkdir s_1_sequence.txt.sam_SPLIT
$SNVSEEQERDIR/split_samfile -samfile s_1_sequence.txt.sam -outdir s_1_sequence.txt.sam_SPLIT
Step 2: Basic QC (note: this is now done as part of the main script using --docov=1)
$SNVSEEQERDIR/AnalyzeReseqGenomicData -readdir s_1_sequence.txt.sam_SPLIT -intervals allexons.txt -format sam s_1_sequence.txt.sam.cov
By default, this program also reports nucleotide-level coverage, i.e. the number of positions (nt) in the captured regions at each coverage level.
Step 3: Run the full SNVseeqer analysis on the panda cluster
$SNVSEEQERDIR/PBS_SNPseeqerPB_Genome --files="s_*.sam" --detectreadlen=1 --uniquereads=1 --memory=48g --uselocalsplit=1 \
  --captured=allexons.txt --docov=1 --doclonal=1 --submit=1
I usually start with --submit=0 to make sure that the commands look ok, then switch to --submit=1 to actually submit the script.
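Before the --submit=0 dry run, it can also help to confirm that the inputs named on the command line actually exist, since a mistyped captured-regions filename is easy to miss. A minimal sketch follows; check_inputs is a hypothetical helper written for this page, not an SNVseeqer command, and it only tests that the files are present and non-empty.

```shell
# Hypothetical helper (not part of SNVseeqer): verify inputs before submitting.
# Usage: check_inputs CAPTURED_REGIONS_FILE SAM_FILE...
# Returns 0 if everything is in place, 1 otherwise (problems go to stderr).
check_inputs() {
    captured=$1
    shift
    rc=0
    # The SNVseeqer install directory must be set and must exist.
    if [ ! -d "${SNVSEEQERDIR:-}" ]; then
        echo "SNVSEEQERDIR is unset or not a directory" >&2
        rc=1
    fi
    # The captured-regions (bed) file must exist and be non-empty.
    if [ ! -s "$captured" ]; then
        echo "captured-regions file missing or empty: $captured" >&2
        rc=1
    fi
    # At least one .sam file must be given, and each must be non-empty.
    if [ "$#" -eq 0 ]; then
        echo "no .sam files given" >&2
        rc=1
    fi
    for sam in "$@"; do
        if [ ! -s "$sam" ]; then
            echo ".sam file missing or empty: $sam" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, check_inputs allexons.txt s_*.sam returns a non-zero status and prints the offending path if the bed file or any .sam file is missing; if it succeeds, continue with the --submit=0 run.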