Elementolab/RNA-seq RPKMs

Computing RPKM values from RNA-seq data

Steps:

1. Aligning the raw sequence data (FASTQ or FASTA format) to annotated RefSeq mRNAs

  • fastq.gz -> bwa

2. Creating alignment in the SAM format

  • bwa -> sam

3. Converting RefSeq-based alignments to genome-based alignments

  • sam -> genome.sam

4. Converting SAM to BAM format

  • genome.sam -> genome.bam

5. Creating a coordinate-sorted BAM file

  • genome.bam -> genome.sorted.bam

6. Indexing the sorted BAM file

  • genome.sorted.bam -> genome.sorted.bam.bai

7. Computing RPKM (reads per kilobase per million mapped reads) expression values from a sorted BAM file

  • genome.sorted.bam (with its index genome.sorted.bam.bai) -> RNAseq_RPKMs

Full pipeline:

fastq.gz -> bwa -> sam -> genome.sam -> genome.bam -> genome.sorted.bam -> genome.sorted.bam.bai -> RNAseq_RPKMs
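
The chain of files above can be sketched as a small shell helper that derives each stage's file name from the input FASTQ name. This is only an illustration of the naming scheme used on this page: stage_files is a hypothetical helper, not part of bwa, samtools, or geneModel, and it assumes the input name ends in .fastq.gz.

```shell
#!/bin/sh
# stage_files is a hypothetical helper (not part of any tool on this page).
# It prints, one per line, the file produced at each pipeline stage,
# assuming the input file name ends in ".fastq.gz".
stage_files() {
    fastq=$1
    base=${fastq%.fastq.gz}    # strip the .fastq.gz suffix
    printf '%s\n' \
        "$fastq" \
        "$base.bwa" \
        "$base.sam" \
        "$base.genome.sam" \
        "$base.genome.bam" \
        "$base.genome.sorted.bam" \
        "$base.genome.sorted.bam.bai" \
        "${base}_RPKM.txt"
}

stage_files RnaSeq_data.fastq.gz
```

Called with RnaSeq_data.fastq.gz, it lists the same file names that the commands below read and write.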

Commands:

1. Aligning the raw sequence data (FASTQ or FASTA format) to annotated RefSeq mRNAs

bwa aln -t 4 ~/bwa/RefSeqbwaidx RnaSeq_data.fastq.gz > RnaSeq_data.bwa

2. Creating alignment in the SAM format (a generic format for storing large nucleotide sequence alignments)

bwa samse ~/bwa/RefSeqbwaidx RnaSeq_data.bwa RnaSeq_data.fastq.gz > RnaSeq_data.sam

3. Converting RefSeq-based alignments to genome-based alignments

~/geneModel align -cmd ref2g -i RnaSeq_data.sam -o RnaSeq_data.genome.sam

4. Converting SAM to BAM format (binary format)

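samtools import ~/bwa/wg.fa.fai RnaSeq_data.genome.sam RnaSeq_data.genome.bam        #legacy samtools 0.1.x syntax; newer releases use: samtools view -b -t ~/bwa/wg.fa.fai -o RnaSeq_data.genome.bam RnaSeq_data.genome.sam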

5. Creating a coordinate-sorted BAM file

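samtools sort RnaSeq_data.genome.bam RnaSeq_data.genome.sorted        #legacy samtools 0.1.x syntax: the second argument is an output prefix, so this writes RnaSeq_data.genome.sorted.bam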

6. Indexing the sorted BAM file

samtools index RnaSeq_data.genome.sorted.bam        #this creates the index file RnaSeq_data.genome.sorted.bam.bai
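
Once the index exists, the total number of mapped reads (the "per million mapped reads" part of RPKM) can be checked independently. The sketch below sums the third column of samtools idxstats output, which reports reference name, reference length, mapped reads, and unmapped reads. total_mapped is a hypothetical helper and the demonstration lines are made up; whether geneModel counts mapped reads in exactly this way is not documented here.

```shell
#!/bin/sh
# total_mapped is a hypothetical helper: it sums the "mapped reads"
# column (3rd tab-separated field) of `samtools idxstats` output.
total_mapped() {
    awk -F '\t' '{ total += $3 } END { print total + 0 }'
}

# Usage on a real file (requires samtools and the .bai index):
#   samtools idxstats RnaSeq_data.genome.sorted.bam | total_mapped

# Self-contained demonstration on made-up idxstats-style lines
# (reference name, length, mapped reads, unmapped reads):
printf 'chr1\t1000\t120\t0\nchr2\t800\t80\t0\n*\t0\t0\t15\n' | total_mapped
# prints 200
```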

7. Computing RPKM (reads per kilobase per million mapped reads) expression values from a sorted BAM file

~/geneModel calcexp -cmd bamrpkm -i RnaSeq_data.genome.sorted.bam --uniq -o RnaSeq_data_RPKM.txt

                -i    input sorted bam file
                -o    output file containing RPKM expression values
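
For reference, the standard RPKM definition is: RPKM = (reads mapped to the gene x 10^9) / (gene length in bp x total mapped reads). The sketch below works through that arithmetic for a single gene. The rpkm function and the numbers are illustrative assumptions; geneModel's exact read counting (for example, how --uniq affects the counts) is not described on this page.

```shell
#!/bin/sh
# rpkm is a hypothetical helper illustrating the standard RPKM formula.
# Usage: rpkm READS_IN_GENE GENE_LENGTH_BP TOTAL_MAPPED_READS
rpkm() {
    awk -v c="$1" -v len="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", (c * 1e9) / (len * n) }'
}

# Made-up example: 500 reads on a 2000 bp gene, 10 million mapped reads.
rpkm 500 2000 10000000
# prints 25.00
```

In this example, 500 reads over 2 kb is 250 reads per kilobase, and dividing by 10 (million mapped reads) gives 25.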