PAS

From Icbwiki

Revision as of 18:51, 13 October 2010

Usage

PAS provides poly-adenylation cleavage site (pas) analysis of RNA-seq reads that contain a putative polyA-tail. The 1-50 bp upstream of the putative polyA-tail are screened for 13 pas motifs. The pas motif finding is strand-specific. Motifs 1-13 stand for the following sequences, respectively:

Poly-adenylation cleavage motifs [1]
Motif ID   Motif    Frequency in Homo sapiens (%)
1          AAUAAA   53.18
2          AUUAAA   16.78
3          UAUAAA   4.37
4          AGUAAA   3.72
5          AAGAAA   2.99
6          AAUAUA   2.13
7          AAUACA   2.03
8          CAUAAA   1.92
9          GAUAAA   1.75
10         AAUGAA   1.56
11         UUUAAA   1.20
12         ACUAAA   0.93
13         AAUAGA   0.60
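
The motif scan described above can be illustrated with a minimal Python sketch. This is not the PAS source code: the function names, the input convention (the caller passes the sequence immediately upstream of the putative polyA-tail, already oriented on the transcript strand), and the distance convention (1 means the motif's last base sits directly next to the tail) are all assumptions made for illustration.

```python
# Motif IDs 1-13, in the order of the table above (RNA alphabet).
PAS_MOTIFS = [
    "AAUAAA", "AUUAAA", "UAUAAA", "AGUAAA", "AAGAAA", "AAUAUA", "AAUACA",
    "CAUAAA", "GAUAAA", "AAUGAA", "UUUAAA", "ACUAAA", "AAUAGA",
]

# DNA/RNA complement table; used only for minus-strand input.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def scan_upstream(upstream, window=50):
    """Return (motif_id, distance) hits in the last `window` bases of `upstream`.

    Assumption: `upstream` ends at the base directly before the putative
    polyA-tail and is already on the transcript strand. The distance is
    counted from the motif's last base to the tail, starting at 1.
    """
    seq = upstream.upper().replace("T", "U")[-window:]
    hits = []
    for motif_id, motif in enumerate(PAS_MOTIFS, start=1):
        start = seq.find(motif)
        while start != -1:
            end = start + len(motif)  # exclusive end index of the motif
            hits.append((motif_id, len(seq) - end + 1))
            start = seq.find(motif, start + 1)
    return hits


def format_hits(hits):
    """Render hits as "motif#distance;" items, or "0" when there are none."""
    return "".join(f"{motif_id}#{distance};" for motif_id, distance in hits) or "0"
```

For a minus-strand read, the genomic sequence would first be passed through `reverse_complement` so that the scan always runs 5' to 3' on the transcript strand; this is one way to make the search strand-specific.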

The PAS analysis is based on the output of PolyA TBAG.

$ ./pas paslibrary.txt sample/pol_output sample/pas_output

Output Format

The PAS output includes the motif number and the distance from the motif to the putative polyA-tail, written as "PAS_motif#distance;". "0" indicates that no pas motif was detected in the 1-50 bp upstream of the putative polyA-tail. The PAS output has the following format:

chr     putative_pola_site    strand  gene      polyA_tail_base_num        read_counts     transcript#exon_num                     PAS_motif#distance
chr6    36763083              +       CDKN1A    5,                         1               CDKN1A.kApr07#3;CDKN1A.lApr07#3;        1#14;6#32;
chr6    34610762              +       PACSIN1   5,                         1               PACSIN1.bApr07#10;PACSIN1.aApr07#10;    0
chr7    49785779              +       VWC2      5,                         1               VWC2.aApr07#2;                          0
chr9    92412004              -       DIRAS2    10,                        1               DIRAS2.aApr07#2;                        3#19;

Each line contains:

Column  Field                 Description
1       chr                   Chromosome
2       putative_pola_site    The position 1 base before the start of the putative polyA-tail
3       strand                The strand of the putative polyA-tail
4       gene                  The gene name of the reads with the putative polyA-tail
5       polyA_tail_base_num   The number of bases in the putative polyA-tail
6       read_counts           The read count at the putative polyA-tail site
7       transcript#exon_num   The transcript ID and exon number of the reads with the putative polyA-tail; multiple entries are separated by ";"
8       PAS_motif#distance    The poly-adenylation cleavage motif ID and the distance from the motif's last base to the putative polyA-tail
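
As an illustration of the column layout, the following Python sketch parses one data line of PAS output into a dictionary. It is not part of PAS; the function name and the returned keys are hypothetical, and it assumes that columns are separated by whitespace, that no field contains spaces, and that the header line has already been skipped.

```python
def parse_pas_line(line):
    """Parse one PAS output data line into a dict (illustrative sketch)."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}")
    chrom, site, strand, gene, tail, reads, transcripts, pas = fields

    # Column 7: "transcriptID#exon;" items; split on the last "#" of each item.
    parsed_transcripts = []
    for item in transcripts.split(";"):
        if item:
            transcript_id, exon = item.rsplit("#", 1)
            parsed_transcripts.append((transcript_id, int(exon)))

    # Column 8: "0" means no motif was found; otherwise "motif#distance;" items.
    motifs = []
    if pas != "0":
        for item in pas.split(";"):
            if item:
                motif_id, distance = item.split("#")
                motifs.append((int(motif_id), int(distance)))

    return {
        "chr": chrom,
        "site": int(site),
        "strand": strand,
        "gene": gene,
        # Column 5 ends with a comma in the sample, so it is read as a list.
        "tail_bases": [int(x) for x in tail.split(",") if x],
        "read_count": int(reads),
        "transcripts": parsed_transcripts,
        "motifs": motifs,
    }
```

Applied to the first sample line above, the `motifs` entry would hold motif 1 at distance 14 and motif 6 at distance 32, while a line whose last column is "0" yields an empty list.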

Reference

[1] Tian B, Hu J, Zhang H, Lutz CS. A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 2005;33(1):201-12.
