PAS
From Icbwiki
Usage
PAS provides poly-adenylation cleavage site (pas) analysis of RNA-seq reads that contain a putative polyA-tail. The 1-50 bp upstream of the putative polyA-tail are screened for 13 pas motifs. The pas motif search is strand-specific. Motifs 1-13 stand for the following sequences, respectively:
| Motif | Sequence | Frequency in Homo sapiens (%) |
|---|---|---|
| 1 | AAUAAA | 53.18 |
| 2 | AUUAAA | 16.78 |
| 3 | UAUAAA | 4.37 |
| 4 | AGUAAA | 3.72 |
| 5 | AAGAAA | 2.99 |
| 6 | AAUAUA | 2.13 |
| 7 | AAUACA | 2.03 |
| 8 | CAUAAA | 1.92 |
| 9 | GAUAAA | 1.75 |
| 10 | AAUGAA | 1.56 |
| 11 | UUUAAA | 1.20 |
| 12 | ACUAAA | 0.93 |
| 13 | AAUAGA | 0.60 |
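The screening described above can be sketched in a few lines of Python. This is only an illustration, not the PAS implementation: the function names `scan_upstream` and `format_hits` are hypothetical, the input is assumed to be the sequence immediately upstream of the putative polyA-tail, already oriented on the read's strand, and the distance convention (1 means the motif's last base is adjacent to the tail) is an assumption based on the 1-50 bp window.

```python
# The 13 motifs from the table above, in the RNA alphabet; list index + 1 = motif number.
PAS_MOTIFS = [
    "AAUAAA", "AUUAAA", "UAUAAA", "AGUAAA", "AAGAAA", "AAUAUA", "AAUACA",
    "CAUAAA", "GAUAAA", "AAUGAA", "UUUAAA", "ACUAAA", "AAUAGA",
]


def scan_upstream(upstream, window=50):
    """Return [(motif_number, distance), ...] for motifs in the last `window` bases.

    `upstream` must end at the base adjacent to the putative polyA-tail.
    DNA input is accepted: T is converted to U before matching.
    """
    seq = upstream.upper().replace("T", "U")[-window:]
    hits = []
    for motif_number, motif in enumerate(PAS_MOTIFS, start=1):
        start = seq.find(motif)
        while start != -1:
            end = start + len(motif)       # index one past the motif's last base
            distance = len(seq) - end + 1  # assumed convention: 1 = touching the tail
            hits.append((motif_number, distance))
            start = seq.find(motif, start + 1)  # also report overlapping repeats
    return hits


def format_hits(hits):
    """Render hits as 'motif#distance;' items, or '0' when nothing was found."""
    if not hits:
        return "0"
    return "".join(f"{number}#{distance};" for number, distance in hits)


print(format_hits(scan_upstream("GGGGAATAAACCCCCCCCCCCCC")))  # 1#14;
```

Hits are reported in motif order and then by position. Whether PAS itself reports every occurrence or only one per motif is not stated here, so that choice is also an assumption.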
The PAS analysis is based on the output of PolyA TBAG.
$ ./pas paslibrary.txt sample/pol_output sample/pas_output
Output Format
The output of PAS includes the motif number and the distance from the motif to the putative polyA-tail, written as "PAS_motif#distance;". A value of "0" indicates that no pas motif was detected in the 1-50 bp upstream of the putative polyA-tail. The PAS output has the following format:
chr putative_pola_site strand gene polyA_tail_base_num read_counts transcript#exon_num PAS_motif#distance
chr6 36763083 + CDKN1A 5, 1 CDKN1A.kApr07#3;CDKN1A.lApr07#3; 1#14;6#32;
chr6 34610762 + PACSIN1 5, 1 PACSIN1.bApr07#10;PACSIN1.aApr07#10; 0
chr7 49785779 + VWC2 5, 1 VWC2.aApr07#2; 0
chr9 92412004 - DIRAS2 10, 1 DIRAS2.aApr07#2; 3#19;
Each line contains:
| Column | Field | Description |
|---|---|---|
| 1 | chr | The chromosome |
| 2 | putative_pola_site | The position one base before the start of the putative polyA-tail |
| 3 | strand | The strand of the putative polyA-tail |
| 4 | gene | The gene name for the reads with a putative polyA-tail |
| 5 | polyA_tail_base_num | The number of bases in the putative polyA-tail |
| 6 | read_counts | The number of reads at the putative polyA-tail site |
| 7 | transcript#exon_num | The transcript ID and exon number for the reads with a putative polyA-tail; multiple entries are separated by ";" |
| 8 | PAS_motif#distance | The poly-adenylation cleavage motif ID and the distance from the motif's last base to the putative polyA-tail |
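For downstream processing, a record in this format can be split into its eight fields with a short parser. The following is a minimal sketch under stated assumptions: fields are assumed to be separated by whitespace, `PasRecord` and `parse_line` are hypothetical names, and column 5 is kept as a raw string because the meaning of the trailing comma in values such as "5," is not documented here.

```python
from typing import List, NamedTuple, Tuple


class PasRecord(NamedTuple):
    chrom: str
    site: int
    strand: str
    gene: str
    tail_bases: str                      # raw column 5, e.g. "5," (kept as found)
    read_count: int
    transcripts: List[Tuple[str, int]]   # (transcript ID, exon number)
    motifs: List[Tuple[int, int]]        # (motif number, distance)


def _pairs(field):
    """Split 'a#1;b#2;' into [('a', 1), ('b', 2)]; a bare '0' yields []."""
    if field == "0":
        return []
    pairs = []
    for item in field.split(";"):
        if item:  # skip the empty piece after the trailing ';'
            name, _, number = item.rpartition("#")
            pairs.append((name, int(number)))
    return pairs


def parse_line(line):
    """Parse one data line (not the header) of PAS output."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    motifs = [(int(number), distance) for number, distance in _pairs(fields[7])]
    return PasRecord(
        chrom=fields[0],
        site=int(fields[1]),
        strand=fields[2],
        gene=fields[3],
        tail_bases=fields[4],
        read_count=int(fields[5]),
        transcripts=_pairs(fields[6]),
        motifs=motifs,
    )


record = parse_line(
    "chr6 36763083 + CDKN1A 5, 1 CDKN1A.kApr07#3;CDKN1A.lApr07#3; 1#14;6#32;"
)
print(record.gene, record.motifs)  # CDKN1A [(1, 14), (6, 32)]
```

An empty `motifs` list corresponds to the "0" value, meaning no pas motif was detected for that site.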
Reference
[1] Tian B, Hu J, Zhang H, Lutz CS. A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 2005;33(1):201-12.
