MIR



Usage

MIR provides predicted miRNA targeting site analysis for 3'UTRs. It takes the RNA-seq reads with a putative polyA tail and reports whether any miRNA targeting sites change when the 3'UTR is shortened relative to its longest form. MIR analyzes the targeting sites of 711 miRNAs. The miRNA targeting site search is strand-specific.

MIR has two output formats. In format T, the miRNA targeting sites are written as a table with one column per miRNA, so the miRNA part of the table has 711 columns. This format is convenient for statistical software to read. In format R, the miRNA targeting site information is written in the style "miRNA#count;". This format is easier for a user to read directly.
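The "miRNA#count;" style of format R can be turned into a lookup table with a few lines of code. The following is a minimal sketch, not part of MIR itself; the function name `parse_mirna_counts` is hypothetical, and it assumes every entry has the form `name#count` and is terminated by ";".

```python
def parse_mirna_counts(field):
    """Parse a format R field such as 'hsa-miR-214#1;hsa-miR-922#1;' into a dict.

    Hypothetical helper: assumes each entry is 'miRNA#count' followed by ';'.
    """
    counts = {}
    for entry in field.strip().split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip the empty piece left after the trailing ';'
        # Split on the last '#' so the miRNA name is kept whole.
        name, _, count = entry.rpartition("#")
        counts[name] = int(count)
    return counts


example = "hsa-miR-214#1;hsa-miR-922#1;hsa-miR-936#1;"
print(parse_mirna_counts(example))
```

An empty field, as seen when no targeting site is reported, gives an empty dict.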

$ cd TBAG/

For format T, the command is:

$ ./mir T AceView/ human_miRbase.txt miRlist sample/pol_output mir_output

For format R, the command is:

$ ./mir R AceView/ human_miRbase.txt miRlist sample/pol_output mir_output

Output Format

The MIR output has the following format. In the 7th column, "F" stands for the full-length 3'UTR of the corresponding transcript, "1P" stands for the 3'UTR truncated at the first putative polyA tail site of that transcript, and "1D" stands for the miRNA targeting difference between "F" and "1P". If the 3'UTR of a transcript contains multiple putative sites, there may also be "2P", "2D", "3P", "3D", and so on.

Format T
chr	strand	gene_name	transcriptID	putative_polyA	3'UTR_locus	3'UTR_type	3'UTR_exon_num	polyA_tail_base_num	read_counts	hsa-let-7a	hsa-let-7a*	hsa-let-7b	...
chr6	+	CDKN1A		CDKN1A.kApr07	36763083	36761553-36763094	F	3		0			0		1	0	0	...
chr6	+	CDKN1A		CDKN1A.kApr07	36763083	36761553-36763094	1D	3		5,			1		0	0	0	...
chr6	+	CDKN1A		CDKN1A.kApr07	36763083	36761553-36763094	1P	3		5,			1		1	0	0	...
Format R
chr	strand	gene_name	transcriptID	putative_polyA	3'UTR_locus				3'UTR_type	3'UTR_exon_num	polyA_tail_base_num	read_counts	miRNA#counts;
chr19	+	SNRP70   	SNRP70.eApr07	54303543	54303510-54303681			F		3	0	0	hsa-miR-7-1*#1;hsa-miR-589*#1;hsa-miR-139-3p#1;hsa-miR-523#1;hsa-miR-7-2*#1;hsa-miR-423-5p#2;hsa-miR-362-3p#1;hsa-miR-611#1;hsa-miR-612#1;hsa-miR-590-3p#1;hsa-miR-614#1;hsa-miR-362-5p#1;hsa-miR-768-5p#1;hsa-miR-95#1;hsa-miR-296-3p#1;hsa-miR-873#1;hsa-miR-421#1;hsa-miR-922#1;hsa-miR-339-3p#1;hsa-miR-432#1;hsa-miR-936#1;hsa-miR-525-3p#1;hsa-miR-92a-1*#1;hsa-miR-524-3p#1;hsa-miR-672#1;hsa-miR-885-3p#1;hsa-miR-214#1;
chr19	+	SNRP70	 	SNRP70.eApr07	54303543	54303510-54303681			1D		3	5,	1	hsa-miR-7-1*#1;hsa-miR-589*#1;hsa-miR-139-3p#1;hsa-miR-523#1;hsa-miR-7-2*#1;hsa-miR-423-5p#2;hsa-miR-362-3p#1;hsa-miR-611#1;hsa-miR-612#1;hsa-miR-590-3p#1;hsa-miR-614#1;hsa-miR-362-5p#1;hsa-miR-768-5p#1;hsa-miR-95#1;hsa-miR-296-3p#1;hsa-miR-873#1;hsa-miR-421#1;hsa-miR-339-3p#1;hsa-miR-432#1;hsa-miR-525-3p#1;hsa-miR-92a-1*#1;hsa-miR-524-3p#1;hsa-miR-672#1;hsa-miR-885-3p#1;
chr19	+	SNRP70		SNRP70.eApr07	54303543	54303510-54303681			1P		3	5,	1	hsa-miR-214#1;hsa-miR-922#1;hsa-miR-936#1;
chr6	+	PACSIN1		PACSIN1.bApr07	34610762	34608283-34611307;34611378-34612016;	F		10-11	0	0	hsa-miR-323-3p#2;
chr6	+	PACSIN1		PACSIN1.bApr07	34610762	34608283-34611307;34611378-34612016;	1D		10-10	5,	1	
chr6	+	PACSIN1		PACSIN1.bApr07	34610762	34608283-34611307;34611378-34612016;	1P		10-10	5,	1	hsa-miR-323-3p#2;
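In the example rows above, the "1D" row lists the targeting sites that are present in "F" but absent from "1P". The sketch below reproduces that relationship by subtracting per-miRNA counts. It is an assumption inferred from the sample rows, not a description of how MIR computes the difference internally, and the function name `targeting_difference` is hypothetical.

```python
from collections import Counter


def targeting_difference(full, truncated):
    """Return the per-miRNA site counts found in `full` but not in `truncated`.

    Assumption: the "D" row equals the "F" counts minus the "P" counts,
    as suggested by the sample output. Counter subtraction drops
    miRNAs whose remaining count is zero or negative.
    """
    return dict(Counter(full) - Counter(truncated))


# PACSIN1 rows: both sites of hsa-miR-323-3p survive the truncation.
pacsin1_f = {"hsa-miR-323-3p": 2}
pacsin1_1p = {"hsa-miR-323-3p": 2}
print(targeting_difference(pacsin1_f, pacsin1_1p))

# A subset of the SNRP70 rows: hsa-miR-423-5p is lost in the truncated 3'UTR.
snrp70_f = {"hsa-miR-214": 1, "hsa-miR-922": 1, "hsa-miR-936": 1, "hsa-miR-423-5p": 2}
snrp70_1p = {"hsa-miR-214": 1, "hsa-miR-922": 1, "hsa-miR-936": 1}
print(targeting_difference(snrp70_f, snrp70_1p))
```

For the PACSIN1 rows the result is empty, matching the empty "1D" row. For the SNRP70 subset only hsa-miR-423-5p remains, which appears in the "1D" row of the sample output.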

Each line contains:

Column Field Description
1 chr Chromosome
2 strand The strand of the putative polyA tail
3 gene_name The gene name of the reads with a putative polyA tail
4 transcriptID The transcript ID of the reads with a putative polyA tail
5 putative_polyA The position of the base immediately before the start of the putative polyA tail
6 3'UTR_locus The full-length 3'UTR as "start position-stop position"; for a 3'UTR that spans multiple exons, the segments are separated by ";"
7 3'UTR_type The 3'UTR form the row describes: "F": the full-length 3'UTR of the corresponding transcript; "1P": the 3'UTR truncated at the first putative polyA tail site of that transcript; "1D": the miRNA targeting difference between "F" and "1P". Multiple putative sites in the 3'UTR of a transcript appear as "2P", "2D", "3P", "3D", and so on
8 3'UTR_exon_num The exon number covered by the 3'UTR
9 polyA_tail_base_num The number of bases in the putative polyA tail
10 read_counts The read count at the base immediately before the start of the putative polyA tail
11 miRNA targeting information Format T: 711 columns, one per miRNA, each giving the number of targeting sites within the 3'UTR for that miRNA; Format R: 1 column listing each miRNA and its number of targeting sites in the 3'UTR, with multiple miRNAs separated by ";"
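The column layout above can be used to read a format R line into named fields. This is a minimal sketch under stated assumptions: the names `FIELDS`, `parse_utr_locus` and `parse_format_r_line` are hypothetical, and the line is split on any whitespace, which matches the example lines shown on this page but may need adjusting if the real file uses empty tab-delimited fields.

```python
# Names of the first ten columns, taken from the format R header line.
FIELDS = [
    "chr", "strand", "gene_name", "transcriptID", "putative_polyA",
    "3'UTR_locus", "3'UTR_type", "3'UTR_exon_num",
    "polyA_tail_base_num", "read_counts",
]


def parse_utr_locus(locus):
    """Split '34608283-34611307;34611378-34612016;' into (start, stop) pairs."""
    segments = []
    for part in locus.split(";"):
        if part:  # ignore the empty piece after a trailing ';'
            start, stop = part.split("-")
            segments.append((int(start), int(stop)))
    return segments


def parse_format_r_line(line):
    """Parse one format R data line into a dict keyed by column name.

    Assumption: fields are separated by whitespace and none of the first
    ten fields is empty. The miRNA column may be missing entirely.
    """
    tokens = line.split()
    if len(tokens) < len(FIELDS):
        raise ValueError("expected at least 10 fields, got %d" % len(tokens))
    record = dict(zip(FIELDS, tokens[:10]))
    record["putative_polyA"] = int(record["putative_polyA"])
    record["3'UTR_locus"] = parse_utr_locus(record["3'UTR_locus"])
    record["read_counts"] = int(record["read_counts"])
    # polyA_tail_base_num is kept as text because values such as '5,' occur.
    record["miRNA"] = tokens[10] if len(tokens) > 10 else ""
    return record


line = ("chr6\t+\tPACSIN1\tPACSIN1.bApr07\t34610762\t"
        "34608283-34611307;34611378-34612016;\t1P\t10-10\t5,\t1\thsa-miR-323-3p#2;")
record = parse_format_r_line(line)
print(record["3'UTR_type"], record["3'UTR_locus"], record["miRNA"])
```

Keeping `polyA_tail_base_num` and `3'UTR_exon_num` as strings avoids guessing at the meaning of values such as "5," and "10-10" in the sample output.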
