Elementolab/ChIPseeqer Annotate
ChIPseeqerAnnotate
In this analysis you can search the detected peaks for overlaps with repeats (RepeatMasker annotation), CpG islands, or segmental duplications.
1. Go to the ChIPseeqer-1.0 directory:
$ cd ChIPseeqer-1.0/
2. Type the command:
$ ./ChIPseeqerAnnotate --targets=TF_targets.txt --prefix=TF_targets_ANN --type=RepMasker
The following options are available:
--targets=FILE   file containing genomic regions
--suffix=STR     suffix for output files
--type=STR       can be RepMasker, CpGislands or SegmentalDups
IMPORTANT: The --targets option must point to the ChIPseeqer output file.
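To annotate the same peaks with all three annotation types, the command can be repeated once per --type value. The loop below is only a sketch: it reuses the flags and file names from the example command above, and it prints the commands instead of running them, so you can check them first.

```shell
# Dry run: build one ChIPseeqerAnnotate command per annotation type.
# Flags and file names are copied from the example command above.
targets="TF_targets.txt"    # ChIPseeqer output file
prefix="TF_targets_ANN"     # output name used in the example

cmds=$(
    for type in RepMasker CpGislands SegmentalDups; do
        # "echo" only prints the command; remove it to actually run the tool.
        echo ./ChIPseeqerAnnotate --targets="$targets" --prefix="$prefix" --type="$type"
    done
)
printf '%s\n' "$cmds"
```

Once the printed commands look right, remove the echo so that the tool is actually executed.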
3. See the results. Each run produces three output files. Depending on the annotation type, their extensions are:
RepMasker:      _ALL.RM, .RM and .RM.stats
CpGislands:     _ALL.CpG, .CpG and .CpG.stats
SegmentalDups:  _ALL.DUP, .DUP and .DUP.stats
- The files that end with _ALL.* will look like this:
chrY 2867287 2867611 0
chrY 2871627 2871971 2 chrY-2871327-2871629 chrY-2871816-2872114
chrY 2944779 2944956 1 chrY-2944529-2945113
chrY 5642923 5643407 2 chrY-5639836-5643224 chrY-5643229-5643472
chrY 6905840 6906263 0
chrY 6917898 6918357 1 chrY-6918198-6918315
chrY 6945356 6945877 0
chrY 7267607 7267819 1 chrY-7267753-7267949
chrY 7381389 7381767 2 chrY-7381323-7381478 chrY-7381493-7381672
chrY 7652223 7652533 0
chrY 7659894 7660062 1 chrY-7659990-7660187
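Each row represents a detected ChIPseeqer peak; the columns are:
Chromosome Start_Position End_Position Number_of_peaks_found [peaks_found]
Here Number_of_peaks_found counts the annotated regions (repeats, CpG islands or duplications) that overlap the peak, and [peaks_found] lists those regions as chromosome-start-end. The list is empty when the count is 0.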
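The column layout described above can be checked with a few lines of awk. This is a minimal sketch that assumes the columns are separated by whitespace (tabs or spaces); the sample file holds the first four rows of the _ALL example, and the variable names are only illustrative.

```shell
# Sample input: the first four rows of the _ALL example above.
all_file=$(mktemp)
cat > "$all_file" <<'EOF'
chrY 2867287 2867611 0
chrY 2871627 2871971 2 chrY-2871327-2871629 chrY-2871816-2872114
chrY 2944779 2944956 1 chrY-2944529-2945113
chrY 5642923 5643407 2 chrY-5639836-5643224 chrY-5643229-5643472
EOF

# Column 4 should equal the number of regions listed after it (NF - 4).
mismatches=$(awk 'NF - 4 != $4 { n++ } END { print n + 0 }' "$all_file")

# One short line per peak: location and number of overlapping regions.
report=$(awk '{ printf "%s:%s-%s\t%d\n", $1, $2, $3, $4 }' "$all_file")

printf '%s\n' "$report"
echo "rows with inconsistent counts: $mismatches"
rm -f "$all_file"
```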
- The files that end with .RM, .CpG or .DUP will look like this:
chrY 2871627 2871971 2 chrY-2871327-2871629 chrY-2871816-2872114
chrY 2944779 2944956 1 chrY-2944529-2945113
chrY 5642923 5643407 2 chrY-5639836-5643224 chrY-5643229-5643472
chrY 6917898 6918357 1 chrY-6918198-6918315
chrY 7267607 7267819 1 chrY-7267753-7267949
chrY 7381389 7381767 2 chrY-7381323-7381478 chrY-7381493-7381672
chrY 7659894 7660062 1 chrY-7659990-7660187
This file is a filtered version of the previous one: only the peaks that overlap with repeats, CpG islands or duplications are shown, so column 4 is never 0.
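Given that description, an equivalent filter can be written in one awk expression. The sketch below does not claim to be how ChIPseeqerAnnotate produces the file; it only keeps the rows whose fourth column is greater than zero, using a few rows of the _ALL example as input.

```shell
# Sample input: four rows taken from the _ALL example, two of them without overlaps.
all_file=$(mktemp)
cat > "$all_file" <<'EOF'
chrY 2867287 2867611 0
chrY 2871627 2871971 2 chrY-2871327-2871629 chrY-2871816-2872114
chrY 2944779 2944956 1 chrY-2944529-2945113
chrY 6905840 6906263 0
EOF

# Keep only the peaks whose overlap count (column 4) is greater than zero.
filtered=$(awk '$4 > 0' "$all_file")

printf '%s\n' "$filtered"
rm -f "$all_file"
```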
IMPORTANT: The RepeatMasker output files also include the Repeat Name, Class and Family information for the repeats that overlap with the peaks. For example:
chrY 2871627 2871971 2 chrY-2871327-2871629:AluSx3 SINE SINE chrY-2871816-2872114:Kanga1a DNA DNA
chrY 2944779 2944956 1 chrY-2944529-2945113:L1P1 LINE LINE
chrY 5642923 5643407 2 chrY-5639836-5643224:MER83B-int LTR LTR chrY-5643229-5643472:HUERS-P1-int LTR LTR
chrY 6917898 6918357 1 chrY-6918198-6918315:AluJr SINE SINE
chrY 7267607 7267819 1 chrY-7267753-7267949:LTR36 LTR LTR
chrY 7381389 7381767 2 chrY-7381323-7381478:AluY SINE SINE chrY-7381493-7381672:LTR43 LTR LTR
chrY 7659894 7660062 1 chrY-7659990-7660187:MIRb SINE SINE
chrY 8463843 8464207 2 chrY-8463743-8464149:LTR2 LTR LTR chrY-8464149-8466429:HERVE_a-int LTR LTR
chrY 9067239 9067357 1 chrY-9065637-9067652:L1M4c LINE LINE
chrY 9082527 9082798 1 chrY-9082496-9082771:AluSz SINE SINE
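If you want to tally the repeat classes yourself, the annotated rows can be parsed with awk. This sketch assumes that all fields are separated by whitespace, so that each overlapping repeat takes up three fields after column 4 (region:name, class, family); the sample file holds the first three rows of the example above.

```shell
# Sample input: the first three rows of the RepeatMasker example above.
rm_file=$(mktemp)
cat > "$rm_file" <<'EOF'
chrY 2871627 2871971 2 chrY-2871327-2871629:AluSx3 SINE SINE chrY-2871816-2872114:Kanga1a DNA DNA
chrY 2944779 2944956 1 chrY-2944529-2945113:L1P1 LINE LINE
chrY 5642923 5643407 2 chrY-5639836-5643224:MER83B-int LTR LTR chrY-5643229-5643472:HUERS-P1-int LTR LTR
EOF

# Assumption: fields 5 onward come in triples (region:name, class, family).
# Count how often each class (second field of every triple) occurs.
class_counts=$(awk '
    { for (i = 5; i + 2 <= NF; i += 3) count[$(i + 1)]++ }
    END { for (c in count) print c, count[c] }
' "$rm_file" | LC_ALL=C sort)

printf '%s\n' "$class_counts"
rm -f "$rm_file"
```

Changing $(i + 1) to $(i + 2) would count repeat families instead of classes.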
- The files that end with .stats summarize statistical information. For the RepMasker option the .stats file will look like this:
Number of peaks: 18814
Number of peaks with Repeats: 9680
%Repeats: 0.514510470925906
Name of repeats distribution
MIRb: 967 (% 0.0668556415929204)
MIR: 622 (% 0.0430033185840708)
L2a: 548 (% 0.0378871681415929)
L2c: 534 (% 0.0369192477876106)
...
Class of repeats distribution
SINE: 4814 (% 0.332826327433628)
LINE: 4002 (% 0.276686946902655)
LTR: 1859 (% 0.128525995575221)
DNA: 1745 (% 0.12064435840708)
Simple_repeat: 1124 (% 0.0777101769911504)
Low_complexity: 679 (% 0.0469441371681416)
tRNA: 79 (% 0.00546183628318584)
Satellite: 42 (% 0.0029037610619469)
Family of repeats distribution
SINE: 4814 (% 0.332826327433628)
LINE: 4002 (% 0.276686946902655)
LTR: 1859 (% 0.128525995575221)
DNA: 1745 (% 0.12064435840708)
Simple_repeat: 1124 (% 0.0777101769911504)
Low_complexity: 679 (% 0.0469441371681416)
tRNA: 79 (% 0.00546183628318584)
Satellite: 42 (% 0.0029037610619469)
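Note that the values labelled with % are fractions between 0 and 1, not percentages: 9680 / 18814 is about 0.5145, that is, about 51.45 % of the peaks overlap a repeat. The sketch below recomputes this value from the first lines of the .stats file, assuming one "label: value" entry per line.

```shell
# Sample input: the first three entries of the .stats example above.
stats_file=$(mktemp)
cat > "$stats_file" <<'EOF'
Number of peaks: 18814
Number of peaks with Repeats: 9680
%Repeats: 0.514510470925906
EOF

# Read both counts and express the overlap as a percentage with two decimals.
pct=$(awk -F': *' '
    /^Number of peaks:/              { total = $2 }
    /^Number of peaks with Repeats:/ { hits = $2 }
    END { printf "%.2f", 100 * hits / total }
' "$stats_file")

echo "peaks overlapping repeats: $pct %"
rm -f "$stats_file"
```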
whereas for the CpGislands and the SegmentalDups options the .stats file will look like this:
Number of peaks 18814
Number of Duplicates(/CpGs) 272
%Duplicates(/CpGs) 0.0144573190177527
