Elementolab/ChIPseeqer Summary

From Icbwiki

Revision as of 20:22, 28 October 2009
Ole2001 (Talk | contribs)

Revision as of 04:53, 29 October 2009
Ole2001 (Talk | contribs)

ChIPseeqerSummary

In this analysis you can create a file that contains a gene-based annotation of the detected peak locations.

In particular, the script:

  • determines which promoter regions of the genes in RefGene, if any, overlap with the detected peaks
  • extracts the NM transcript names for each of these genes from RefGene
  • extracts the ORF name and description from RefLink
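
The overlap step in the first bullet can be sketched as below. This is a minimal illustration, not ChIPseeqer's actual code: the function names, the closed-interval convention, and the strand handling (upstream meaning higher coordinates on the minus strand) are all assumptions. The default lengths mirror the --lenu and --lend values used in the command on this page.

```python
def promoter_window(tss, strand, lenu=1000, lend=500):
    """Return (start, end) of the promoter window around a TSS.

    Assumption: on the '-' strand, upstream means higher coordinates.
    """
    if strand == "+":
        return tss - lenu, tss + lend
    return tss - lend, tss + lenu


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def peaks_in_promoter(window, peaks):
    """Return the (start, end) peaks that overlap the window."""
    start, end = window
    return [p for p in peaks if overlaps(start, end, p[0], p[1])]


# Window and peak taken from a sample row on this page (chr11).
window = (124437222, 124438721)
hits = peaks_in_promoter(window, [(124437939, 124438230), (1, 2)])
print(len(hits))
```

Under this test a peak that only partly covers the window still counts, which seems consistent with the sample rows, where some listed peaks extend past the reported coordinates.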


1. Go to the ChIPseeqer-1.0 directory:

$ cd ChIPseeqer-1.0/

2. Type the command:

$ ./ChIPseeqerSummary --targets=TF_targets.txt --lenu=1000 --lend=500 --prefix=TF_targets_SUM

The following options are available:

--targets=FILE file containing genomic regions
--lenu=INT     length upstream of TSS
--lend=INT     length downstream of TSS
--suffix=STR   suffix for output files

IMPORTANT: The file passed to --targets must be the ChIPseeqer output file.

3. See the results. The output of this process is three files, with the extensions _ALL.NM, .NM and .SUM:

  • The file that ends with _ALL.NM will look like this:
NM_201266	chr2	206254468	206255967	0
NR_027685	chr17	5262684		5264183		2	chr17-5262760-5263300	chr17-5264120-5264368
NM_001145290	chr11	124437222	124438721	1	chr11-124437939-124438230
NM_018087	chr1	54076264	54077763	1	chr1-54076648-54077169
NM_016252	chr2	32434599	32436098	0
NM_001012415	chr9	137730696	137732195	0
NM_181886	chr4	103967744	103969243	1	chr4-103968310-103968772
NM_001097595	chrX	52533130	52534629	0

Each row represents a transcript from RefGene; the columns are:

TranscriptID	Chromosome	Transcription_Start_Position	Transcription_End_Position	Number_of_peaks_found	[peaks_found]
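
A row in this layout could be read into a small record as sketched below. The names TranscriptRow, parse_nm_line and parse_peak are hypothetical. The sketch splits on any run of whitespace, which assumes that no field contains a space and tolerates the doubled tabs visible in some sample rows. It also assumes that peak identifiers have the form chromosome-start-end, as they do in the samples.

```python
from typing import List, NamedTuple, Tuple


class TranscriptRow(NamedTuple):
    transcript_id: str
    chrom: str
    start: int
    end: int
    n_peaks: int
    peaks: List[str]


def parse_nm_line(line: str) -> TranscriptRow:
    """Parse one row; fields are separated by one or more tabs or spaces."""
    f = line.split()
    return TranscriptRow(f[0], f[1], int(f[2]), int(f[3]), int(f[4]), f[5:])


def parse_peak(peak: str) -> Tuple[str, int, int]:
    """Split a peak id such as 'chr17-5262760-5263300' into its parts."""
    chrom, start, end = peak.rsplit("-", 2)
    return chrom, int(start), int(end)


row = parse_nm_line(
    "NR_027685\tchr17\t5262684\t\t5264183\t\t2\t"
    "chr17-5262760-5263300\tchr17-5264120-5264368"
)
print(row.n_peaks, [parse_peak(p) for p in row.peaks])
```

In the sample rows, end minus start plus one equals 1500, which matches --lenu=1000 plus --lend=500, so the two position columns appear to describe the promoter window rather than the whole transcript.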

  • The file that ends with .NM will look like this:
NM_001079559	chr11	62250898	62252397	1	chr11-62252383-62252996
NM_001143965	chr6	13436250	13437749	1	chr6-13436046-13436609
NM_024800	chr3	132227383	132228882	2	chr3-132227552-132227676	chr3-132228056-132228356
NM_003262	chr3	171166273	171167772	3	chr3-171166199-171166351	chr3-171166471-171166714	chr3-171166829-171167327
NM_001098536	chr12	6830545		6832044		1	chr12-6831636-6832665
NM_014712	chr16	30875115	30876614	1	chr16-30876052-30876909
NM_018465	chr9	5427361		5428860		2	chr9-5427196-5427574	chr9-5428733-5429129
NM_001135662	chr1	204010734	204012233	1	chr1-204010920-204011505

This file is a filtered version of the previous one: only the transcripts with at least one detected peak are shown, so column 5 is never 0.
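
The filtering rule can be illustrated with a few lines of Python. This is a sketch of the rule as described here, not the tool's implementation. The function name rows_with_peaks is hypothetical, and the input lines are copied from the _ALL.NM sample above.

```python
def rows_with_peaks(lines):
    """Yield the rows whose fifth column (the peak count) is above zero."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 5 and int(fields[4]) > 0:
            yield line


all_nm = [
    "NM_201266\tchr2\t206254468\t206255967\t0",
    "NM_018087\tchr1\t54076264\t54077763\t1\tchr1-54076648-54077169",
    "NM_016252\tchr2\t32434599\t32436098\t0",
]
kept = list(rows_with_peaks(all_nm))
print(kept)
```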

  • The file that ends with .SUM will look like this:
FBXO38	F-box protein 38 isoform b			chr5	147742738	147744237	1	chr5-147743301-147743418
RPL27	ribosomal protein L27				chr17	38402971	38404470	1	chr17-38403540-38403898
ARID5B	AT rich interactive domain 5B (MRF1-like)	chr10	63330448	63331947	3	chr10-63329398-63331241	chr10-63331378-63331644	 chr10-63331804-63333783
INSIG1	insulin induced gene 1 isoform 1		chr7	154719475	154720974	1	chr7-154719658-154719817
ARID1B	AT rich interactive domain 1B (SWI1-like)	chr6	157139777	157141276	1	chr6-157140112-157140358
USP5	ubiquitin specific peptidase 5 isoform 2	chr12	6830551		6832050		1	chr12-6831636-6832665
CDCA7	cell division cycle associated 7 isoform 1	chr2	173926806	173928305	1	chr2-173927314-173927479
FZD1	frizzled 1 precursor				chr7	90730718	90732217	1	chr7-90731027-90731494

Each row represents a gene from RefLink; the columns are:

GeneID	GeneDescription	Chromosome	Transcription_Start_Position	Transcription_End_Position	Number_of_peaks_found	[peaks_found]
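
Because the gene description contains spaces, a .SUM row has to be split on tabs rather than on arbitrary whitespace. The sketch below does that and drops empty fields, so the extra alignment tabs in the sample rows are ignored. It assumes that the description is never empty. The function name parse_sum_line and the dictionary keys are hypothetical.

```python
def parse_sum_line(line):
    """Parse one .SUM row into a dict, ignoring empty tab-separated fields."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t") if f.strip()]
    return {
        "gene": fields[0],
        "description": fields[1],
        "chrom": fields[2],
        "start": int(fields[3]),
        "end": int(fields[4]),
        "n_peaks": int(fields[5]),
        "peaks": fields[6:],
    }


record = parse_sum_line(
    "ARID5B\tAT rich interactive domain 5B (MRF1-like)\tchr10\t63330448\t"
    "63331947\t3\tchr10-63329398-63331241\tchr10-63331378-63331644\t"
    " chr10-63331804-63333783"
)
print(record["gene"], record["n_peaks"], record["peaks"])
```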


IMPORTANT: In the future, this part of the analysis will also provide several statistical results.
