Next generation sequencing journal club

From Icbwiki

Revision as of 15:38, 31 May 2011
Fabien Campagne

The next-gen journal club meets on the third Thursday of every month and discusses articles of interest to those analysing next-generation sequencing data. Topics of interest will include RNA-Seq, ChIP-Seq, targeted re-sequencing and other applications of next-generation sequencing to biological problems.

Future meeting times and topics will be listed here before each meeting.

  • Jan 21 4-5PM. Dr. Olivier Elemento will present Human DNA methylomes at base resolution show widespread epigenomic differences, Lister R et al. Nature, Nov 2009.
  • Feb 18 4-5PM. Steve Lianoglou will present Biased Chromatin Signatures around Polyadenylation Sites and Exons Spies N et al Mol. Cell 2009.
  • March 25 4-5PM. Chris Mason will present Understanding mechanisms underlying human gene expression variation with RNA sequencing Pickrell et al, Nature 2010 doi:10.1038/nature08872 and Sequencing technologies - the next generation, Metzker ML Nat Rev Genet. 2010 Jan;11(1):31-46.
  • April 8 4-5PM. Dr. Fabien Campagne. Discussion of estimates of transcript expression and statistical tests recently applied to RNA-Seq data to detect differential expression. We will specifically discuss Bullard et al. BMC Bioinformatics 2010.
  • May 20 4-5PM. Dr. Stuart Andrews. Discussion of Yoseph Barash et al. Deciphering the splicing code. Nature, 465:7294, May 6, 2010.
  • June 17 4-5 PM. Dr. Weigang Qiu. Discussion of A human gut microbial gene catalogue established by metagenomic sequencing. Junjie Qin et al. Nature 2010.
  • Sept 16 5-6PM. Ms. Naysha Chambwe will present Conserved role of intragenic DNA methylation in regulating alternative promoters Alika K. Maunakea et al. Nature 2010 doi:10.1038/nature09165
  • Oct 7 5-6PM. Dr. Eugenia Giannopoulou will present Mediator and cohesin connect gene expression and chromatin architecture, Kagey MH et al. Nature 2010.
  • Nov 18 5-6PM Dr. Chris Mason will present A map of human genome variation from population-scale sequencing. The 1000 Genomes Project Consortium, Nature 2010.
  • Dec 15 5-6PM. Mark Carty will present A Three-Dimensional Model of the Yeast Genome (Zhijun Duan et al. Nature. 2010 May 20; 465(7296): 363–367. doi:10.1038/nature08973) and Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome (Erez Lieberman-Aiden et al. Science 9 October 2009: Vol. 326 no. 5950 pp. 289-293 DOI: 10.1126/science.1181369).
  • Jan 13 5-6PM Dr. Doron Betel will present Statistical Design and Analysis of RNA Sequencing Data Paul L. Auer and R. W. Doerge. Genetics 2010.
  • Feb 3 5-6PM. Dr. Fabien Campagne will present Dindel: Accurate indel calls from short-read data. CA Albers, G Lunter, Daniel G MacArthur, Gilean McVean, Willem H Ouwehand, Richard Durbin. Genome Research 2010
  • Feb 24 5-6PM. Dr. Altuna Akalin will present Widespread transcription at neuronal activity-regulated enhancers, Tae-Kyung Kim et al. Nature 2010.
  • April 14 5-6PM. Dr. Juan Rodriguez-Flores will present Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants (http://www.nature.com/ng/journal/v42/n11/full/ng.680.html), Yingrui Li et al. Nat Genet 2010.
  • June 3rd 2011 5-6PM. Dr. Chris Mason and Fabien Campagne will present A framework for variation discovery and genotyping using next-generation DNA sequencing data (http://www.nature.com/ng/journal/v43/n5/full/ng.806.html), DePristo et al. Nat Genet 2011.

The journal club does not follow a strict agenda for meeting dates. The best way to learn about the next discussion is to join the mailing list. If you would like to subscribe to the mailing list, please email Fabien Campagne (fac2003@med.cornell.edu). If you attend the meetings, please volunteer to present papers you would like to discuss with the rest of the group.

Meetings will be in the ICB conference room. See directions to the ICB (http://icb.med.cornell.edu/about/getthere.xml).

ABSTRACT

A framework for variation discovery and genotyping using next-generation DNA sequencing data. DePristo et al. Nat Genet 2011.

Recent advances in sequencing technology make it possible to comprehensively catalog genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious, and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (i) initial read mapping; (ii) local realignment around indels; (iii) base quality score recalibration; (iv) SNP discovery and genotyping to find all potential variants; and (v) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We here discuss the application of these tools, instantiated in the Genome Analysis Toolkit, to deep whole-genome, whole-exome capture and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

http://www.nature.com/ng/journal/v43/n5/full/ng.806.html
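
As background for the discussion, step (iii) of the abstract, base quality score recalibration, can be illustrated with a toy example. The sketch below is not the Genome Analysis Toolkit implementation and does not use any GATK API. It only shows the general idea under simplifying assumptions: bases are grouped by their reported Phred quality alone, mismatches at sites assumed to be non-variant are counted as sequencing errors, and add-one smoothing is applied. The function names and the smoothing choice are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


def phred_to_error(q):
    """Convert a Phred quality score Q to an error probability, p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Convert an error probability p to a Phred quality score, Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def empirical_qualities(observations):
    """Estimate an empirical quality for each reported quality value.

    observations: iterable of (reported_q, is_mismatch) pairs, one per base,
    taken from sites assumed (for this sketch) to carry no true variation.
    Returns a dict mapping reported_q to the empirical Phred quality.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for reported_q, is_mismatch in observations:
        totals[reported_q] += 1
        if is_mismatch:
            errors[reported_q] += 1
    # Add-one smoothing (an assumption of this sketch) avoids log10(0)
    # when a quality bin contains no mismatches.
    return {
        q: error_to_phred((errors[q] + 1) / (totals[q] + 2))
        for q in totals
    }


# Toy data: 998 bases reported at Q30, of which 9 mismatch the reference.
# Smoothed error rate is (9 + 1) / (998 + 2) = 0.01, i.e. about Q20,
# so these bases are less reliable than their reported quality suggests.
observations = [(30, True)] * 9 + [(30, False)] * 989
recalibrated = empirical_qualities(observations)
print(round(recalibrated[30], 2))
```

Real recalibration conditions on more covariates than the reported quality (for example, position in the read and sequence context); this sketch keeps only one covariate to stay readable.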
