 | [ description ] |
Textractor is a software framework designed to facilitate the development of tools that process text to extract information. The framework focuses on applications that need to process large quantities of text, for instance large collections of full-text articles (>10,000 articles). Development of Textractor started at the Institute for Computational Biomedicine, Weill Cornell Medical College. Since we distribute the source code under the GPL, you are welcome to reuse or extend the framework in any way you like.
 | [ about the method ] |
The software architecture of the framework is unpublished. Please cite it as: Textractor, http://icb.med.cornell.edu/crt/textractor/ (L. Shi and F. Campagne, 2004).
 | [ documentation ] |
Supplementary material for the gene/gene product extraction article:
The names collected with regular expressions from the last quarter of JBC1999 for SVM training are below:
You can download the protein name lookup program here:
To use the lookup program, you will need Java 1.4+. Download the JAR file (tlookup.jar) and type java -jar tlookup.jar for usage information.
You can also download the source code, but you will need a full-fledged software development environment (JDK 1.4+, Ant 1.6+) and a suitable JDO implementation (we developed with FastObjects and have not tested other JDO implementations). By downloading this distribution, you agree to the terms of the GNU General Public License.
The latest development snapshot of the source code, archived on August 2nd, 2006, is also available for download. This version requires JDK 1.5+ and a suitable JDO implementation.
A precompiled version, archived on August 2nd, 2006, is also available for download. To use this version, you need Apache Ant 1.6.5 and access to an Oracle database.
Textractor API Documentation
Textractor is used in the following projects:
If you find this software useful, please let us know in a quick email.