How to use Entrez

Entrez lets you search several linked databases to retrieve not only DNA and protein sequences, but also 3-D structures and records of human genetic disorders. The databases available are:

  • The PubMed database: provides access to biomedical citations and abstracts in MEDLINE, PreMEDLINE, and other related databases, with links to participating online journals.
  • The Nucleotide database: contains sequence data from GenBank, EMBL, and DDBJ (the members of the tripartite international collaboration of sequence databases), as well as GSDB.
  • The Protein database: contains sequence data from the translated coding regions of DNA sequences in GenBank, EMBL, and DDBJ, as well as protein sequences submitted to PIR, SWISS-PROT, PRF, and the Protein Data Bank (PDB).
  • The Genomes database: provides views for a variety of genomes, complete chromosomes, assembled sequence maps, and integrated genetic and physical maps.
  • The Structure database or Molecular Modeling Database (MMDB): contains experimental data from crystallographic and NMR structure determinations. The data for MMDB are obtained from the Protein Data Bank (PDB).
  • The PopSet database: contains aligned sequences submitted as a set resulting from a population, a phylogenetic, or mutation study describing such events as evolution and population variation.
  • The Taxonomy database: contains the names of all organisms that are represented in the genetic databases with at least one nucleotide or protein sequence.
  • The Online Mendelian Inheritance in Man (OMIM) database: contains a catalog of human genes and genetic disorders.

To do a quick search, enter a query into the text-entry box on the main page and click "Go". The same page reloads, now listing the number of matches found for your query next to each database. For more complicated searches, use the "Limits", "Preview / Index" and "History" options, which you can reach by clicking on any of the database links.

"Limits" allows the user to restrict searches. A search can be restricted to a particular database field (e.g., author name or gene name), to everything except a particular field, or to a particular subset of the data (e.g., only genes located in the mitochondrion). Because different databases have different fields and different sets of data, the limits available also differ from database to database. The limits available for each database search can be found here.
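As a rough illustration of what a field restriction looks like when written out as query text, the sketch below builds strings in the `term[Field]` form. The helper names `restrict` and `exclude` are hypothetical, and the field names "Gene Name" and "Author" are used only as examples; the fields actually available depend on the database being searched.

```python
def restrict(term: str, field: str) -> str:
    """Limit a term to one search field, e.g. 'BRCA1[Gene Name]'."""
    return f"{term}[{field}]"


def exclude(query: str, term: str, field: str) -> str:
    """Keep matches for `query` but drop those matching `term` in `field`."""
    return f"({query}) NOT {restrict(term, field)}"


# Illustrative values only; field names are assumptions for this sketch.
gene_query = restrict("BRCA1", "Gene Name")
print(gene_query)                                # BRCA1[Gene Name]
print(exclude(gene_query, "Smith J", "Author"))  # (BRCA1[Gene Name]) NOT Smith J[Author]
```

Keeping the restriction as plain text makes it easy to paste the result into the text-entry box and adjust it by hand.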

The "Preview / Index" option allows the user to narrow down searches step by step. The "Index" checkbox gives the user an alphabetical list of terms from searchable database fields. For example, if you change the pulldown menu to display "Accession" and click on the "Index" button, a browsable list of accession numbers appears. If you type some text into the text-entry box and then click "Index", the list begins alphabetically at the text you typed. You can then select specific terms from the Index list and click on one of the Boolean buttons (AND, OR, NOT) to add them to your search. As with "Limits", the indexes available for a particular database depend on the searchable fields of that database.
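The way the Index list starts from the text you typed can be pictured as a lookup in a sorted list. The following is a minimal local sketch of that idea, not a description of how Entrez implements it; the function `browse_index` and the sample terms are hypothetical.

```python
from bisect import bisect_left


def browse_index(index_terms, start_text, size=3):
    """Return `size` index terms, starting alphabetically at `start_text`."""
    terms = sorted(t.lower() for t in index_terms)
    start = bisect_left(terms, start_text.lower())  # first term >= start_text
    return terms[start:start + size]


# Made-up sample terms standing in for one searchable field's index.
sample_terms = ["kinase", "actin", "keratin", "myosin", "kinesin", "lamin"]
print(browse_index(sample_terms, "ki"))  # ['kinase', 'kinesin', 'lamin']
```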

Once you enter a term in the text-entry box, you can click "Preview" to see how many entries fit your search criteria. If there are too many, add more terms to your query in the text-entry box and click "Preview" again. Consecutive searches are joined by whichever Boolean operator button you choose; the default is AND. Your entire query is shown in the topmost text-entry box and can also be edited there.
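The same narrow-and-count loop can be sketched programmatically. The example below assumes NCBI's E-utilities ESearch endpoint (which this tutorial does not otherwise cover) and its `rettype=count` parameter; it only builds the request URL and does not contact the server. The helper names `add_term` and `count_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities ESearch.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def add_term(query: str, term: str, op: str = "AND") -> str:
    """Append `term` to `query` with a Boolean operator (AND by default)."""
    if op not in ("AND", "OR", "NOT"):
        raise ValueError(f"unsupported operator: {op}")
    return term if not query else f"({query}) {op} {term}"


def count_url(db: str, query: str) -> str:
    """Build an ESearch URL that asks only for the number of matches."""
    params = {"db": db, "term": query, "rettype": "count"}
    return f"{ESEARCH}?{urlencode(params)}"


query = add_term("", "kinase")
query = add_term(query, "human")  # joined with AND by default
print(query)                      # (kinase) AND human
print(count_url("pubmed", query))
```

Fetching the printed URL would return the match count for the current query, which plays the role of clicking "Preview" after each added term.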

"History" provides a record of the searches performed during a search session. They can be used to review, revise or combine the results of earlier searches.

In any text-entry box, subject terms are automatically combined using the Boolean operator AND. To search for a phrase, enclose it in double quotes. If, however, the phrase is not found in the phrase list, Entrez will treat its words as separate terms joined by AND. It is also possible to use an asterisk as a wildcard, so that Entrez searches for any term beginning with your input query term (unless the next character is a space).
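The three conventions above (implicit AND, quoted phrases, and a trailing asterisk) can be combined in one query string. This is a small sketch with hypothetical helper names and illustrative search terms; it only assembles text and makes no claim about which phrases Entrez has in its phrase list.

```python
def phrase(text: str) -> str:
    """Wrap a multi-word phrase in double quotes."""
    return f'"{text}"'


def prefix(stem: str) -> str:
    """Match any term beginning with `stem` (asterisk directly after it)."""
    return f"{stem}*"


def all_of(*terms: str) -> str:
    """Spell out the AND that Entrez otherwise applies implicitly."""
    return " AND ".join(terms)


query = all_of(phrase("gene expression"), prefix("immun"), "mouse")
print(query)  # "gene expression" AND immun* AND mouse
```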

For more help with the Entrez site, see the Entrez help manual.
