Piana

From Icbwiki

Revision as of 23:17, 8 May 2008

The ICB maintains local installations of the Protein Interactions And Network Analysis (PIANA) database. The MySQL database resides on piana.med.cornell.edu and holds a locally populated version alongside the sample "limited" data distribution.

Configuration and setup

The piana installation uses python 2.5 from softlib. At present, the python modules that piana requires are available only on machines running the 64-bit version of RedHat Enterprise Linux 5. Descartes, rodin and piana are machines known to work with piana.
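The requirements above can be checked from python before any piana module is loaded. The following is only a minimal sketch: the function name environment_problems is hypothetical, and the expectation that a 64-bit RedHat Enterprise Linux 5 machine reports its architecture as x86_64 is an assumption, not something stated on this page.

```python
import platform
import socket
import sys

# Hosts this page lists as known to work with piana.
KNOWN_HOSTS = ("descartes", "rodin", "piana")


def environment_problems(version_info, machine, hostname):
    """Return a list of mismatches between this machine and the documented setup."""
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor != (2, 5):
        problems.append("python %d.%d found, 2.5 expected" % major_minor)
    # Assumption: a 64-bit RHEL 5 install reports "x86_64" here.
    if machine != "x86_64":
        problems.append("machine is %s, expected x86_64" % machine)
    short_name = hostname.split(".")[0].lower()
    if short_name not in KNOWN_HOSTS:
        problems.append("host %s is not a known-good machine" % short_name)
    return problems


if __name__ == "__main__":
    found = environment_problems(
        sys.version_info, platform.machine(), socket.gethostname()
    )
    for problem in found:
        print("warning: " + problem)
```

An empty list means the machine matches what is described here; it does not guarantee that the piana modules themselves import cleanly.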

A bash script is available that defines the environment variables and appropriate paths for using python with piana. The script is located at ~piana/bin/piana-setup.sh. When it is sourced, the following variables are added to your environment:

 PIANA_DIR - Directory of the piana source distribution
 PIANA_DBHOST - The host of the piana database instance (piana.med.cornell.edu)
 PIANA_DBNAME - The name of the piana database (piana or piana_limited)
 PIANA_DBUSER - Username to use when connecting to the piana database
 PIANA_DBPASS - Password to use when connecting to the piana database

Additionally, the value of PYTHONPATH will be set or modified to include the appropriate python packages required to interface with piana.
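Once the setup script has been sourced, a python program can read these variables rather than hard-coding connection details. The sketch below is illustrative only: the helper names piana_settings and connect_kwargs are hypothetical, and the mapping onto host, db, user and passwd assumes the MySQLdb module is used for the connection, which this page does not state.

```python
import os

# Variable names as listed on this page.
PIANA_VARS = (
    "PIANA_DIR",
    "PIANA_DBHOST",
    "PIANA_DBNAME",
    "PIANA_DBUSER",
    "PIANA_DBPASS",
)


def piana_settings(env=None):
    """Collect the PIANA_* variables, failing early if any is missing."""
    if env is None:
        env = os.environ
    missing = [name for name in PIANA_VARS if name not in env]
    if missing:
        raise RuntimeError(
            "not set (was piana-setup.sh sourced?): " + ", ".join(missing)
        )
    return dict((name, env[name]) for name in PIANA_VARS)


def connect_kwargs(settings):
    """Map the settings onto keyword arguments in the style of MySQLdb.connect.

    Assumption: MySQLdb is the driver in use; adjust the keys for another one.
    """
    return {
        "host": settings["PIANA_DBHOST"],
        "db": settings["PIANA_DBNAME"],
        "user": settings["PIANA_DBUSER"],
        "passwd": settings["PIANA_DBPASS"],
    }
```

If MySQLdb is among the packages that the script adds to PYTHONPATH, a call such as MySQLdb.connect(**connect_kwargs(piana_settings())) should then open a connection to whichever database PIANA_DBNAME names.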

Database contents

The following table describes the contents of the piana database as of April 21st, 2008. The "Section" column refers to the corresponding section of the populate piana documentation. Versions of the raw data sets are shown for both the data loaded by the ICB and the "limited" dataset provided for reference.

Section | Name | Local Version | Limited Version | Description
4.1.1 | ncbi_taxonomy | April 18 2008 | April 03 2007 | NCBI species information
4.1.2 | swissprot | April 08 2008 | April 03 2007 | Uniprot manually curated database
4.1.2 | trembl | April 08 2008 | April 03 2007 | Uniprot complete, not manually curated database
4.1.3 | genpept | April 18 2008 (release 165) | April 2007 (release 158) | NCBI genbank database
4.1.4 | nr | April 16 2008 | April 2007 | NCBI non-redundant database
4.1.5 | ncbi2pdb_pdbaa | April 16 2008 | April 09 2007 | Correspondences between pdb and gi identifiers (pdbaa)
4.1.6 | ncbi2uniprot swissprot | April 16 2008 | April 09 2007 | Correspondences between uniprot and gi identifiers (swissprot)
4.1.7 | pdbsprotec | March 12 2008 | January 15 2007 | Correspondences between pdb and uniprot identifiers (mapping.txt)
4.1.8 | gene | April 18 2008 | April 19 2007 | Correspondences between NCBI accession number and geneID identifiers (gene2accession)
4.1.9 | gene_info | April 18 2008 | April 19 2007 | Gene NCBI database (gene_info)
4.1.10 | refseq | March 13 2008 (release 28) | release 22 | NCBI RefSeq Database
4.1.11 | cog-myva=gb | September 26 2002 | September 26 2002 | Cluster of orthologous genes (myva=gb)
4.1.11 | cog | March 05 2003 | March 05 2003 | Cluster of orthologous genes (whog)
4.1.11 | kog-kyva=gb | June 06 2003 | June 06 2003 | Eukaryotic Cluster of orthologous genes (kyva=gb)
4.1.11 | kog | July 21 2003 | July 21 2003 | Eukaryotic Cluster of orthologous genes (kog)
4.1.12 | scop | Release 1.73 | Release 1.71 | Structural Classification of Proteins (SCOP)
4.1.13 | go | April 14 2008 | April 2007 | Gene Ontology (GO)
4.2.1 | dip | April 07 2008 | February 19 2007 | Database of Interacting Proteins (DIP)
4.2.2 | mips | June 01 2004 | n/a | The Mammalian Protein-Protein Interaction Database (MIPS)
4.2.3 | hprd | September 01 2007 (Release_7) | n/a | Human Protein Reference Database (HPRD)
4.2.4 | bind | April 15 2008 | n/a | Biomolecular Interaction Network Database (BOND)
4.2.5 | intact | March 29 2008 | n/a | IntAct protein-protein interaction database
4.2.6 | biogrid | 2.0.39 | n/a | The BioGRID's curated set of physical and genetic interactions
4.2.7 | mint | April 08 2008 | n/a | The Molecular INTeraction database (MINT)
4.2.8 | ori | January 16 2007 | January 16 2007 | Predicted interactions from distant structure/sequence patterns (interact.dat)