KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets produced by genome sequencing and other high-throughput experimental technologies. In particular, gene catalogs from completely sequenced genomes are linked to higher-level systemic functions of the cell, the organism and the ecosystem.
KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation.
KEGG PATHWAY is the reference database for pathway mapping in KEGG Mapper.
Major efforts have been undertaken to manually create a knowledge base for the systemic functions by capturing and organizing experimental knowledge in computable forms, namely, in the forms of molecular networks called KEGG pathway maps, BRITE functional hierarchies and KEGG modules. Regular efforts have also been made to develop and improve the cross-species annotation procedure for linking genomes to the molecular networks through the KEGG Orthology (KO) system.
As a result, KEGG is widely used as a reference knowledge base for integration and interpretation of large-scale datasets generated by genome sequencing.
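The linking idea described above (genes are assigned KO identifiers, and KO identifiers appear in pathway maps) can be sketched as a two-step lookup. This is only an illustration under stated assumptions: the gene names, KO numbers and map numbers below are made-up placeholders rather than real KEGG entries, and the function name map_genes_to_pathways is hypothetical, not part of any KEGG tool.

```python
from collections import defaultdict

# Placeholder data, NOT taken from KEGG: gene -> KO assignment.
GENE_TO_KO = {
    "org:gene1": "K90001",
    "org:gene2": "K90002",
    "org:gene3": "K90001",
    "org:gene4": "K90003",  # this KO appears in no pathway below
}

# Placeholder data, NOT taken from KEGG: KO -> pathway maps containing it.
KO_TO_PATHWAYS = {
    "K90001": ["map90001", "map90002"],
    "K90002": ["map90001"],
}


def map_genes_to_pathways(gene_to_ko, ko_to_pathways):
    """Group genes by the pathway maps that their KO assignments appear in."""
    pathway_genes = defaultdict(set)
    for gene, ko in gene_to_ko.items():
        # A KO with no pathway membership simply contributes nothing.
        for pathway in ko_to_pathways.get(ko, []):
            pathway_genes[pathway].add(gene)
    return {pathway: sorted(genes) for pathway, genes in pathway_genes.items()}


if __name__ == "__main__":
    for pathway, genes in sorted(map_genes_to_pathways(GENE_TO_KO, KO_TO_PATHWAYS).items()):
        print(pathway, genes)
```

Because the lookup goes through the KO layer rather than through organism-specific gene names, the same pathway table can in principle be reused for any annotated genome, which is the point of a cross-species annotation scheme.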
The KEGG database has been developed by Kanehisa Laboratories since 1995, and is now a prominent reference knowledge base. KEGG contains eighteen databases. They are broadly categorized into systems information, genomic information, chemical information and health information, which are distinguished by color coding of web pages.
These databases contain many data objects for computer representation of biological systems. Each database entry is called a KEGG object and is identified by a KEGG object identifier consisting of a database-dependent prefix and a five-digit number.
These databases constitute the reference knowledge base for biological interpretation of genomes and high-throughput molecular datasets through the process of KEGG mapping.
The KEGG object identifier, or simply the KEGG identifier (kid), is a unique identifier for each KEGG object and also serves as the database entry identifier. Generally it takes the form of a prefix followed by a five-digit number.
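The general identifier form described here (a prefix followed by a five-digit number) can be checked with a small parser. This is a minimal sketch, not an official KEGG utility: the names KeggId and parse_kid are hypothetical, the prefix is assumed to consist of letters only, and identifiers that do not follow the general form are simply rejected.

```python
import re
from typing import NamedTuple, Optional


class KeggId(NamedTuple):
    prefix: str  # database-dependent prefix, e.g. "map" or "C"
    number: str  # five-digit number, kept as a string to preserve leading zeros


# Assumption: the prefix is one or more ASCII letters, followed by exactly five digits.
_KID_PATTERN = re.compile(r"([A-Za-z]+)(\d{5})")


def parse_kid(kid: str) -> Optional[KeggId]:
    """Split an identifier into prefix and number, or return None if it does not fit."""
    match = _KID_PATTERN.fullmatch(kid.strip())
    if match is None:
        return None
    return KeggId(prefix=match.group(1), number=match.group(2))


if __name__ == "__main__":
    for candidate in ["map00010", "C00031", "K00844", "C0031", "12345"]:
        print(candidate, parse_kid(candidate))
```

Keeping the number as a string rather than an integer is deliberate, since converting "00010" to 10 would lose the fixed five-digit width.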
Advantages of using KEGG
Provides health information
Provides chemical information
Provides genomic information
Provides pathway maps for cellular and organismal functions
Shows modules, or functional units, of genes
Provides hierarchical classifications of biological entities
Information on Metabolism
Genetic information processing
Environmental information processing (membrane transport, signal transduction, etc.)
Cellular processes (cell growth, cell death, cell membrane functions, etc.)
Organismal systems (immune system, endocrine system, nervous system, etc.)
Human diseases
Drug development
Maintains gene catalogs for all organisms with completely sequenced genomes and for selected organisms with partial genomes
Maintains a catalog of chemical elements, compounds and other substances in living cells as the LIGAND database, which is in turn linked to pathway components
Provides new informatics technologies that combine genomic and functional information to predict biological systems and design further experiments
Information on protein-protein interactions