

BioCodeKb - Bioinformatics Knowledgebase

PDBsum is a pictorial database that provides an at-a-glance overview of the contents of each 3D structure deposited in the Protein Data Bank (PDB). It shows the molecules that make up the structure, such as protein chains, DNA, ligands and metal ions, together with schematic diagrams of the interactions between them.

The original version of the database was developed around 1995 by Roman Laskowski and collaborators at University College London. As of 2014, PDBsum is maintained by Laskowski and collaborators in the laboratory of Janet Thornton at the European Bioinformatics Institute (EBI).

Each entry in the PDBsum database includes an image of the structure, the molecular components contained in the complex, an enzyme reaction diagram where appropriate, Gene Ontology functional assignments, a 1D sequence annotated with Pfam and InterPro domain assignments, descriptions of bound molecules, graphics showing their interactions with the protein and the protein's secondary structure, schematic diagrams of protein–protein interactions, an analysis of clefts within the structure, and links to external databases. The RasMol and Jmol molecular graphics programs are used to provide a 3D view of the molecules and their interactions within PDBsum.

Since the release of the 1000 Genomes Project in October 2012, all single amino acid variants identified by the project have been mapped to the corresponding protein sequences in the Protein Data Bank. These variants are also displayed within PDBsum, cross-referenced to the relevant UniProt identifier. PDBsum contains a number of protein structures that may be of interest in structure-based drug design. One branch of PDBsum, DrugPort, focuses on these structures and is linked to the DrugBank drug target database.

Entries are accessed either by their 4-character PDB code or through one of the two search boxes provided on the PDBsum home page:

  1. text search, which searches a given string against the TITLE, HEADER, COMPND, SOURCE and AUTHOR records in the PDB.

  2. sequence search, which performs a FASTA search of a given protein sequence against the sequences of all proteins in the PDB.
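The first two access routes can be sketched in a few lines of Python. This is only an illustration of the idea, not PDBsum's own implementation: the function names, the regular expression for a PDB code (assumed here to be a digit followed by three alphanumeric characters) and the hand-written header are all assumptions made for the example. The text search simply checks the five record types listed above in PDB-format text, where the record name occupies the first six columns of each line.

```python
import re

# Record types named in the text as the targets of the text search.
SEARCH_RECORDS = ("TITLE", "HEADER", "COMPND", "SOURCE", "AUTHOR")

# Assumed pattern for a PDB code: one digit, then three alphanumerics.
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(query: str) -> bool:
    """Return True if the query looks like a 4-character PDB code."""
    return PDB_CODE.fullmatch(query.strip()) is not None


def text_search(pdb_text: str, query: str) -> list[str]:
    """Return searchable header lines that contain the query (case-insensitive)."""
    needle = query.lower()
    hits = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name sits in columns 1-6
        if record in SEARCH_RECORDS and needle in line[6:].lower():
            hits.append(line.rstrip())
    return hits


# Hypothetical, hand-written header used only for illustration.
EXAMPLE = """\
HEADER    HYDROLASE                               01-JAN-00   0XXX
TITLE     EXAMPLE SERINE PROTEASE WITH BOUND INHIBITOR
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: EXAMPLE PROTEASE;
REMARK   2 RESOLUTION.    1.80 ANGSTROMS.
"""

if __name__ == "__main__":
    print(is_pdb_code("1abc"))
    for hit in text_search(EXAMPLE, "protease"):
        print(hit)
```

A query is first tested with `is_pdb_code`; anything that does not look like a code would fall through to `text_search`. REMARK lines are ignored because they are not among the searched record types.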

We can also use any of the Browse options given on the PDBsum home page:

  • Highlights: tabulates some of the extreme structures in the database in terms of age, size, etc.

  • List of PDB codes: lists all entries by their 4-character code.

  • Het Groups: lists all Het Groups in the PDB with links to the structures that contain them.

  • Ligands: as for Het Groups, but considering each bound molecule as a separate entity.

  • Drugs: lists the drug molecules found in structures of the PDB.

  • Enzymes: lists all enzyme structures in the PDB classed by the hierarchical E.C. numbering scheme.

  • ProSite: lists all PROSITE sequence patterns and the PDB entries that contain them.

  • Pfam: lists the PDB entries containing structural information for any Pfam domain.
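To make the hierarchical E.C. classification used by the Enzymes browse option more concrete, here is a minimal Python sketch of grouping entries by the leading levels of their E.C. number. The function names and the entry labels (`entry_a` and so on) are hypothetical placeholders, not real PDB codes, and this is not a description of how PDBsum builds its own listing.

```python
from collections import defaultdict


def parse_ec(ec: str) -> tuple[str, ...]:
    """Split an E.C. number such as '3.4.21.4' into its four levels."""
    parts = tuple(ec.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    return parts


def group_by_level(entries: dict[str, str], depth: int) -> dict[str, list[str]]:
    """Group entry labels by the first `depth` levels of their E.C. number."""
    groups = defaultdict(list)
    for label, ec in entries.items():
        prefix = ".".join(parse_ec(ec)[:depth])
        groups[prefix].append(label)
    return {key: sorted(labels) for key, labels in sorted(groups.items())}


# Hypothetical entry labels; the E.C. numbers only illustrate the hierarchy.
ENTRIES = {
    "entry_a": "3.4.21.4",
    "entry_b": "3.4.21.5",
    "entry_c": "3.1.1.7",
    "entry_d": "1.1.1.1",
}

if __name__ == "__main__":
    print(group_by_level(ENTRIES, 1))  # grouped by top-level class
    print(group_by_level(ENTRIES, 3))  # grouped down to the sub-subclass
```

Increasing `depth` from 1 to 4 walks down the hierarchy from the broad enzyme class to the individual enzyme, which is the same drill-down a hierarchical browse page offers.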

Some of PDBsum's more recent features include figures from the structure's key reference, citation data, Pfam domain diagrams, topology diagrams and protein–protein interactions.


