Structural alignment is a form of sequence alignment based on comparison of shape. These alignments attempt to establish equivalences between two or more polymer structures based on their shape and three-dimensional conformation.
This process is usually applied to protein tertiary structures but can also be used for large RNA molecules.
In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions.
Structural alignment is a valuable tool for the comparison of proteins with low sequence similarity, where evolutionary relationships between proteins cannot be easily detected by standard sequence alignment techniques.
Structural alignment can therefore be used to infer evolutionary relationships between proteins that share very little sequence similarity.
Protein structural alignment provides a fundamental basis for deriving principles of functional and evolutionary relationships. It is routinely used for structural classification and functional characterization of proteins and for the construction of sequence alignment benchmarks. However, the available techniques do not fully consider the implications of protein structural diversity and typically generate a single alignment between sequences.
Structural alignments can compare two sequences or multiple sequences. Because these alignments rely on information about the three-dimensional conformations of all the query sequences, the method can only be used on sequences whose structures are known. These structures are usually determined by X-ray crystallography or NMR spectroscopy. Structural alignment can also be performed on structures produced by structure prediction methods. Indeed, evaluating such predictions often requires a structural alignment between the model and the true known structure to assess the model's quality. Structural alignments are especially useful in analyzing data from structural genomics and proteomics efforts, and they can be used as comparison points to evaluate alignments produced by purely sequence-based bioinformatics methods.
The outputs of a structural alignment are a superposition of the atomic coordinate sets and a minimal root mean square deviation (RMSD) between the structures. The RMSD of two aligned structures shows their divergence from one another. Structural alignment can be complicated by the existence of multiple protein domains within one or more of the input structures, because changes in relative orientation of the domains between two structures to be aligned can artificially inflate the RMSD.
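The RMSD calculation, and the way a displaced domain inflates it, can be illustrated with a minimal Python sketch. The function name rmsd and the four-atom toy coordinates are hypothetical and chosen only for illustration. The sketch assumes the two coordinate sets are already paired row by row and already superposed.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two (N, 3) arrays of paired atoms."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate sets must have the same shape")
    # Mean of the squared per-atom distances, then the square root.
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

# Hypothetical toy data: rows 0-1 stand for one domain, rows 2-3 for another.
structure_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                        [5.0, 0.0, 0.0], [6.0, 0.0, 0.0]])
structure_b = structure_a.copy()
structure_b[2:] += [0.0, 3.0, 0.0]  # displace the second domain by 3 units

whole = rmsd(structure_a, structure_b)                 # both domains together
first_domain = rmsd(structure_a[:2], structure_b[:2])  # first domain only
```

In this toy case the first domain is identical in both structures, so its RMSD is zero, while the RMSD over all four atoms is raised by the displaced second domain alone.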
The minimum information produced from a successful structural alignment is a set of residues that are considered equivalent between the structures. This set of equivalences is then typically used to superpose the three-dimensional coordinates for each input structure.
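Once a set of equivalent residues is available, one common way to superpose the coordinates is the Kabsch algorithm, which finds the rigid rotation that minimizes the RMSD over the paired atoms. The following Python sketch is an illustration under that assumption and does not reproduce the implementation of any particular alignment program. The function name kabsch_superpose and the test coordinates are hypothetical.

```python
import numpy as np

def kabsch_superpose(mobile, target):
    """Superpose `mobile` onto `target`; both are (N, 3) arrays whose rows
    are paired by the equivalences. Returns (fitted_mobile, rmsd)."""
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(target, dtype=float)
    if P.shape != Q.shape:
        raise ValueError("coordinate sets must have the same shape")
    p_center = P.mean(axis=0)
    q_center = Q.mean(axis=0)
    P0 = P - p_center                  # move both sets to the origin
    Q0 = Q - q_center
    H = P0.T @ Q0                      # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so the result is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    fitted = P0 @ R.T + q_center       # rotate, then move onto the target
    deviation = float(np.sqrt(np.mean(np.sum((fitted - Q) ** 2, axis=1))))
    return fitted, deviation

# Hypothetical example: the target rotated about z and shifted.
target = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]],
                  dtype=float)
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
mobile = target @ rot_z.T + np.array([5.0, -3.0, 2.0])
fitted, deviation = kabsch_superpose(mobile, target)
```

Because the mobile set here is an exact rigid copy of the target, the fitted coordinates coincide with the target up to floating-point error. Real structures leave a non-zero residual.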
Structural comparison and alignment software includes:
MAMMOTH
DaliLite
TM-align
SCALI
A common and popular structural alignment method is DALI (Distance-matrix ALIgnment), which breaks the input structures into hexapeptide fragments and calculates a distance matrix by evaluating the contact patterns between successive fragments.
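The kind of data this approach works with can be sketched in Python as follows. This is a simplified illustration, not DALI's own code. It assumes one representative atom per residue (for example the alpha carbon), and the function names distance_matrix and fragment_block, along with the straight-chain test coordinates, are hypothetical.

```python
import numpy as np

def distance_matrix(ca_coords):
    """All-against-all distances within one structure, given an (N, 3) array
    with one representative atom per residue."""
    ca = np.asarray(ca_coords, dtype=float)
    diff = ca[:, None, :] - ca[None, :, :]   # (N, N, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1))

def fragment_block(dmat, i, j, size=6):
    """Distances between the hexapeptide starting at residue i and the
    hexapeptide starting at residue j, as a size x size block."""
    n = dmat.shape[0]
    if i < 0 or j < 0 or i + size > n or j + size > n:
        raise ValueError("fragment extends beyond the structure")
    return dmat[i:i + size, j:j + size]

# Hypothetical 12-residue chain laid out on a straight line, 3.8 units apart.
chain = np.array([[3.8 * k, 0.0, 0.0] for k in range(12)])
dmat = distance_matrix(chain)
block = fragment_block(dmat, 0, 6)   # contact pattern between two hexapeptides
```

Blocks of this kind from two different structures can then be compared with each other. Because they contain only internal distances, the comparison does not depend on how either structure is oriented in space.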
The combinatorial extension (CE) method is similar to DALI in that it too breaks each structure in the query set into a series of fragments that it then attempts to reassemble into a complete alignment. A series of pairwise combinations of fragments called aligned fragment pairs, or AFPs, are used to define a similarity matrix through which an optimal path is generated to identify the final alignment.
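The idea of tracing a path through a similarity matrix can be sketched with ordinary dynamic programming. This is a generic stand-in used for illustration and is not the CE procedure itself. The function name best_path, the gap parameter, and the example scores are hypothetical. Each matrix entry is assumed to hold the similarity of one fragment from the first structure to one fragment from the second.

```python
import numpy as np

def best_path(sim, gap=0.0):
    """Highest-scoring path through a similarity matrix that moves forward in
    both structures. Returns (total_score, list of (row, col) pairs)."""
    sim = np.asarray(sim, dtype=float)
    n, m = sim.shape
    score = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        score[i, 0] = score[i - 1, 0] - gap
    for j in range(1, m + 1):
        score[0, j] = score[0, j - 1] - gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i, j] = max(
                score[i - 1, j - 1] + sim[i - 1, j - 1],  # pair the fragments
                score[i - 1, j] - gap,                    # skip one in A
                score[i, j - 1] - gap,                    # skip one in B
            )
    # Trace back from the corner to recover the paired fragments.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i, j] == score[i - 1, j - 1] + sim[i - 1, j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i, j] == score[i - 1, j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return float(score[n, m]), pairs

# Hypothetical similarity scores: 3 fragments in A against 4 fragments in B.
example = [[9, 1, 1, 1],
           [1, 1, 8, 1],
           [1, 1, 1, 7]]
total, pairs = best_path(example)
```

In the example, the path pairs fragment 0 with 0, fragment 1 with 2, and fragment 2 with 3, skipping the second fragment of B, for a total score of 24.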
The SSAP (Sequential Structure Alignment Program) method uses double dynamic programming to produce a structural alignment based on atom-to-atom vectors in structure space.