

BioCodeKb - Bioinformatics Knowledgebase

ALIGN is a web interface, freely available at UniProt, for multiple sequence alignment using Clustal Omega. It can be used to find conserved residues and regions, which can help infer evolutionary and functional relationships.

It is used for sequence analysis. It takes sequences as input, either in FASTA format or as UniProt identifiers. The results are shown in graphical form, and the output is given in Clustal format. Because it runs in a web browser, it can be used on Linux, Mac OS X and Microsoft Windows.

Select the Align tab of the toolbar to align two or more protein sequences with the Clustal Omega program. Enter either protein sequences in FASTA format or UniProt identifiers into the form field as input. Then click the Run Align button.

For example, to align two proteins, enter ‘A4_HUMAN’ and ‘A4_MOUSE’ into the sequence input box, separated by a space or each on a new line, and click ‘Run Align’.
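If the sequences are held in a script rather than in a file, FASTA text for pasting into the form field can be produced with a few lines of Python. The following is only a minimal sketch: the function name `to_fasta`, the record names and the sequences are hypothetical placeholders, not real UniProt entries.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text.

    Whitespace inside a sequence is removed, letters are upper-cased,
    and each sequence is wrapped to `width` characters per line.
    """
    blocks = []
    for name, seq in records:
        seq = "".join(seq.split()).upper()
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
        blocks.append(">" + name + "\n" + "\n".join(lines))
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical example records (made-up names and sequences).
    example = [("seqA", "MKTAYIAKQR"), ("seqB", "mkt ayi")]
    print(to_fasta(example, width=5), end="")
```

The printed text can be copied into the input box as it is.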

To limit the range within a sequence, append the range in square brackets to the identifier. For example, P00750[1-10] represents the first ten amino acids of P00750.
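The identifier and range notation can be illustrated with a short Python sketch. It splits an input string on spaces or new lines and reads an optional `[start-end]` suffix, treating the range as 1-based and inclusive, as in the P00750[1-10] example. The function names `parse_align_input` and `apply_range` are hypothetical, and the set of characters accepted in an identifier is an assumption; this is not UniProt's own parser.

```python
import re

# Identifier, optionally followed by a [start-end] range (assumed syntax).
_ENTRY = re.compile(
    r"^(?P<id>[A-Za-z0-9_]+)(?:\[(?P<start>\d+)-(?P<end>\d+)\])?$"
)


def parse_align_input(text):
    """Return a list of (identifier, start, end) tuples.

    start and end are None when no range was appended.
    """
    entries = []
    for token in text.split():  # splits on spaces and new lines
        match = _ENTRY.match(token)
        if match is None:
            raise ValueError("unrecognised entry: " + token)
        start = int(match["start"]) if match["start"] else None
        end = int(match["end"]) if match["end"] else None
        if start is not None and not (1 <= start <= end):
            raise ValueError("invalid range in: " + token)
        entries.append((match["id"], start, end))
    return entries


def apply_range(sequence, start, end):
    """Cut a sequence to a 1-based, inclusive range (no-op if start is None)."""
    if start is None:
        return sequence
    return sequence[start - 1:end]


if __name__ == "__main__":
    print(parse_align_input("P00750[1-10]\nA4_HUMAN A4_MOUSE"))
```

With a range of 1-10, `apply_range` keeps the first ten residues of whatever sequence it is given.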

Instead of entering identifiers into the form, you can collect sequences by ticking the checkboxes next to them. Once two or more sequences have been marked, the Run Align button becomes available.

After the data have been submitted, a status page is shown. This page is reloaded at regular intervals until the alignment is complete. The final result page shows a colored version of the alignment and allows you to download it in Clustal format.
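Once the alignment has been downloaded, the Clustal format is simple enough to read with a small script. The sketch below assumes the usual layout: a header line starting with "CLUSTAL", then blocks of "name sequence" lines, each block followed by an indented conservation line. The function name `parse_clustal` and the sample alignment are hypothetical and only serve as an illustration.

```python
SAMPLE = """CLUSTAL O(1.2.4) multiple sequence alignment

seq1      MKT-AY
seq2      MKTLAY
          *** **

seq1      GG
seq2      GA
          *.
"""


def parse_clustal(text):
    """Return {name: aligned sequence} from Clustal-formatted text."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("missing CLUSTAL header line")
    sequences = {}
    for line in lines[1:]:
        # Skip blank lines and the indented conservation line.
        if not line.strip() or line[0].isspace():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, chunk = parts[0], parts[1]
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


if __name__ == "__main__":
    for name, seq in parse_clustal(SAMPLE).items():
        print(name, seq)
```

For anything beyond a quick look, an established parser from a bioinformatics library is a safer choice than this hand-written one.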

An alignment will display the following symbols denoting the degree of conservation observed in each column:

  • An * (asterisk) shows positions which have a single, fully conserved residue.

  • A : (colon) shows conservation between groups of strongly similar properties - scoring > 0.5 in the Gonnet PAM 250 matrix.

  • A . (period) indicates conservation between groups of weakly similar properties - scoring <= 0.5 in the Gonnet PAM 250 matrix.

Jobs have unique identifiers, which can be used in queries (e.g. to get the intersection of two sequence similarity searches) depending on the job type. Job identifiers and the related data are kept for 7 days, and are then deleted.

To add sequences to an alignment, paste them in FASTA format into the text box just below the alignment results.

The ALIGN tool can also be used for alignment via the UniProt basket, UniProt results and UniProt ID Mapping.

The alignment of biological sequences is one of the most important and most frequently performed tasks in bioinformatics. Its overall goal is to measure and analyze the similarity between different sequences. Alignments are used for function prediction, building phylogenetic trees, developing homology models of protein structures, classifying protein families, identifying horizontally transferred genes, detecting recombined sequences, and processing and analyzing next-generation sequencing data.


