
Sequence Analysis

BioCodeKb - Bioinformatics Knowledgebase

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA, or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. It includes:

  • Sequencing

  • Sequence assembly

  • Alignment

  • Database searching


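DNA sequencing is used to determine the sequence of individual genes, full chromosomes, or entire genomes of an organism. DNA sequencing has also become the most efficient way to determine RNA and protein sequences indirectly: RNA is typically reverse-transcribed to cDNA first, and protein sequences are inferred from open reading frames.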


Sequence analysis draws on both DNA and protein sequencing. Methods for DNA sequencing include:

  • Sanger Sequencing Method

  • Pyrosequencing Method

  • Shotgun Sequencing Method


The methods for protein sequencing are:

  • Edman Degradation

  • Mass spectrometry


Sequence assembly means aligning and merging short fragments, produced by cutting a much longer sequence into small pieces, in order to reconstruct the original sequence.
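The merging step can be illustrated with a minimal sketch. It assumes error-free fragments from a single strand and uses a simple greedy strategy (repeatedly merge the pair with the longest suffix-prefix overlap). Real assemblers are far more elaborate; the function names and the `min_len` parameter are hypothetical and chosen only for this illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Greedily merge fragments by their longest overlaps (illustrative only)."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        # Find the ordered pair (i, j) with the longest suffix-prefix overlap.
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; return the remaining contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(reads))
```

With these three toy reads the sketch rebuilds one contig. With sequencing errors, repeats, or reads from both strands, a greedy merge like this can give wrong or fragmented results.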


Sequence alignment involves determining the correct locations of the insertions and deletions that have occurred in either of two lineages since they diverged from a common ancestor. An alignment can be pairwise (two sequences) or multiple (three or more sequences).
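As an illustration of pairwise alignment, the sketch below implements global alignment by dynamic programming (the Needleman-Wunsch approach) with a linear gap penalty. The scoring values (`match=1`, `mismatch=-1`, `gap=-1`) are assumptions picked for the example, not recommended parameters; production tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b (deletion relative to a)
                score[i][j - 1] + gap,      # gap in a (insertion relative to a)
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(top)
print(bottom)
```

The "-" characters in the output mark the inferred insertion or deletion positions. Several alignments can share the same optimal score, so the traceback shown here returns just one of them.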


Databases commonly searched include:

  • UniGene at NCBI

  • DNA Data Bank of Japan

  • EMBL

  • SWISS-PROT and TrEMBL


Methodologies used in sequence analysis include sequence alignment, searches against biological databases, and others. Since the development of high-throughput methods for producing gene and protein sequences, the rate at which new sequences are added to the databases has increased exponentially. Such a collection of sequences does not, by itself, increase scientists' understanding of the biology of organisms. However, comparing new sequences to those with known functions is a key way of understanding the biology of the organism from which a new sequence comes. Thus, sequence analysis can be used to assign function to genes and proteins by studying the similarities between the compared sequences. Today, many tools and techniques perform these sequence comparisons (sequence alignment) and analyze the resulting alignments to understand the underlying biology.
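The idea of comparing a new sequence against sequences of known function can be sketched with a toy similarity search. The example ranks entries of a small in-memory "database" by the Jaccard similarity of their k-mer sets. This is a deliberately crude stand-in for real search tools, which rely on alignment-based scoring and statistics; the entry names, sequences, and function names below are all made up for illustration.

```python
def kmers(seq, k=3):
    """Set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_kmer_similarity(query, database, k=3):
    """Rank database entries by Jaccard similarity of k-mer sets (toy search)."""
    q = kmers(query, k)
    hits = []
    for name, seq in database.items():
        s = kmers(seq, k)
        union = q | s
        jaccard = len(q & s) / len(union) if union else 0.0
        hits.append((name, jaccard))
    # Most similar entries first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical annotated sequences; names and sequences are invented.
known = {
    "gene_A (known function X)": "ATGGCGTACGTTAGC",
    "gene_B (known function Y)": "CCCCCCGGGGGGCCC",
}
for name, similarity in rank_by_kmer_similarity("ATGGCGTACGTT", known):
    print(f"{name}: {similarity:.2f}")
```

A high-scoring hit only suggests a candidate function for the query. Inferring homology in practice calls for a proper alignment and a significance estimate.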


The importance of sequence analysis in molecular biology includes:

  • Comparison of sequences in order to find similarity, often to infer whether they are related (homologous)

  • Identification of intrinsic features of the sequence such as active sites, post-translational modification sites, gene structures, reading frames, distributions of introns and exons, and regulatory elements

  • Identification of sequence differences and variations such as point mutations and single nucleotide polymorphisms (SNPs), in order to obtain genetic markers

  • Revealing the evolution and genetic diversity of sequences and organisms

  • Identification of molecular structure from sequence alone
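Two of the tasks listed above can be sketched in a few lines: scanning reading frames for open reading frames (ORFs) and listing point differences between two sequences. The sketch assumes DNA on the forward strand only, the standard stop codons, and (for the comparison) two sequences that are already aligned to equal length. The function names and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Find ATG...stop ORFs in the three forward reading frames.

    Returns (frame, start, end) tuples with `end` exclusive, so that
    seq[start:end] is the ORF including its stop codon. `min_codons`
    counts codons from the start codon up to, but excluding, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos + 3))
                start = None
    return orfs


def point_differences(ref, alt):
    """List (position, ref_base, alt_base) where two aligned sequences differ.

    Gap characters ('-') are skipped, so only substitutions are reported.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, a)
        for i, (r, a) in enumerate(zip(ref, alt))
        if r != a and r != "-" and a != "-"
    ]


print(find_orfs("CCATGGCTTAAGG"))
print(point_differences("ACGTACGT", "ACGAACGT"))
```

Whether a reported difference is a true SNP or a sequencing error cannot be decided from two sequences alone; that call normally needs read depth and quality information.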


Event sequence analysis methods seek patterns that recur across a number of sequences. The most common techniques are optimal matching techniques, so called because they find distances between pairs of sequences on the basis of an ‘optimal match’ of the two sequences: a minimal series of operations that transforms one sequence into the other.
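The "minimal series of operations" can be computed with the classic edit-distance recurrence. The sketch below assumes the operations are insertion, deletion, and substitution, with configurable costs; the default unit costs are an assumption for illustration, and optimal matching studies often choose other cost schemes.

```python
def optimal_matching_distance(a, b, indel=1, sub=1):
    """Minimum total cost of insertions, deletions, and substitutions
    needed to transform sequence `a` into sequence `b`.

    Works on strings or on lists of states/events.
    """
    # prev[j] = cost of transforming the first i-1 items of a into b[:j]
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * indel]
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else sub
            cur.append(min(
                prev[j - 1] + cost,  # keep or substitute
                prev[j] + indel,     # delete a[i-1]
                cur[j - 1] + indel,  # insert b[j-1]
            ))
        prev = cur
    return prev[-1]


print(optimal_matching_distance("GATTACA", "GATACA"))
print(optimal_matching_distance(["school", "work", "work"],
                                ["school", "school", "work"]))
```

Computing this distance for every pair of sequences gives a distance matrix, which can then be passed to a clustering method to look for recurring patterns.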
