
Analysis of Regulation of Gene Expression

BioCodeKb - Bioinformatics Knowledgebase

Some genes are expressed continuously, as they produce proteins involved in basic metabolic functions; some genes are expressed as part of the process of cell differentiation; and some genes are expressed as a result of cell differentiation.

Transcription factors are proteins that play a role in regulating the transcription of genes by binding to specific regulatory nucleotide sequences.

The amounts and types of mRNA molecules in a cell reflect the function of that cell. Thousands of transcripts are produced every second in every cell. Given this rate, it is not surprising that the primary control point for gene expression is usually at the very beginning of the protein production process: the initiation of transcription. Transcription makes an efficient control point because many proteins can be made from a single mRNA molecule.

Transcript processing provides an additional level of regulation for eukaryotes, and the presence of a nucleus makes this possible. In prokaryotes, translation of a transcript begins before the transcript is complete, due to the proximity of ribosomes to the new mRNA molecules. In eukaryotes, however, transcripts are modified in the nucleus before they are exported to the cytoplasm for translation.

Of course, there are many cases in which cells must respond quickly to changing environmental conditions. In these situations, the regulatory control point may come well after transcription. For example, early development in most animals relies on translational control because very little transcription occurs during the first few cell divisions after fertilization. Eggs therefore contain many maternally originated mRNA transcripts as a ready reserve for translation after fertilization.

On the degradative side of the balance, cells can rapidly adjust their protein levels through the enzymatic breakdown of RNA transcripts and existing protein molecules. Both of these actions result in decreased amounts of certain proteins. Often, this breakdown is linked to specific events in the cell. The eukaryotic cell cycle provides a good example of how protein breakdown is linked to cellular events. This cycle is divided into several phases, each of which is characterized by distinct cyclin proteins that act as key regulators for that phase. Before a cell can progress from one phase of the cell cycle to the next, it must degrade the cyclin that characterizes that particular phase of the cycle. Failure to degrade a cyclin stops the cycle from continuing.

The following experimental techniques are used to measure gene expression and are listed in roughly chronological order, starting with the older, more established technologies.

Low-to-mid-plex techniques:

  • Reporter gene

  • Northern blot

  • Western blot

  • Fluorescent in situ hybridization

  • Reverse transcription PCR

Higher-plex techniques:

  • SAGE

  • DNA microarray

  • Tiling array

  • RNA-Seq

Gene expression databases

  • Gene Expression Omnibus (GEO) at NCBI

  • Expression Atlas at the EBI

  • Mouse Gene Expression Database at the Jackson Laboratory

  • CollecTF


  • Many Microbe Microarrays Database

The Modulated Gene Regulation Analysis algorithm (MGERA) was designed to explore gene regulation pairs whose relationship is modulated by a modulator. Regulation pairs are identified by examining the coexpression of two genes using Pearson correlation. Because correlation coefficients are biased by sample size, the Fisher transformation is applied to normalize them, giving equivalent statistical power across data sets with different sample sizes. The statistical significance of the difference in the absolute correlation coefficients of a gene pair is then tested with Student's t-test on the Fisher-transformed values. For gene pairs whose coefficients differ significantly, active and inactive statuses are identified by examining the modulated gene expression pairs (MGEPs). The MGEPs are then combined to construct a modulated gene regulation network, which gives a systematic and comprehensive view of interactions under modulation.
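The core statistical step described above can be sketched in Python. This is a minimal illustration, not the MGERA implementation: it assumes the samples have already been split into two groups by modulator state ("on" and "off"), and it compares the Fisher-transformed absolute correlations with the standard normal approximation (variance 1 / (n - 3) per group) instead of the Student's t-test named in the text. The function name `compare_correlations`, the simulated expression values, and the sample sizes are all hypothetical.

```python
import math
import numpy as np


def fisher_z(r):
    """Fisher transformation of a Pearson correlation coefficient."""
    # Clip to avoid a math domain error when |r| is exactly 1.
    r = max(min(r, 0.999999), -0.999999)
    return math.atanh(r)


def compare_correlations(x_on, y_on, x_off, y_off):
    """Compare |r| of one gene pair between two modulator states.

    Assumption: a normal approximation is used here, not the exact
    test used by MGERA. Returns (r_on, r_off, z_stat, p_value).
    """
    r_on = float(np.corrcoef(x_on, y_on)[0, 1])
    r_off = float(np.corrcoef(x_off, y_off)[0, 1])
    n_on, n_off = len(x_on), len(x_off)
    # Standard error of the difference of two Fisher-transformed values.
    se = math.sqrt(1.0 / (n_on - 3) + 1.0 / (n_off - 3))
    z_stat = (fisher_z(abs(r_on)) - fisher_z(abs(r_off))) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return r_on, r_off, z_stat, p_value


# Simulated (hypothetical) expression values for two genes, 50 samples per state.
rng = np.random.default_rng(0)
gene_a_on = rng.normal(size=50)
gene_b_on = gene_a_on + 0.3 * rng.normal(size=50)  # coexpressed when modulator is on
gene_a_off = rng.normal(size=50)
gene_b_off = rng.normal(size=50)                   # unrelated when modulator is off

r_on, r_off, z_stat, p_value = compare_correlations(
    gene_a_on, gene_b_on, gene_a_off, gene_b_off
)
print(f"r_on={r_on:.3f} r_off={r_off:.3f} z={z_stat:.2f} p={p_value:.3g}")
```

A gene pair with a small p-value and a higher absolute correlation in one state would be a candidate modulated pair; in a real analysis the test would be repeated over many gene pairs, so the p-values would need correction for multiple testing.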

Many additional processes add further complexity to the regulation of gene expression. Alternative pre-mRNA splicing generates an average of four transcript variants per human gene. Alternative translation initiation and termination can create additional variants. Once a protein is made, any of ~200 distinct post-translational modifications, including phosphorylation, acetylation, ubiquitination, and SUMOylation, can be attached to it to target it for degradation or to change its localization, interactions, and functions. Consequently, the UniProt sequence database contains >68,000 human protein variants, produced from just over 20,000 genes.


