
Bioinformatics Tool Development

BioCodeKb - Bioinformatics Knowledgebase

The development of bioinformatics tools is an essential part of molecular biological research and the focus of our work. To empower experimental scientists, the MPI Bioinformatics Toolkit was developed; it integrates state-of-the-art software for protein sequence and structure analysis into an intuitively usable platform. At present the toolkit consists of 55 applications, most of which are interconnected, allowing job results to be forwarded between applications.

When starting a programming project, the complexity of the task may at first seem overwhelming, especially for an inexperienced programmer. What helps is to divide the project into smaller tasks. To do this, we wrote down required features, also called User Stories. Each User Story describes something that has an added value for users; it should be brief enough to fit on a note pad.

When using User Stories, we kept them permanently visible (e.g. on a pin board). Completed tasks were moved to a separate ‘done’ section, so that progress could be grasped at a glance.

When turning a verbal definition of program features into an implementation, it is essential to know exactly what kind of data the program will use. A practical approach is to gather explicit input files that the program can be run with. Simplified examples for which the expected output is known are easier to work with than big real-life data sets.
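As a minimal sketch of such a simplified example, assume a hypothetical feature that reads sequences in FASTA format. The input text, the record names, and the function read_fasta below are illustrative assumptions, not part of any existing tool; the point is that the input is small enough for the expected output to be worked out by hand.

```python
import io

# Hypothetical simplified input: two short records whose content
# can be checked by eye (the second sequence spans two lines).
EXAMPLE_FASTA = """>seq1
ACGT
>seq2
GGCC
AT
"""


def read_fasta(handle):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line  # join multi-line sequences
    return records


records = read_fasta(io.StringIO(EXAMPLE_FASTA))
print(records)
```

Because the input is tiny, the expected result (seq1 is ACGT, seq2 is GGCCAT) can be written down before the program exists and compared against what the program produces.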

After dividing the functionality of a program into smaller units, the same can be done for a program's architecture. In practice this means defining components (e.g. classes, modules, packages, etc.) and assigning responsibilities to them. One way to document the architecture as it develops is to write class-responsibility-collaboration (CRC) cards, which consist of a single page for each component.
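A CRC card can also be kept in machine-readable form next to the code. The following is only a sketch of one possible representation; the class CRCCard and the component names SequenceParser and AlignmentRunner are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CRCCard:
    """One card per component: its name, what it does, whom it works with."""
    component: str
    responsibilities: List[str] = field(default_factory=list)
    collaborators: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the card as plain text that fits on a single page."""
        lines = [f"Component: {self.component}", "Responsibilities:"]
        lines += [f"  - {item}" for item in self.responsibilities]
        lines.append("Collaborators:")
        lines += [f"  - {item}" for item in self.collaborators]
        return "\n".join(lines)


# Hypothetical component, used only as an example.
card = CRCCard(
    component="SequenceParser",
    responsibilities=["read FASTA files", "validate sequence characters"],
    collaborators=["AlignmentRunner"],
)
print(card.render())
```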

The unified modelling language (UML) is a sophisticated instrument used in the software industry to formalize a technical system. It is capable of illustrating complex program subsystems graphically. Of the diagram types UML offers, we have used only Use Case and Class diagrams. Use Case diagrams represent ‘What a system does’, using actors, their goals (Use Cases), and dependencies between them. Such a diagram helps when discussing a program outline among programmers or with the principal investigator.

A code repository (also termed a version control system) manages files that are being used and modified by different persons. Programmers can use it to exchange program code and other files systematically. The code repository keeps track of the most recent version of each file. It also notices whenever two persons try to change the same version of a file, and it prevents changes that would overwrite each other.
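The idea of detecting conflicting changes can be illustrated with a toy model. This is not how any real version control system is implemented; the class ToyRepository, the exception StaleVersionError, and the file name align.py are assumptions made up for this sketch. Each change must state which version it was based on, and a change based on an outdated version is rejected.

```python
class StaleVersionError(Exception):
    """Raised when a change is based on an outdated version of a file."""


class ToyRepository:
    """Toy in-memory repository that keeps every version of every file."""

    def __init__(self):
        self._history = {}  # file name -> list of contents, one per version

    def checkout(self, name):
        """Return (version number, content) of the most recent version."""
        versions = self._history.get(name, [])
        if not versions:
            return 0, ""
        return len(versions), versions[-1]

    def commit(self, name, base_version, new_content):
        """Store new_content if it was based on the most recent version."""
        current_version, _ = self.checkout(name)
        if base_version != current_version:
            raise StaleVersionError(
                f"{name}: change is based on version {base_version}, "
                f"but the most recent version is {current_version}"
            )
        self._history.setdefault(name, []).append(new_content)
        return current_version + 1


repo = ToyRepository()
repo.commit("align.py", 0, "print('v1')")

# Two persons check out the same version of the file.
first_base, _ = repo.checkout("align.py")
second_base, _ = repo.checkout("align.py")

# The first change is accepted; the second would overwrite it and is refused.
repo.commit("align.py", first_base, "print('first change')")
try:
    repo.commit("align.py", second_base, "print('second change')")
    conflict_detected = False
except StaleVersionError:
    conflict_detected = True
```

In this model the second person would have to check out the newest version and redo or merge the change before committing again.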

To produce reproducible scientific results using software, it is necessary to test, to record what has been done, and to recognize and track program bugs. A repository is crucial for controlling exactly which version of the software, which set of input files, and which parameters were used to produce a specific set of results, and at what time. When a bug is found, especially if it is several years old, it is practically impossible to rebuild a previous state of the program without a repository.
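The four pieces of information named above (software version, input files, parameters, time) can be written to a small record alongside each set of results. The sketch below is one possible way to do this with the Python standard library; the function names, the version string "1.0.0", and the parameter min_length are hypothetical. In practice the version string would typically be the revision identifier reported by the repository.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def sha256_of_file(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_run_record(software_version, input_paths, parameters):
    """Collect what is needed to trace a set of results back to its origin."""
    return {
        "software_version": software_version,
        "inputs": {path: sha256_of_file(path) for path in input_paths},
        "parameters": parameters,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }


# Demonstration with a throwaway input file (all values are made up).
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "example.fasta")
    with open(example_path, "w") as handle:
        handle.write(">seq1\nACGT\n")
    record = make_run_record("1.0.0", [example_path], {"min_length": 3})

print(json.dumps(record, indent=2))
```

Storing a checksum rather than only the file name makes it possible to notice later that an input file was changed after the results were produced.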

Commonly used code repositories are Concurrent Versions System (CVS), Subversion (SVN) and, more recently, Git and Mercurial.

To enhance the readability of source code and avoid common coding errors, a document with coding guidelines can be used. Such guidelines define a convention for naming variables, functions, classes and files, specify where and how to write comments, and exclude certain language constructs.

Tests are necessary to check whether a given implementation is correct. Insufficient testing hampers development of many programs.

There are various ways of testing code automatically, and dedicated testing utilities exist for this purpose.
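As one example of such a utility, Python ships with the unittest module. The sketch below tests a small hypothetical function, gc_content, which is an assumption introduced here for illustration; each test compares the function's result with a value that is known in advance.

```python
import unittest


def gc_content(sequence):
    """Return the fraction of G and C characters in a DNA sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


class GCContentTest(unittest.TestCase):
    def test_half_gc(self):
        # Two of the four bases are G or C.
        self.assertAlmostEqual(gc_content("ACGT"), 0.5)

    def test_lowercase_input(self):
        self.assertAlmostEqual(gc_content("ggcc"), 1.0)

    def test_empty_sequence_raises(self):
        with self.assertRaises(ValueError):
            gc_content("")


# Run the tests programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GCContentTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these can be rerun after every change, so that a modification which breaks existing behaviour is noticed immediately.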

BioinfoLytics Company

Our company BioinfoLytics is affiliated with BioCode. As a project, it covers Genomics and Proteomics and their analysis with a range of tools, Sequence Alignment & Analysis, Bioinformatics Scripting & Software Development, Phylogenetic and Phylogenomic Analysis, Functional Analysis, Biological Data Analysis & Visualization, Custom Analysis, Biological Database Analysis, Molecular Docking, Protein Structure Prediction, and Molecular Dynamics. These services are offered to BioCode users who want to develop their interests further, meet their project requirements, and obtain the results they need. We provide a platform where one can learn, have research projects analyzed, and get help and knowledge in molecular, computational, and analytical biology.

We provide a “Bioinformatics Tool Development” service to researchers and learners: we develop methodology and analysis tools for exploring large volumes of biological data, helping to store, organize, systematize, annotate, visualize, query, mine, understand, and interpret complex data.


Need to learn more about Bioinformatics Tool Development and much more?

To learn Bioinformatics, analysis, tools, biological databases, Computational Biology, and Bioinformatics Programming in Python & R through interactive video courses and tutorials, join BioCode.
