Matcher

BioCodeKb - Bioinformatics Knowledgebase

Bioinformatics tools are complex and critical software tools in the life sciences. Today they range from simple basic tools to very large, sophisticated software packages. In spite of their increasing complexity and sophistication, these tools are increasingly used in modern biology.


EMBOSS Matcher identifies local similarities in two input sequences using a rigorous algorithm based on Bill Pearson's lalign application.


Running a tool from the web form is a simple multi-step process.


Step 1 - Input Sequences

First Input Sequence

A free text (raw) list of sequences is simply a block of characters representing several DNA/RNA or Protein sequences. A sequence can be in GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only) format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Pasting data directly from word processors may yield unpredictable results, as hidden/control characters may be present.
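The advice above about hidden characters and the trailing return can be applied before pasting. The following is a minimal Python sketch, not part of Matcher itself; the function name clean_pasted_sequence is hypothetical, and the choice of which characters to drop (Unicode control and format characters, including tabs) is an assumption about what counts as "hidden".

```python
import unicodedata


def clean_pasted_sequence(text: str) -> str:
    """Drop hidden/control characters and make sure the text ends with a newline.

    Hypothetical helper for illustration only.
    """
    # Normalise Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    kept = []
    for ch in text:
        # Keep newlines; drop anything else in Unicode categories
        # Cc (control) or Cf (format, e.g. zero-width space, byte order mark).
        if ch == "\n" or unicodedata.category(ch) not in ("Cc", "Cf"):
            kept.append(ch)
    result = "".join(kept)

    # Non-breaking spaces are common in word-processor text; use plain spaces.
    result = result.replace("\u00a0", " ")

    # "Adding a return to the end of the sequence may help" (see above).
    if not result.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    pasted = "\ufeff>seq1\r\nACGT\u200bACGT"
    print(repr(clean_pasted_sequence(pasted)))
```

The cleaned text can then be pasted into the input box. This does not check that the sequence is in a complete, accepted format; it only removes characters that are invisible in an editor.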


First Sequence File Upload

A file containing valid sequences in any format (GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only)) can be used as input for the sequence similarity search. Word processor files may yield unpredictable results, as hidden/control characters may be present in the files. It is best to save files with the Unix format option to avoid hidden Windows characters.
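If an editor has no "Unix format" save option, the line endings can be converted after the fact. This is a small sketch under the assumption that the "hidden Windows characters" are carriage returns (CR); the function name to_unix_line_endings and the file names are hypothetical.

```python
from pathlib import Path


def to_unix_line_endings(src: str, dst: str) -> int:
    """Copy src to dst with Unix (LF) line endings.

    Returns the number of line endings that were changed.
    Hypothetical helper for illustration only.
    """
    data = Path(src).read_bytes()

    # Windows line endings: CRLF -> LF.
    crlf_count = data.count(b"\r\n")
    data = data.replace(b"\r\n", b"\n")

    # Any remaining lone CR (old Mac style) -> LF.
    cr_count = data.count(b"\r")
    data = data.replace(b"\r", b"\n")

    Path(dst).write_bytes(data)
    return crlf_count + cr_count


if __name__ == "__main__":
    import sys

    # Usage: python to_unix.py input.fasta output.fasta
    changed = to_unix_line_endings(sys.argv[1], sys.argv[2])
    print(f"converted {changed} line ending(s)")
```

The converted copy, rather than the original file, is the one to upload.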


Second Input Sequence

A free text (raw) list of sequences is simply a block of characters representing several DNA/RNA or Protein sequences. A sequence can be in GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only) format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Pasting data directly from word processors may yield unpredictable results, as hidden/control characters may be present.


Second Sequence File Upload

A file containing valid sequences in any format (GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only)) can be used as input for the sequence similarity search. Word processor files may yield unpredictable results, as hidden/control characters may be present in the files. It is best to save files with the Unix format option to avoid hidden Windows characters.


Step 2 - Set alignment options

Matrix

Default substitution scoring matrices.

List Matrices

Default value (Protein) is: BLOSUM62 [EBLOSUM62]

Default value (Nucleotide) is: DNAfull [EDNAFULL]


Gap Open Penalty

Pairwise alignment score for the first residue in a gap. Values 1-25

Default value (Protein) is: 14

Default value (Nucleotide) is: 16



Gap Extend Penalty

Pairwise alignment score for each additional residue in a gap. Values 1-8

Default value is: 4
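Taken together, the two penalties describe an affine gap cost: the first residue in a gap is charged the gap open penalty and each additional residue is charged the gap extend penalty. The sketch below reads those two definitions literally; whether Matcher applies exactly this convention internally is an assumption, and the function name gap_penalty is hypothetical.

```python
def gap_penalty(length: int, gap_open: int = 14, gap_extend: int = 4) -> int:
    """Total penalty for one gap of `length` residues.

    Assumes the first gapped residue costs `gap_open` and every further
    residue costs `gap_extend`, as the parameter descriptions state.
    Defaults are the protein defaults listed above.
    """
    if length < 1:
        raise ValueError("a gap must contain at least one residue")
    return gap_open + (length - 1) * gap_extend


if __name__ == "__main__":
    # Protein defaults: open 14, extend 4.
    for n in (1, 2, 3):
        print(n, gap_penalty(n))
    # Nucleotide default for gap open is 16.
    print(3, gap_penalty(3, gap_open=16))
```

Under this reading, one gap of three residues costs 14 + 2 × 4 = 22 with the protein defaults, while three separate single-residue gaps cost 3 × 14 = 42, so a high open penalty relative to the extend penalty favours fewer, longer gaps.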


Alternative matches

Show additional alignments. Values 1, 2, 3, 4, 5, 10, 20

Default value is: 1
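The options in Step 2 can also be collected programmatically. The sketch below builds an argument list for the EMBOSS matcher command-line program and validates values against the ranges listed on this page. The qualifier names (-asequence, -bsequence, -datafile, -gapopen, -gapextend, -alternatives, -outfile) are given as recalled from the EMBOSS command-line tool and should be checked against the installed version; the ranges are those of the web form, and the command-line program may accept other values. The function name and file names are hypothetical.

```python
ALLOWED_ALTERNATIVES = (1, 2, 3, 4, 5, 10, 20)  # values offered by the web form


def build_matcher_command(aseq, bseq, outfile, protein=True,
                          gap_open=None, gap_extend=4, alternatives=1):
    """Return an argument list for a matcher run (the command is not executed).

    Hypothetical helper; qualifier names are assumptions to verify locally.
    """
    # Defaults from this page: Protein 14 / EBLOSUM62, Nucleotide 16 / EDNAFULL.
    if gap_open is None:
        gap_open = 14 if protein else 16
    matrix = "EBLOSUM62" if protein else "EDNAFULL"

    if not 1 <= gap_open <= 25:
        raise ValueError("gap open penalty must be in 1-25")
    if not 1 <= gap_extend <= 8:
        raise ValueError("gap extend penalty must be in 1-8")
    if alternatives not in ALLOWED_ALTERNATIVES:
        raise ValueError(f"alternatives must be one of {ALLOWED_ALTERNATIVES}")

    return [
        "matcher",
        "-asequence", aseq,
        "-bsequence", bseq,
        "-datafile", matrix,
        "-gapopen", str(gap_open),
        "-gapextend", str(gap_extend),
        "-alternatives", str(alternatives),
        "-outfile", outfile,
    ]


if __name__ == "__main__":
    cmd = build_matcher_command("first.fasta", "second.fasta", "result.matcher",
                                protein=False, alternatives=3)
    print(" ".join(cmd))
```

With EMBOSS installed locally, the returned list could be passed to subprocess.run; building the list separately keeps the validation testable without running the program.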




© Copyright 2020 BioCode Ltd. - All rights reserved.