Bioinformatics tools are complex, critical software in the life sciences. They range from simple utilities to large, sophisticated software packages. Despite their growing complexity and sophistication, these tools are increasingly used in modern biology.
EMBOSS Matcher identifies local similarities in two input sequences using a rigorous algorithm based on Bill Pearson's lalign application.
Running the tool from the web form is a simple multi-step process.
Step 1 - Input Sequences
First Input Sequence
A free text (raw) list of sequences is simply a block of characters representing several DNA/RNA or Protein sequences. A sequence can be in GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only) format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Pasting data directly from word processors may yield unpredictable results, as hidden/control characters may be present.
First Sequence File Upload
A file containing valid sequences in any format (GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only)) can be used as input for the sequence similarity search. Word processor files may yield unpredictable results, as hidden/control characters may be present in the files. It is best to save files with the Unix format option to avoid hidden Windows characters.
Second Input Sequence
A free text (raw) list of sequences is simply a block of characters representing several DNA/RNA or Protein sequences. A sequence can be in GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only) format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Pasting data directly from word processors may yield unpredictable results, as hidden/control characters may be present.
Second Sequence File Upload
A file containing valid sequences in any format (GCG, FASTA, EMBL (Nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (Protein only)) can be used as input for the sequence similarity search. Word processor files may yield unpredictable results, as hidden/control characters may be present in the files. It is best to save files with the Unix format option to avoid hidden Windows characters.
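The advice above about hidden characters, Windows line endings and a trailing return can be applied before pasting or uploading a sequence. The following is a minimal sketch, not part of the tool itself; the helper name clean_sequence_text is hypothetical, and the set of characters it strips is an assumption about what typically causes trouble.

```python
import re

# Control characters other than tab (\x09) and newline (\x0a).
# Which characters to strip is an assumption made for this sketch.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_sequence_text(text: str) -> str:
    """Return sequence text with Unix line endings and no hidden characters.

    Hypothetical helper: it does not validate the sequence format itself.
    """
    # Drop a leading byte-order mark that some editors prepend.
    text = text.lstrip("\ufeff")
    # Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove remaining control characters.
    text = _CONTROL_CHARS.sub("", text)
    # Replace non-breaking spaces, common in word-processor output.
    text = text.replace("\u00a0", " ")
    # Ensure the text ends with exactly one return.
    return text.rstrip("\n") + "\n"


if __name__ == "__main__":
    raw = ">seq1\r\nACGT\x00\r\nTTGA"
    print(repr(clean_sequence_text(raw)))
```

The same function can be run on both the first and the second input sequence, whether pasted as text or read from a file.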
Step 2 - Set alignment options
Matrix
Default substitution scoring matrices.
List Matrices
Default value (Protein) is: BLOSUM62 [EBLOSUM62]
Default value (Nucleotide) is: DNAfull [EDNAFULL]
Gap Open Penalty
Pairwise alignment score for the first residue in a gap. Values 1-25
Default value (Protein) is: 14
Default value (Nucleotide) is: 16
Gap Extend Penalty
Pairwise alignment score for each additional residue in a gap. Values 1-8
Default value is: 4
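Read together, the two penalties describe an affine gap cost: the first residue in a gap costs the gap open penalty, and each additional residue costs the gap extend penalty. A small sketch of that arithmetic follows. The function name gap_penalty is hypothetical, and the formula is taken from the descriptions above rather than from the tool's source code, so the exact convention used internally may differ.

```python
def gap_penalty(length: int, gap_open: int = 14, gap_extend: int = 4) -> int:
    """Total penalty for a single gap of `length` residues.

    Assumes the convention described on this page: the first gapped
    residue costs `gap_open`, every further residue costs `gap_extend`.
    Defaults are the protein defaults listed above.
    """
    if length < 0:
        raise ValueError("gap length must not be negative")
    if length == 0:
        return 0
    return gap_open + (length - 1) * gap_extend


if __name__ == "__main__":
    # Protein defaults (open 14, extend 4): 14 + 2 * 4
    print(gap_penalty(3))               # 22
    # Nucleotide default open penalty of 16: 16 + 2 * 4
    print(gap_penalty(3, gap_open=16))  # 24
```

Under this reading, raising the gap open penalty discourages starting new gaps, while raising the gap extend penalty discourages long ones.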
Alternative matches
Show additional alignments. Values 1, 2, 3, 4, 5, 10, 20
Default value is: 1
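For users who run EMBOSS locally rather than through the web form, the options in Step 2 might map onto a matcher command line roughly as sketched below. This is an illustration under assumptions: it presumes a local EMBOSS installation, and the qualifier names (-asequence, -bsequence, -datafile, -gapopen, -gapextend, -alternatives, -outfile) should be checked against the installed version's documentation. The helper build_matcher_command, the file names and the range checks are hypothetical and simply mirror the values listed on this page.

```python
from typing import List, Optional

# Defaults and allowed values as listed on this page.
DEFAULTS = {
    "protein": {"matrix": "EBLOSUM62", "gapopen": 14},
    "nucleotide": {"matrix": "EDNAFULL", "gapopen": 16},
}
ALLOWED_ALTERNATIVES = (1, 2, 3, 4, 5, 10, 20)


def build_matcher_command(
    a_file: str,
    b_file: str,
    seq_type: str = "protein",
    gap_open: Optional[int] = None,
    gap_extend: int = 4,
    alternatives: int = 1,
    out_file: str = "out.matcher",
) -> List[str]:
    """Build an argument list for a local `matcher` run (hypothetical helper)."""
    if seq_type not in DEFAULTS:
        raise ValueError("seq_type must be 'protein' or 'nucleotide'")
    if gap_open is None:
        gap_open = DEFAULTS[seq_type]["gapopen"]
    if not 1 <= gap_open <= 25:
        raise ValueError("gap open penalty must be in the range 1-25")
    if not 1 <= gap_extend <= 8:
        raise ValueError("gap extend penalty must be in the range 1-8")
    if alternatives not in ALLOWED_ALTERNATIVES:
        raise ValueError("alternatives must be one of 1, 2, 3, 4, 5, 10, 20")
    return [
        "matcher",
        "-asequence", a_file,
        "-bsequence", b_file,
        "-datafile", DEFAULTS[seq_type]["matrix"],
        "-gapopen", str(gap_open),
        "-gapextend", str(gap_extend),
        "-alternatives", str(alternatives),
        "-outfile", out_file,
    ]


if __name__ == "__main__":
    # Prints the command only; pass the list to subprocess.run to execute it.
    print(" ".join(build_matcher_command("first.fasta", "second.fasta")))
```

Building the command as a list rather than a single string avoids shell quoting problems if the file names contain spaces.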