Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past.

The parsimony method chooses a tree that has the fewest evolutionary changes or shortest overall branch lengths. It is based on a principle related to a medieval philosophy called Occam’s razor. The theory was formulated by William of Occam in the thirteenth century and states that the simplest explanation is probably the correct one. This is because the simplest explanation requires the fewest assumptions and the fewest leaps of logic. In dealing with problems that may have an inﬁnite number of possible solutions, choosing the simplest model may help to “shave off” those variables that are not really necessary to explain the phenomenon. By doing this, model development may become easier, and there may be less chance of introducing inconsistencies, ambiguities, and redundancies, hence, the name Occam’s razor.

The MP approach is in principle similar to the ME approach albeit the latter is distance based instead of character based.

Parsimony tree building works by searching for all possible tree topologies and reconstructing ancestral sequences that require the minimum number of changes to evolve to the current sequences. To save computing time, only a small number of sites that have the richest phylogenetic information are used in tree determination. These sites are the so-called informative sites, which are deﬁned as sites that have at least two different kinds of characters, each occurring at least twice. Informative sites are the ones that can often be explained by a unique tree topology. Other sites are non-informative, which are constant sites or sites that have changes occurring only once. Constant sites have the same state in all taxa and are obviously useless in evaluating the various topologies. The sites that have changes occurring only once are not very useful either for constructing parsimony trees because they can be explained by multiple tree topologies. The non-informative sites are thus discarded in parsimony tree construction. Once the informative sites are identiﬁed and the non-informative sites discarded, the minimum number of substitutions at each informative site is computed for a given tree topology. The total number of changes at all informative sites are summed up for each possible tree topology. The tree that has the smallest number of changes is chosen as the best tree. The key to counting a minimum number of substitutions for a particular site is to determine the ancestral character states at internal nodes. Because these ancestral character states are not known directly, multiple possible solutions may exist. In this case, the parsimony principle applies to choose the character states that result in a minimum number of substitutions. The inference of an ancestral sequence is made by ﬁrst going from the leaves to internal nodes and to the common root to determine all possible ancestral character states and then going back from the common root to the leaves to assign ancestral sequences that require the minimum number of substitutions.