Bayesian inference can be used to produce phylogenetic trees in a manner closely related to the maximum likelihood methods. Bayesian methods assume a prior probability distribution of the possible trees, which may simply be the probability of any one tree among all the possible trees that could be generated from the data, or may be a more sophisticated estimate derived from the assumption that divergence events such as speciation occur as stochastic processes. The choice of prior distribution is a point of contention among users of Bayesian-inference phylogenetics methods.
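As a rough illustration of the simplest prior mentioned above, in which every tree is equally probable, the sketch below counts the labelled, unrooted, strictly bifurcating topologies for a given number of taxa using the standard double-factorial formula (2n − 5)!! and assigns each the same prior. The function names are hypothetical, and the sketch assumes that only topologies (not branch lengths) are being counted.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Count labelled, unrooted, strictly bifurcating topologies: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def uniform_topology_prior(n_taxa: int) -> float:
    """Prior probability of any single topology under a flat prior (assumption)."""
    return 1.0 / num_unrooted_trees(n_taxa)


print(num_unrooted_trees(4))        # 3
print(num_unrooted_trees(10))       # 2027025
print(uniform_topology_prior(5))    # 1/15
```

Even for ten taxa the flat prior spreads its mass over roughly two million topologies, which is one reason exhaustive evaluation is impractical and sampling methods are used instead.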
Implementations of Bayesian methods generally use Markov chain Monte Carlo sampling algorithms, although the choice of move set varies; selections used in Bayesian phylogenetics include circularly permuting leaf nodes of a proposed tree at each step and swapping descendant subtrees of a random internal node between two related trees. The use of Bayesian methods in phylogenetics has been controversial, largely due to incomplete specification of the choice of move set, acceptance criterion, and prior distribution in published work. Bayesian methods are generally held to be superior to parsimony-based methods; they can be more prone to long-branch attraction than maximum likelihood techniques, although they are better able to accommodate missing data.
Tools that use Bayesian inference to infer phylogenetic trees from variant allele frequency (VAF) data include Canopy, EXACT, and PhyloWGS.
Bayesian analysis calculates the posterior probability of two joint events from the prior probability and the conditional probability, using the following simplified formula:
Posterior probability = (Prior probability × Conditional likelihood) / Total probability
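A small numerical sketch may make the formula concrete. It assumes three candidate topologies with equal priors; the likelihood values are made up purely for illustration and do not come from any real alignment. The total probability is the sum of prior × likelihood over all candidates.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: posterior_i = prior_i * likelihood_i / total."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # total probability of the data over all candidates
    return [j / total for j in joint]


# Three hypothetical topologies, equally probable before analysis.
priors = [1 / 3, 1 / 3, 1 / 3]
# Illustrative (made-up) likelihoods of the alignment under each topology.
likelihoods = [0.008, 0.002, 0.010]

post = posterior(priors, likelihoods)
print([round(p, 3) for p in post])  # [0.4, 0.1, 0.5]
```

With equal priors the posterior is simply the likelihoods rescaled to sum to one; an unequal prior would shift the result toward the topologies favoured beforehand.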
In Bayesian tree selection, the prior probability is the probability of each possible topology before analysis. The probability of each of these topologies is equal before tree building. The conditional probability is the substitution frequency of characters observed from the sequence alignment. These two pieces of information are used as a condition by the Bayesian algorithm to search for the most probable trees that best satisfy the observations. The tree search incorporates an iterative random sampling strategy based on the Markov chain Monte Carlo (MCMC) procedure. MCMC is designed as a “hill-climbing” procedure, seeking higher and higher likelihood scores while searching for tree topologies, although occasionally it goes downhill because of the random nature of the search. Over time, high-scoring trees are sampled more often than low-scoring trees. When MCMC reaches high-scoring regions, a set of near-optimal trees is selected to construct a consensus tree. In the end, the Bayesian method can achieve the same or even better performance than the complete ML method, but is much faster than regular ML and is able to handle very large datasets.
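The sampling behaviour described above can be sketched with a toy Metropolis sampler. This is a minimal illustration under stated assumptions, not the algorithm of any particular phylogenetics package: the state space is just the three unrooted topologies for four taxa, the log-likelihoods are made-up numbers, the prior is flat, and the proposal picks one of the other topologies uniformly at random (a symmetric move). Uphill proposals are always accepted and downhill ones are accepted with probability equal to the likelihood ratio, so better trees are visited more often over time.

```python
import math
import random


def metropolis_topologies(log_lik, n_steps, seed=0):
    """Toy Metropolis sampler over a small, fixed set of topologies.

    log_lik maps each topology (a Newick-like label) to a log-likelihood.
    Returns the fraction of steps spent at each topology.
    """
    rng = random.Random(seed)
    states = list(log_lik)
    current = rng.choice(states)
    counts = {s: 0 for s in states}
    for _ in range(n_steps):
        # Symmetric proposal: jump to one of the other topologies uniformly.
        proposal = rng.choice([s for s in states if s != current])
        log_ratio = log_lik[proposal] - log_lik[current]
        # Always accept uphill moves; accept downhill moves with
        # probability exp(log_ratio), which is the likelihood ratio.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        counts[current] += 1
    return {s: c / n_steps for s, c in counts.items()}


# Made-up likelihoods for the three unrooted four-taxon topologies.
log_lik = {
    "((A,B),(C,D))": math.log(0.010),
    "((A,C),(B,D))": math.log(0.008),
    "((A,D),(B,C))": math.log(0.002),
}

freqs = metropolis_topologies(log_lik, n_steps=50_000)
for topology, freq in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(f"{topology}: sampled {freq:.3f} of the time")
```

With a flat prior the sampling frequencies should approach the normalised likelihoods (about 0.5, 0.4 and 0.1 here), and those frequencies are what a consensus summary of the sampled trees would draw on. Real implementations work on a vastly larger tree space, also sample branch lengths and model parameters, and usually discard an initial burn-in period.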
However, the Bayesian method produces hundreds or thousands of optimal or near-optimal trees that have an 88% to 90% probability of representing reality. Thus, the Bayesian approach has a better overall chance of guessing the true tree correctly.