Open Access Research

Taxon ordering in phylogenetic trees by means of evolutionary algorithms

Francesco Cerutti1,2, Luigi Bertolotti1,2, Tony L Goldberg3 and Mario Giacobini1,2*

Author Affiliations

1 Department of Animal Production, Epidemiology and Ecology, Faculty of Veterinary Medicine, University of Torino, Via Leonardo da Vinci 44, 10095, Grugliasco (TO), Italy

2 Molecular Biotechnology Center, University of Torino, Via Nizza 52, 10126, Torino, Italy

3 Department of Pathobiological Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, 1656 Linden Drive, Madison, Wisconsin, 53706, USA

BioData Mining 2011, 4:20 doi:10.1186/1756-0381-4-20

Published: 1 July 2011

Abstract

Background

In a typical "left-to-right" phylogenetic tree, the vertical order of taxa is meaningless, as only the branch path between them reflects their degree of similarity. To make unresolved trees more informative, we propose an innovative Evolutionary Algorithm (EA) method that searches for the best graphical representation of unresolved trees, so that the vertical order of taxa carries biological meaning.

Methods

Starting from a West Nile virus phylogenetic tree, we evolved it with a (1 + 1)-EA by randomly rotating the internal nodes and keeping the tree with the better fitness at every generation. The fitness is a sum of genetic distances between each considered taxon and the next r taxa, where r is the radius. After setting the radius to its best-performing value, we evolved the trees with (λ + μ)-EAs to study the influence of population size on the algorithm.
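The procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the tree is modelled as nested lists with integer leaf labels that index a distance matrix, a "rotation" is modelled as a random reordering of the children of one internal node, the neighbourhood is truncated at the bottom of the tree rather than wrapped around, and lower fitness is treated as better. All function names (leaves, fitness, mutate, one_plus_one_ea) and the toy distance matrix are hypothetical.

```python
import copy
import random


def leaves(tree):
    """Return leaf labels in top-to-bottom drawing order."""
    if not isinstance(tree, list):
        return [tree]
    order = []
    for child in tree:
        order.extend(leaves(child))
    return order


def fitness(tree, dist, radius):
    """Sum the distances from each taxon to the `radius` taxa drawn after it.

    Assumption: the window is truncated at the end of the ordering.
    """
    order = leaves(tree)
    total = 0.0
    for i, taxon in enumerate(order):
        for other in order[i + 1 : i + 1 + radius]:
            total += dist[taxon][other]
    return total


def internal_nodes(tree):
    """Collect every internal node (a list) of the nested-list tree."""
    if not isinstance(tree, list):
        return []
    nodes = [tree]
    for child in tree:
        nodes.extend(internal_nodes(child))
    return nodes


def mutate(tree, rng):
    """Copy the tree and reorder the children of one random internal node."""
    clone = copy.deepcopy(tree)
    node = rng.choice(internal_nodes(clone))
    rng.shuffle(node)  # assumed stand-in for a node rotation
    return clone


def one_plus_one_ea(tree, dist, radius, generations, seed=0):
    """Keep one parent; replace it when the mutated child is not worse."""
    rng = random.Random(seed)
    best, best_fit = tree, fitness(tree, dist, radius)
    for _ in range(generations):
        child = mutate(best, rng)
        child_fit = fitness(child, dist, radius)
        if child_fit <= best_fit:  # assumption: fitness is minimised
            best, best_fit = child, child_fit
    return best, best_fit


# Toy data (made up): taxa 0 and 1 are close, as are taxa 2 and 3.
toy_dist = [
    [0, 1, 4, 5],
    [1, 0, 4, 5],
    [4, 4, 0, 1],
    [5, 5, 1, 0],
]
toy_tree = [[0, 2], [1, 3]]
start_fit = fitness(toy_tree, toy_dist, radius=1)
best_tree, best_fit = one_plus_one_ea(toy_tree, toy_dist, radius=1, generations=200)
print(leaves(toy_tree), start_fit)
print(leaves(best_tree), best_fit)
```

Because mutation only reorders children within a node, the tree topology and the set of taxa are unchanged; only the vertical order of the leaves differs between parent and child. A population-based variant would keep several parents and offspring per generation and select the best among them.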

Results

The (1 + 1)-EA consistently outperformed a random search, and the best results were obtained with the radius set to 8. The (λ + μ)-EAs performed as well as the (1 + 1)-EA, except for the larger population (1000 + 1000).

Conclusions

After evolution, the trees showed an improvement both in fitness (which is based on a genetic distance matrix, so taxa drawn close together are also genetically close) and in biological interpretability. Samples collected in the same state or year moved closer to each other, making the tree easier to interpret. Biological relationships between samples are also easier to observe.