<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1756-0381-1-6</ui>
   <ji>1756-0381</ji>
   <fm>
      <dochead>Review</dochead>
      <bibl>
         <title>
            <p>A review of estimation of distribution algorithms in bioinformatics</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Arma&#241;anzas</snm>
               <fnm>Rub&#233;n</fnm>
               <insr iid="I1"/>
               <email>ruben@si.ehu.es</email>
            </au>
            <au id="A2">
               <snm>Inza</snm>
               <fnm>I&#241;aki</fnm>
               <insr iid="I1"/>
               <email>inza@si.ehu.es</email>
            </au>
            <au id="A3">
               <snm>Santana</snm>
               <fnm>Roberto</fnm>
               <insr iid="I1"/>
               <email>rsantana@si.ehu.es</email>
            </au>
            <au id="A4">
               <snm>Saeys</snm>
               <fnm>Yvan</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <email>yvan.saeys@ugent.be</email>
            </au>
            <au id="A5">
               <snm>Flores</snm>
               <mnm>Luis</mnm>
               <fnm>Jose</fnm>
               <insr iid="I1"/>
               <email>joseluis.flores@ehu.es</email>
            </au>
            <au id="A6">
               <snm>Lozano</snm>
               <mnm>Antonio</mnm>
               <fnm>Jose</fnm>
               <insr iid="I1"/>
               <email>lozano@si.ehu.es</email>
            </au>
            <au id="A7">
               <snm>Peer</snm>
               <mnm>Van de</mnm>
               <fnm>Yves</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <email>yves.vandepeer@psb.ugent.be</email>
            </au>
            <au id="A8">
               <snm>Blanco</snm>
               <fnm>Rosa</fnm>
               <insr iid="I4"/>
               <email>rosa.blanco@unavarra.es</email>
            </au>
            <au id="A9">
               <snm>Robles</snm>
               <fnm>V&#237;ctor</fnm>
               <insr iid="I5"/>
               <email>vrobles@fi.upm.es</email>
            </au>
            <au id="A10">
               <snm>Bielza</snm>
               <fnm>Concha</fnm>
               <insr iid="I6"/>
               <email>mcbielza@fi.upm.es</email>
            </au>
            <au id="A11">
               <snm>Larra&#241;aga</snm>
               <fnm>Pedro</fnm>
               <insr iid="I6"/>
               <email>pedro.larranaga@fi.upm.es</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Computer Science and Artificial Intelligence, University of the Basque Country, Donostia &#8211; San Sebasti&#225;n, Spain</p>
            </ins>
            <ins id="I2">
               <p>Department of Plant Systems Biology, Ghent University, Ghent, Belgium</p>
            </ins>
            <ins id="I3">
               <p>Department of Molecular Genetics, Ghent University, Ghent, Belgium</p>
            </ins>
            <ins id="I4">
               <p>Department of Statistics and Operations Research, Public University of Navarre, Pamplona, Spain</p>
            </ins>
            <ins id="I5">
               <p>Departamento de Arquitectura y Tecnolog&#237;a de Sistemas Inform&#225;ticos, Universidad Polit&#233;cnica de Madrid, Madrid, Spain</p>
            </ins>
            <ins id="I6">
               <p>Departamento de Inteligencia Artificial, Universidad Polit&#233;cnica de Madrid, Madrid, Spain</p>
            </ins>
         </insg>
         <source>BioData Mining</source>
         <issn>1756-0381</issn>
         <pubdate>2008</pubdate>
         <volume>1</volume>
         <issue>1</issue>
         <fpage>6</fpage>
         <url>http://www.biodatamining.org/content/1/1/6</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18822112</pubid>
               <pubid idtype="doi">10.1186/1756-0381-1-6</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>18</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>11</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>11</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Arma&#241;anzas et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics applications. Genetic algorithms, the best-known and most representative evolutionary search technique, have been the subject of most of these applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.</p>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Introduction</p>
         </st>
         <p>As a consequence of increased computational power in the last decades, evolutionary search algorithms emerged as important heuristic optimization techniques in the early eighties. Evolutionary optimization techniques have demonstrated their potential across a broad spectrum of areas such as transportation, machine learning or industry. With the development of high-throughput data capturing devices in biotechnology, a wide range of high-dimensional optimization problems has surfaced in the field of bioinformatics and computational biology over the last decade. Because classic optimization techniques only explore a limited portion of the solution space, researchers soon realized that sequential search engines that try to improve a single solution are clearly insufficient to move through these huge search spaces. The use of population-based, randomized search engines was proposed as an alternative that would overcome these limitations and be better able to explore the vast solution space. Evolutionary optimization techniques, of which genetic algorithms (GAs) are the best-known class, have thus been the method of choice for many of these bioinformatics problems.</p>
         <p>Estimation of distribution algorithms (EDAs) are a novel class of evolutionary optimization algorithms that were developed as a natural alternative to genetic algorithms in the last decade. The principal advantages of EDAs over genetic algorithms are the absence of multiple parameters to be tuned (e.g. crossover and mutation probabilities) and the expressiveness and transparency of the probabilistic model that guides the search process. In addition, EDAs have been shown to be better suited to some applications than GAs, while achieving competitive and robust results in the majority of tackled problems. In this review, we focus on a group of pioneering papers that have shown the power of the EDA paradigm in a set of recent bioinformatics tasks, mainly genomic and proteomic. For each problem, we give a brief description, the EDA used, and the associated literature references. The solution representation and the cardinality of the search space are also discussed in some cases. Before discussing these problems, the next section presents what an EDA is and how it works, sets out a detailed taxonomy of EDAs based on their main features, and discusses their potential within bioinformatics.</p>
      </sec>
      <sec>
         <st>
            <p>Estimation of distribution algorithms</p>
         </st>
         <p>Estimation of distribution algorithms <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> are evolutionary algorithms that work with a multiset (or population) of candidate solutions (points). Figure <figr fid="F1">1</figr> illustrates the flow chart for any EDA approach. Initially, a random sample of points is generated. These points are evaluated using an objective function, which measures how good each solution is for the problem. Based on this evaluation, a subset of points is selected in such a way that points with better function values have a greater chance of being selected.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>EDA algorithm flow chart</p>
            </caption>
            <text>
               <p><b>EDA algorithm flow chart</b>. Diagram of how an estimation of distribution algorithm works. This overview of the algorithm is further specified by the pseudocode shown in Table 1.</p>
            </text>
            <graphic file="1756-0381-1-6-1"/>
         </fig>
         <p>Then, a probabilistic model of the selected solutions is built, and a new set of points is sampled from the model. The process is iterated until the optimum has been found or another termination criterion is fulfilled.</p>
         <p>For more details, Table <tblr tid="T1">1</tblr> sets out the pseudocode that implements a basic EDA. The reader can find a complete running example of an EDA in <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>EDA pseudocode</p>
            </caption>
            <tblbdy cols="1">
               <r>
                  <c ca="left">
                     <p>Set <it>t </it>&#8592; 0. Generate <it>M </it>points randomly</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>
                        <b>Do</b>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left" indent="1">
                     <p>Evaluate the points using the fitness function</p>
                  </c>
               </r>
               <r>
                  <c ca="left" indent="1">
                     <p>Select a set <it>S </it>of <it>N </it>&#8804; <it>M </it>points according to a selection method</p>
                  </c>
               </r>
               <r>
                  <c ca="left" indent="1">
                     <p>Estimate a probabilistic model for <it>S</it></p>
                  </c>
               </r>
               <r>
                  <c ca="left" indent="1">
                     <p>Generate <it>M </it>new points sampling from the distribution represented in the model</p>
                  </c>
               </r>
               <r>
                  <c ca="left" indent="1">
                     <p><it>t </it>&#8592; <it>t </it>+ 1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p><b>until </b>Termination criteria are met</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Estimation of distribution algorithms: evolutionary computation based on learning and simulation of probabilistic graphical models.</p>
            </tblfn>
         </tbl>
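         <p>As a minimal sketch of the loop in Table 1, the following Python code instantiates the pseudocode with a univariate model in the spirit of UMDA. The toy objective <it>onemax</it>, the function and parameter names, the use of truncation selection and the clamping of the marginals to [1/<it>n</it>, 1 - 1/<it>n</it>] are illustrative assumptions and are not prescribed by the pseudocode.</p>
         <p>
```python
import random


def onemax(bits):
    """Toy objective (an assumption for illustration): count the ones."""
    return sum(bits)


def umda(fitness, n_vars, M=100, N=50, max_gens=100, target=None, seed=0):
    """Univariate EDA on bit strings, following the Do/until loop of Table 1."""
    rng = random.Random(seed)
    eps = 1.0 / n_vars  # keep marginals away from 0 and 1 (added safeguard)
    # Set t = 0 and generate M points randomly.
    population = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(M)]
    best = max(population, key=fitness)
    t = 0
    while True:
        # Evaluate the points and select the N best (truncation selection).
        ranked = sorted(population, key=fitness, reverse=True)
        selected = ranked[:N]
        if fitness(ranked[0]) > fitness(best):
            best = ranked[0]
        # Termination criteria: target reached or generation budget spent.
        if (target is not None and fitness(best) >= target) or t >= max_gens:
            return best, t
        # Estimate the model: one marginal frequency per variable.
        probs = [sum(x[i] for x in selected) / N for i in range(n_vars)]
        probs = [min(max(p, eps), 1.0 - eps) for p in probs]
        # Generate M new points by sampling each variable independently.
        population = [[1 if p > rng.random() else 0 for p in probs]
                      for _ in range(M)]
        t += 1
```
         </p>
         <p>For example, calling umda(onemax, n_vars=20, target=20) returns the best bit string found together with the number of generations performed. Replacing the model estimation and sampling steps with a bivariate or multivariate model leaves the rest of the loop unchanged.</p>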
         <sec>
            <st>
               <p>Characteristics of EDAs</p>
            </st>
            <p>Essentially, EDAs assume that it is possible to build a model of the promising areas of the search space, and use this model to guide the search for the optimum. In EDAs, modeling is achieved by building a probabilistic graphical model that provides a condensed representation of the features shared by the selected solutions. Such a model can capture different patterns of interactions between subsets of the problem variables, and can conveniently use this knowledge to sample new solutions.</p>
            <p>Probabilistic modeling gives EDAs an advantage over other evolutionary algorithms that do not employ models, such as GAs. Such model-free algorithms are generally unable to deal with problems where there are important interactions among the problem's components. This, together with EDAs' capacity to solve different types of problems in a robust and scalable manner <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B5">5</abbr></abbrgrp>, has led to EDAs sometimes also being referred to as competent GAs <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>A taxonomy of EDAs</p>
            </st>
            <p>Since several EDAs have been proposed with a variety of models and learning algorithms, the selection of the best EDA to deal with a given optimization problem is not always straightforward. One criterion that could be followed in this choice is to trade off the complexity of the probabilistic model against the computational cost of storing and learning the selected model. Both issues are also related to the problem dimensionality (i.e. number of variables) and to the type of representation (e.g. discrete, continuous, mixed).</p>
            <p>Researchers should be aware that simple models generally have minimal storage requirements, and are easy to learn. However, they have a limited capacity to represent higher-order interactions. On the other hand, more complex models, which are able to represent more involved relationships, may require sophisticated data structures and costly learning algorithms. The impact that the choice between simple and more complex models has on search efficiency will depend on the optimization problem addressed. In some cases, a simple model can help to reach non-optimal but acceptable solutions in a short time. In other situations, e.g. deceptive problems, an EDA that uses a simple model could move the search away from the area of promising solutions.</p>
            <p>Another criterion that should be taken into consideration to choose an EDA is whether there is any previous knowledge about the problem structure, and which kind of probabilistic model is best suited to represent this knowledge. The following classification of EDAs is intended to help the bioinformatic researcher to find a suitable algorithm for his or her application.</p>
            <p>EDAs can be broadly divided according to the complexity of the probabilistic models used to capture the interdependencies between the variables: univariate, bivariate or multivariate approaches. Univariate EDAs, such as PBIL <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, cGA <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and UMDA <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, assume that all variables are independent and factorize the joint probability of the selected points as a product of univariate marginal probabilities. Consequently, these algorithms are the simplest EDAs and have also been applied to problems with continuous representation <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
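            <p>Written out, and assuming that a solution is a vector <it>x</it> = (<it>x</it>_1, ..., <it>x_n</it>) of <it>n</it> variables (notation introduced here only for illustration), the univariate factorization reads:</p>
            <p>
```latex
\[
  p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i)
\]
```
            </p>
            <p>For binary variables, such a model needs only <it>n</it> parameters, one marginal frequency per variable, each estimated from the selected points.</p>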
            <p>The bivariate models can represent low order dependencies between the variables and be learnt using fast algorithms. MIMIC <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, the bivariate marginal distribution algorithm BMDA <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, dependency tree-based EDAs <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and the tree-based estimation of distribution algorithm (Tree-EDA) <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> are all members of this subclass. The latter two use tree and forest-based factorizations, respectively. They are recommended for problems with a high cardinality of the variables and where interactions are known to play an important role. Trees and forests can also be combined to represent higher-order interactions using models based on mixtures of distributions <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
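            <p>As an illustration, and using notation that is assumed here rather than taken from the cited works, a tree-shaped bivariate model lets each variable <it>x_i</it> depend on at most one parent variable, denoted pa(<it>i</it>):</p>
            <p>
```latex
\[
  p(\mathbf{x}) = \prod_{i=1}^{n} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr)
\]
```
            </p>
            <p>The conditioning is dropped for variables without a parent. A chain, as used by MIMIC, is the special case in which the variables are ordered and each one depends only on its neighbor in that ordering, whereas a forest allows several parentless root variables.</p>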
            <p>Multivariate EDAs factorize the joint probability distribution using statistics of order greater than two. Figure <figr fid="F2">2</figr> shows some of the different probabilistic graphical models covered by this category. As the number of dependencies among the variables is higher than in the above categories, the complexity of the probabilistic structure, as well as the computational effort required to find the structure that best suits the selected points, is greater. Therefore, these approaches require a more complex learning process. Some of the EDA approaches based on multiply connected Bayesian networks are:</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>EBNA and BOA paradigms (Figure 2-EBNA-BOA.eps)</p>
               </caption>
               <text>
                  <p><b>EBNA and BOA paradigms</b>. Diagram of probability models for the proposed EDAs in combinatorial optimization with multiple dependencies (FDA, EBNA, BOA, and EcGA).</p>
               </text>
               <graphic file="1756-0381-1-6-2"/>
            </fig>
            <p>&#8226; The factorized distribution algorithm (FDA) <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> is applied to additively decomposed functions for which, using the running intersection property, a factorization of the probability mass function based on residuals and separators is obtained.</p>
            <p>&#8226; In <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, a factorization of the joint probability distribution encoded by a Bayesian network is learnt from the selected set in every generation. The estimation of Bayesian network algorithm (EBNA) uses the Bayesian information criterion (BIC) score as the quality measure for the Bayesian network structure. The space of models is searched using a greedy algorithm.</p>
            <p>&#8226; The Bayesian optimization algorithm (BOA) <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> is also based on the use of Bayesian networks. The Bayesian Dirichlet equivalent metric is used to measure the goodness of every structure, and the space of structures is explored with a greedy search procedure. BOA has been improved by adding dependency trees and restricted tournament replacement. The resulting, more advanced, hierarchical BOA (hBOA) <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> is one of the EDAs for which extensive experimentation has been undertaken. The results show good scalability behavior.</p>
            <p>&#8226; The extended compact Genetic Algorithm (EcGA) proposed in <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> is an algorithm in which the basic idea is to factorize the joint probability distribution as a product of marginal distributions of variable size.</p>
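            <p>To make the score-based search used by EBNA more concrete, the BIC score of a candidate network structure can be written in its usual general form as follows. The symbols are assumptions introduced here for illustration: <it>D</it> is the set of <it>N</it> selected points, the hatted theta denotes the maximum likelihood parameters of the structure, and dim gives its number of free parameters.</p>
            <p>
```latex
\[
  \mathrm{BIC}(\mathcal{B} \mid D)
    = \log p\bigl(D \mid \mathcal{B}, \hat{\theta}\bigr)
    - \frac{\log N}{2}\,\dim(\mathcal{B})
\]
```
            </p>
            <p>The penalty term favors sparser networks, which limits the number of dependencies that are learnt from a finite set of selected points.</p>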
            <p>There are alternatives to the use of Bayesian networks for representing higher order interactions in EDAs. Markov network-based EDAs <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp> could be an appropriate choice for applications where the structure of the optimization problem is known and can be easily represented using an undirected graphical model. EDAs that use dependency networks <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> can encode dependencies that Bayesian networks cannot represent. Both classes of algorithms need relatively complex sampling procedures based on the use of Gibbs sampling <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
            <p>In addition to the order of complexity encoded by the probability model, another key feature of an EDA is the way the model is learned. There are two alternatives: induce the model structure and its associated parameters, or induce just the set of parameters for an <it>a priori</it> given model. The first class is denoted as <it>structure</it>+<it>parameter learning</it>, whereas the second is known as <it>parameter learning</it>. Both approaches need to induce the parameters of their models, but the first approach's need for structural learning makes it more time consuming. By contrast, parameter learning is dependent on the fixed model, whereas structure+parameter learning exhibits a greater power of generalization.</p>
            <p>Population-based incremental learning (PBIL) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, the compact GA (cGA) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, the univariate marginal distribution algorithm (UMDA) <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> and the factorized distribution algorithm (FDA) <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, which use a fixed model of interactions in all generations, are all parameter learning approaches. On the other hand, the mutual information maximization for input clustering algorithm (MIMIC) <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, the extended compact GA (EcGA) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and EDAs that use Bayesian and Gaussian networks <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B13">13</abbr><abbr bid="B17">17</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp> belong to the structure+parameter learning class.</p>
            <p>Table <tblr tid="T2">2</tblr> summarizes all the above features and models, providing a graphical taxonomy of the subdivisions presented throughout this section. It also includes some useful tips for choosing among the available EDAs, such as their pros and cons.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>EDAs taxonomy</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Statistical order</p>
                     </c>
                     <c ca="left">
                        <p>Advantages</p>
                     </c>
                     <c ca="left">
                        <p>Disadvantages</p>
                     </c>
                     <c ca="left">
                        <p>Examples</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Univariate</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Simplest and fastest</p>
                     </c>
                     <c ca="left">
                        <p>Ignore feature dependencies</p>
                     </c>
                     <c ca="left">
                        <p>PBIL (Baluja, 1994)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Suited for high cardinality problems </p>
                     </c>
                     <c ca="left">
                        <p>Bad performance for deceptive problems</p>
                     </c>
                     <c ca="left">
                        <p>UMDA (M&#252;hlenbein and Paa&#223;, 1996) </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Scalable</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>cGA (Harik <it>et al</it>., 1999)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><b>Bivariate </b>(statistics of order two)</p>
                     </c>
                     <c ca="left">
                        <p>Able to represent low order dependencies</p>
                     </c>
                     <c ca="left">
                        <p>Possibly ignore some feature dependencies</p>
                     </c>
                     <c ca="left">
                        <p>MIMIC (De Bonet <it>et al</it>., 1996)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Suited for many problems</p>
                     </c>
                     <c ca="left">
                        <p>Slower than univariate EDAs</p>
                     </c>
                     <c ca="left">
                        <p>Dependency trees EDA (Baluja and Davies, 1997); BMDA (Pelikan and M&#252;hlenbein, 1999)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Induced models can be inspected graphically</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Tree-EDA/Mixture of distributions EDA (Santana <it>et al</it>., 1999)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><b>Multivariate </b>(statistics of order greater than two)</p>
                     </c>
                     <c ca="left" cspan="3">
                        <p>Parameter learning (<it>only interaction model parameters</it>)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Suited for problems with known underlying model</p>
                     </c>
                     <c ca="left">
                        <p>Possibly ignore complex feature dependencies</p>
                     </c>
                     <c ca="left">
                        <p>FDA (M&#252;hlenbein <it>et al</it>., 1999)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Higher memory requirements than bivariate</p>
                     </c>
                     <c ca="left">
                        <p>Markov network-based EDA (Shakya and McCall, 2007)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left" cspan="3">
                        <p>Structure+parameter learning (<it>interaction model &amp; parameters of the model</it>)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Maximum power of generalization</p>
                     </c>
                     <c ca="left">
                        <p>Highest computation time</p>
                     </c>
                     <c ca="left">
                        <p>EcGA (Harik <it>et al</it>., 1999)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Flexibility to introduce user dependencies</p>
                     </c>
                     <c ca="left">
                        <p>Highest memory requirements</p>
                     </c>
                     <c ca="left">
                        <p>EBNA (Etxeberria and Larra&#241;aga, 1999)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Online study of the induced dependencies</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>BOA/hBOA (Pelikan <it>et al</it>., 1999, 2005)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Dependency networks EDA (G&#225;mez <it>et al</it>., 2007)</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>A taxonomy of some representative EDAs. We highlight a set of characteristics that can guide the choice of a particular EDA suited to the goals and properties of a given problem.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Potential of EDAs in bioinformatics</p>
            </st>
            <p>Evolutionary algorithms, and GAs in particular, have been widely and successfully applied in bioinformatics. It is reasonable to expect that the improvements in EDA efficiency and scalability can contribute to expanding the use of these algorithms, particularly for difficult problems where other evolutionary algorithms fail <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B27">27</abbr></abbrgrp>.</p>
            <p>There are other situations where EDAs can be very useful for solving bioinformatics problems. For instance, the probabilistic models used by EDAs can be set up <it>a priori </it>in such a way that they represent previous knowledge about the structure of the optimization problem. Even incomplete or partial information about the problem domain can considerably reduce the computational cost of the search. Similarly, practitioners can manipulate the probabilistic models to favor solutions with certain pre-established partial configurations. This way they can test particular hypotheses about the configuration of the optimal solution.</p>
            <p>EDAs have another advantage, also associated with the capacity to model key features of the search space. The models generated during the search can be mined to reveal previously unknown information about the problem <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>.</p>
            <p>Furthermore, recent results of applying EDAs to problems from other domains <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> have shown that the information gathered by the models to solve a given problem instance can, in some cases, also be employed to solve other instances of the same problem. This paves the way for building bioinformatics applications where the information extracted from previous searches is reused to solve different instances of a similar problem.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>EDAs in genomics</p>
         </st>
         <sec>
            <st>
               <p>Introduction</p>
            </st>
            <p>Due to advances in modern high-throughput biotechnology devices, large and high-dimensional data sets are obtained from analyzed genomes and tissues. The heuristic scheme provided by EDAs has proved to be effective and efficient in a variety of NP-hard genomic problems. Because of the huge cardinality of the solution spaces of most of these problems, researchers are aware of the need for an efficient optimization algorithm. For this reason, authors have preferred simple EDA schemes that assume that the variables are independent. These schemes have obtained accurate and robust solutions in reasonable CPU times. Together with a brief definition of each genomic problem tackled, the main characteristics of each EDA scheme are described, with special emphasis on the codification used to represent the search individuals.</p>
         </sec>
         <sec>
            <st>
               <p>Gene structure analysis</p>
            </st>
            <p>As genomes are being sequenced at an increasing pace, the need for automatic procedures for annotating new genomes is becoming more and more important. A first and important step in the annotation of a new genome is the location of the genes in the genome, as well as their correct structure. As a gene may contain many different parts, the problem of gene structure prediction can be seen as a segmentation or parsing problem. To solve this problem automatically, pattern recognition and machine learning techniques are often used to build a model of what a gene looks like. This model can then be used to automatically locate potential genes in a genome <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>.</p>
            <p>A gene prediction framework consists of different components, where each component (often modeled as a classifier) aims at identifying a particular structural element of the gene. Important structural elements include the start of the gene (start codon), the end of a gene (stop codon) and the transitions between the coding and non-coding parts of the gene (splice sites).</p>
            <p>The exact mechanisms that the cell uses to recognize genes and their structural elements are still under research. As this knowledge is missing, one major problem in this context is to define adequate features to train the classifiers for each structural element. Consequently, large sets of sequence features are extracted in the hope that these sets will contain the key features. However, it is known that not all of these features will be important for the classification task at hand, and many will be irrelevant or redundant.</p>
            <p>To find the most relevant features for recognizing gene structural elements, feature subset selection (FSS) techniques can be used. These techniques try to select a subset of relevant features from the original set of features <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. As this is an NP-hard optimization problem with 2<sup><it>n </it></sup>possible subsets for evaluation (given <it>n </it>features), population-based heuristic search methods are an interesting engine for driving the search through the space of possible feature subsets. Each solution in the population encodes a feature subset as a binary string: features having a value of 1 are included in the subset, whereas the ones having a value of 0 are discarded.</p>
            <p>As a natural alternative to genetic algorithms, the use of EDAs for FSS was initiated in <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> for classic benchmark problems, and their use in large scale feature subset selection domains was reported to yield good results <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>. Furthermore, the EDA-based approach to FSS was shown to generalize to feature weighting, ranking and selection <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. This has the advantage of getting more insight into the relevance of each feature separately, focusing on strongly relevant, weakly relevant, and irrelevant features.</p>
            <p>The application of EDA-based FSS techniques in gene structure prediction was pioneered for the most important gene prediction components in <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Its most important application was the recognition of splice sites. Using na&#239;ve Bayes classifiers, support vector machines and C4.5 decision trees as base classifiers, an UMDA-based FSS scheme was used to obtain higher performance models.</p>
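            <p>As a rough illustration of the binary codification and the univariate search described above, the following Python sketch runs a UMDA-style loop over feature-inclusion strings. It is a minimal sketch under stated assumptions, not the scheme used in the cited works: the function names, the truncation selection, the clamping of the marginals to [0.05, 0.95] and the toy fitness (a hypothetical stand-in for the accuracy of a base classifier) are all illustrative choices.</p>

```python
import random


def umda_feature_selection(n_features, fitness, pop_size=50, n_select=25,
                           generations=30, seed=0):
    """Univariate EDA over binary strings: bit i = 1 means feature i is kept."""
    rng = random.Random(seed)
    probs = [0.5] * n_features  # initial marginal probability of each bit being 1
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the product of independent marginals.
        population = [[1 if p > rng.random() else 0 for p in probs]
                      for _ in range(pop_size)]
        # Truncation selection: keep the n_select fittest individuals.
        scored = sorted(population, key=fitness, reverse=True)
        top_fit = fitness(scored[0])
        if top_fit > best_fit:
            best, best_fit = scored[0], top_fit
        selected = scored[:n_select]
        # Re-estimate each marginal from the selected individuals, clamped
        # away from 0 and 1 so that no bit becomes permanently fixed.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
                 for i in range(n_features)]
    return best, best_fit


# Hypothetical stand-in for a wrapper score: reward three "relevant"
# features and lightly penalize every other selected feature.
RELEVANT = {0, 3, 7}


def toy_fitness(bits):
    chosen = {i for i, b in enumerate(bits) if b == 1}
    return len(chosen.intersection(RELEVANT)) - 0.1 * len(chosen - RELEVANT)
```

            <p>Calling <it>umda_feature_selection(10, toy_fitness)</it> returns the best binary string found together with its score. In a wrapper setting, the toy fitness would be replaced by an estimate of classifier performance on the selected features.</p>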
            <p>In addition to better models, an UMDA-based approach was also used to get more insight into the selected features. This led both to the identification of new characteristics and to the confirmation of important previously known characteristics <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Gene expression analysis</p>
            </st>
            <p>Quantitative and qualitative DNA analysis is one of the most important areas of modern biomedical research. DNA microarrays can simultaneously measure the expression level or activity level of thousands of genes under a set of conditions. Microarray technology has become a popular option for partial DNA analysis since Golub <it>et al</it>.'s pioneering work <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>.</p>
            <p>The starting point of this analysis is the so-called gene expression matrix, where rows represent genes, columns represent experimental conditions (or samples), and the values at each position of the matrix characterize the expression level of the particular gene under the particular experimental condition. Additional biological information about the genes and the experimental conditions can be added to the matrix in the form of gene and/or sample annotation. Depending on how we treat the annotation, gene expression data analysis can be either supervised or unsupervised. When sample annotation is used to split the set of samples into two or more classes or phenotypes (e.g. 'healthy' or 'diseased' tissues), supervised analysis (or class prediction) tries to find patterns that are characteristic of each of the classes. On the other hand, unsupervised analysis (or class discovery) ignores any annotation. Examples of such analysis are gene clustering, sample clustering and gene expression data biclustering.</p>
            <p>The FSS paradigm has taken a leading role due to the challenge posed by the huge dimension of DNA microarray studies (datasets of close to 20,000 genes can be found in the experimental setups reported in recent literature), small sample sizes (gene expression studies with more than a hundred hybridizations are not common) and the notable influence of different sources of noise and variability. Thus, the application of dimensionality reduction techniques has become a must for any gene expression analysis.</p>
            <sec>
               <st>
                  <p>Classification of DNA microarray data</p>
               </st>
               <p>It is broadly assumed that a limited number of genes can cause the onset of a disease. Within this scenario, biologists demand a reduction in the number of genes. In addition, the application of an FSS technique to microarray datasets is an essential step to achieve an accurate classification performance for any base classifier.</p>
               <p>Although univariate gene ranking procedures are very popular for differential gene expression detection, the multivariate selection of a subset of relevant and non-redundant genes has borrowed from the field of heuristic search engines to guide the exploration of the huge solution space (there are 2<sup><it>n </it></sup>possible gene subsets, where <it>n </it>is the number of initial genes). Two research groups have proven that the EDA paradigm is useful for this challenging problem. Both groups have implemented efficient algorithms that have achieved accuracy levels comparable to the most effective state-of-the-art optimization techniques:</p>
               <p>&#8226; Using a na&#239;ve Bayes network as the base classifier and the UMDA as the search algorithm, Blanco <it>et al</it>. <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> achieve competitive results in two gene expression benchmarking datasets. The authors show that the predictive power of the models can be improved when the probability of each gene being selected in the first population is initialized using the results provided by a set of simple sequential search procedures.</p>
               <p>&#8226; Paul and Iba <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr></abbrgrp> propose two variations of the PBIL search algorithm to identify subsets of relevant and non-redundant genes. Using a wide variety of classifiers, notable results are achieved in a set of gene expression benchmarking datasets with subsets of extremely low dimensionality.</p>
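               <p>The two ideas in the list above, seeding the initial gene-selection probabilities and updating a probability vector in the PBIL manner, can be sketched as follows. This is a minimal sketch of the basic PBIL update rule only; it does not reproduce the variations proposed in the cited works. The function names, the learning rate of 0.1 and the seeded probability vector are assumptions made for illustration.</p>

```python
import random


def sample_population(probs, pop_size, rng):
    """Draw binary gene-inclusion strings from independent marginals."""
    return [[1 if p > rng.random() else 0 for p in probs]
            for _ in range(pop_size)]


def pbil_update(probs, population, fitness, learning_rate=0.1):
    """Move each marginal a small step toward the best sampled individual."""
    best = max(population, key=fitness)
    return [(1.0 - learning_rate) * p + learning_rate * bit
            for p, bit in zip(probs, best)]


def run_pbil(initial_probs, fitness, pop_size=20, generations=40, seed=0):
    """Repeat sampling and updating; return the final probability vector."""
    rng = random.Random(seed)
    probs = list(initial_probs)
    for _ in range(generations):
        population = sample_population(probs, pop_size, rng)
        probs = pbil_update(probs, population, fitness)
    return probs


# Hypothetical prior: genes 0 and 1 were ranked highly by a cheap
# preliminary procedure, so their initial probabilities start above 0.5.
seeded = [0.8, 0.8, 0.5, 0.5, 0.5, 0.5]
```

               <p>Here <it>run_pbil(seeded, fitness)</it> starts the search from the seeded vector instead of the uniform one, with <it>fitness</it> standing for any user-supplied score of a binary gene subset.</p>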
               <p>Using a continuous-value version of the UMDA procedure, EDAs have been used as a new way of regularizing the logistic regression model for microarray classification problems <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. Regularization consists of shrinking the parameter estimates to avoid the instability that arises when there is a huge number of variables compared to a small number of observations (as in the microarray setting). Therefore, the parameter estimators are restricted maximum likelihood estimates, i.e. the maximizers of a new function that combines the likelihood function with a penalty term constraining the size of the estimators. There are different norms for measuring the size of the estimators. This leads to differently named regularized logistic regression models <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>: ridge, Lasso, bridge, elastic net, etc.</p>
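               <p>In generic notation, the penalized estimation problem described above can be written as below. The symbols are assumptions introduced here for illustration and are not taken from the cited works: the log-likelihood of the logistic regression coefficients is written as a function of the coefficient vector, the number of genes is <it>p</it>, the penalty weight is a non-negative constant, and the exponent <it>q </it>selects the norm.</p>

```latex
\hat{\boldsymbol{\beta}}
  = \arg\max_{\boldsymbol{\beta}}
    \left\{ \ell(\boldsymbol{\beta})
            - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert^{q} \right\},
  \qquad \lambda \ge 0, \quad q > 0
```

               <p>Under this notation, <it>q </it>= 2 gives the ridge penalty, <it>q </it>= 1 gives the Lasso, a general <it>q </it>gives the bridge family, and the elastic net combines the <it>q </it>= 1 and <it>q </it>= 2 terms with separate weights.</p>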
               <p>EDAs could be used to optimize these new functions, and they may be a good optimization method, especially in cases where numerical methods are unable to solve the corresponding non-differentiable and non-convex optimization problems. However, another possibility, taken up in <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, is to use EDAs to maximize the unpenalized likelihood function (which is a simpler optimization problem) and to include the shrinkage of the estimates during the simulation of the new population. New estimates are simulated during the EDA evolutionary process in a way that guarantees their shrinkage while maintaining the probabilistic dependence relationships learnt in the previous step. This procedure yields regularized estimates at the end of the process.</p>
            </sec>
            <sec>
               <st>
                  <p>Clustering of DNA microarray data</p>
               </st>
               <p>Whereas the above papers propose a supervised classification framework, clustering is one of the main tools used to analyze gene expression data obtained from microarray experiments <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Grouping together genes with the same behaviour across samples, that is, gene clusters, can suggest new functions for all or some of the grouped genes. We highlight two papers that use EDAs in the context of gene expression profile clustering:</p>
               <p>&#8226; Pe&#241;a <it>et al</it>. <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> present an application of EDAs for identifying clusters of genes with similar expression profiles across samples using unsupervised Bayesian networks. The technique is based on an UMDA procedure that works in conjunction with the EM clustering algorithm. To evaluate the proposed method, synthetic and real data are analyzed. The experimentation with both types of data provides clusters of genes that may be biologically meaningful and, thus, interesting for biologists to research further.</p>
               <p>&#8226; Cano <it>et al</it>. <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> use UMDA and genetic algorithms to look for clusters of genes with high variance across samples. A real microarray dataset is analyzed, and the Gene Ontology Term Finder is used to evaluate the biological meaning of the resulting clusters.</p>
               <p>Like clustering, biclustering is another NP-hard problem that was originally considered by Morgan and Sonquist in 1963 <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. Biclustering is founded on the fact that not all the genes of a given cluster should be grouped into the same conditions due to their varying biological activity. Thus, biclustering assumes that several genes will only change their expression levels within a specified subset of conditions <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. This assumption has motivated the development of specific algorithms for biclustering analysis.</p>
               <p>An example is the work by Palacios <it>et al</it>. <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, which applies an UMDA scheme to search the possible bicluster space. They get accurate results compared to genetic algorithms when seeking single biclusters with coherent evolutions of gene expression values. Like the classic codification discussed for the FSS problem, the authors use two concatenated binary arrays to represent a bicluster, (<it>x</it><sub>1</sub>, ..., <it>x</it><sub><it>n </it></sub>| <it>y</it><sub>1</sub>, ..., <it>y</it><sub><it>m</it></sub>). The first array represents each gene of the microarray, where the size is the number of genes. The second array represents each condition, with a size equal to the number of conditions. A value of 1 in the <it>i</it><sup><it>th </it></sup>position of the first array shows that the <it>i</it><sup><it>th </it></sup>gene has been selected for inclusion in the bicluster. Likewise, a value of 1 in the <it>j</it><sup><it>th </it></sup>position of the second array indicates that the <it>j</it><sup><it>th </it></sup>condition has been selected for inclusion in the bicluster. This codification results in a space of 2<sup><it>n</it>+<it>m </it></sup>possible biclusters.</p>
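               <p>The two-array codification described above can be illustrated with a short Python sketch that decodes a concatenated binary string into the selected genes and conditions and extracts the corresponding submatrix. The function names and the small expression matrix are hypothetical and serve only to make the encoding concrete.</p>

```python
def decode_bicluster(bits, n_genes, n_conditions):
    """Split (x_1..x_n | y_1..y_m) into selected gene and condition indices."""
    if len(bits) != n_genes + n_conditions:
        raise ValueError("expected n_genes + n_conditions bits")
    genes = [i for i in range(n_genes) if bits[i] == 1]
    conditions = [j for j in range(n_conditions) if bits[n_genes + j] == 1]
    return genes, conditions


def extract_submatrix(matrix, genes, conditions):
    """Return the rows (genes) and columns (conditions) of the bicluster."""
    return [[matrix[i][j] for j in conditions] for i in genes]


# Hypothetical expression matrix: 3 genes (rows) by 4 conditions (columns).
expression = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]

# Genes 0 and 2 selected; conditions 1 and 2 selected.
individual = [1, 0, 1] + [0, 1, 1, 0]
genes, conditions = decode_bicluster(individual, n_genes=3, n_conditions=4)
bicluster = extract_submatrix(expression, genes, conditions)

# Size of the search space for this toy matrix: 2 ** (n + m).
search_space_size = 2 ** (3 + 4)
```

               <p>For this toy matrix the search space already contains 128 candidate biclusters, which illustrates why exhaustive enumeration is not an option for real microarrays.</p>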
            </sec>
            <sec>
               <st>
                  <p>Inference of genetic networks</p>
               </st>
               <p>The inference of gene-gene interactions from gene expression data is a powerful tool for understanding the system behaviour of living organisms <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>.</p>
               <p>This promising research area is now of much interest for biomedical practitioners, and a few papers have applied EDAs to this domain. One of these early works uses Bayesian networks as the paradigm for modeling the interactions among genes, while an UMDA approach explores the search space to find the candidate interactions <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. A subsequent literature review of the most reliable interactions reveals that many of them have been previously reported.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>EDAs in proteomics</p>
         </st>
         <sec>
            <st>
               <p>Introduction</p>
            </st>
            <p>The objective of protein structure prediction is to predict the native structure of a protein from its sequence. In protein design, the goal is to create new proteins that satisfy some given structural or functional constraints. Frequently, both problems are addressed using function optimization. As the possible solution space is usually huge, complex and contains many local optima, heuristic optimization methods are needed. The efficiency of the optimization algorithm plays a crucial role in the process. In this section, we review applications of EDAs to different variants of protein structure prediction and protein design problems.</p>
            <p>We start by reviewing some important concepts related to protein models and energy functions in optimization. Then, we propose an initial general classification of EDA applications to protein problems according to how sophisticated and detailed the protein models used are. Subsequently, we give a more detailed classification based on the specificities of the protein problems.</p>
         </sec>
         <sec>
            <st>
               <p>Protein structure prediction and protein design</p>
            </st>
            <p>Protein structure prediction and protein design are usually addressed by minimizing an energy function in the candidate solution space. Two essential issues in the application of EDAs and other optimization algorithms to these problems are the type of protein representation employed and the energy function of choice.</p>
            <p>There are many factors that influence the stability of proteins and have to be taken into account to evaluate candidate structures. The native state is thought to be at the global free energy minimum of the protein. Electrostatic interactions, including hydrogen bonds, van der Waals interactions, intrinsic propensities of the amino acids to take up certain structures, hydrophobic interactions and conformational entropy contribute to free energy. Determining to what extent the function can represent all of these factors, as well as how to weight each one are difficult questions that have to be solved before applying the optimization method.</p>
            <p>Simplified protein models omit some of these factors and are a first problem-solving approximation. For example, the approximate fold of a protein is influenced by the sequence of hydrophobic and hydrophilic residues, irrespective of what the actual amino acids in that sequence are <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Therefore, a first approximation could simply be constructed by a binary patterning of hydrophobic and hydrophilic residues to match the periodicity of secondary structural elements. Simplification can be further developed to consider proteins represented using this binary patterning and to approximate the protein structure prediction problem on two- and three-dimensional lattices. In this case, the energy function measures only hydrophobic and hydrophilic interactions. An example of this type of representation is shown in Figure <figr fid="F3">3</figr>, where a sequence of 64 amino acids is represented on a two-dimensional lattice.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Optimal protein structure</p>
               </caption>
               <text>
                  <p><b>Optimal protein structure</b>. Optimal solution of an HP model found by an EDA that uses a Markovian model.</p>
               </text>
               <graphic file="1756-0381-1-6-3"/>
            </fig>
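            <p>To make the lattice representation concrete, the following Python sketch evaluates a two-dimensional conformation under the usual convention for the hydrophobic-polar (HP) model, in which each pair of hydrophobic residues that are lattice neighbours but not consecutive in the chain contributes -1 to the energy. The move-string encoding and the function names are assumptions made for illustration; the reviewed EDAs may use a different codification.</p>

```python
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Turn a string of absolute moves into lattice coordinates from (0, 0)."""
    x, y = 0, 0
    coords = [(x, y)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_energy(sequence, coords):
    """Count H-H contacts between non-consecutive residues; each adds -1."""
    if len(sequence) != len(coords):
        raise ValueError("one coordinate is required per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                distance = (abs(coords[i][0] - coords[j][0])
                            + abs(coords[i][1] - coords[j][1]))
                if distance == 1:
                    energy -= 1
    return energy


# A four-residue chain folded into a square brings its two H ends together.
square_energy = hp_energy("HPPH", fold("RUL"))
# The same chain stretched out in a line has no H-H contact.
line_energy = hp_energy("HPPH", fold("RRR"))
```

            <p>An optimizer working on this model searches for the self-avoiding conformation with the lowest such energy. In the sketch, the folded square scores lower than the stretched chain.</p>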
         </sec>
         <sec>
            <st>
               <p>EDA approaches</p>
            </st>
            <p>Depending on how sophisticated and detailed the protein model used is, EDAs can be divided into two groups: EDAs applying a simplified model <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp> and EDAs using more detailed (atomic-based) models <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>. A more thorough classification is related to the type of problems addressed:</p>
            <p>&#8226; Protein structure prediction in simplified models <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B60">60</abbr></abbrgrp>.</p>
            <p>&#8226; Protein side chain placement <abbrgrp><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>.</p>
            <p>&#8226; Design of protein peptide ligands <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>.</p>
            <p>&#8226; Protein design by minimization of contact potentials <abbrgrp><abbr bid="B59">59</abbr><abbr bid="B64">64</abbr></abbrgrp>.</p>
            <p>&#8226; Amino acid alphabet reduction for protein structure prediction <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>.</p>
            <p>&#8226; Using EDAs as a simulation tool to investigate the influence of different protein features in the protein folding process <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>.</p>
            <p>In <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>, EDAs are used to solve two-dimensional and three-dimensional simplified protein folding problems. The hydrophobic-polar (HP) <abbrgrp><abbr bid="B65">65</abbr></abbrgrp> and functional protein models <abbrgrp><abbr bid="B66">66</abbr></abbrgrp> are optimized using EDAs based on probabilistic models of different complexity (i.e. Tree-EDA <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>, mixtures of trees EDA (MT-EDA) <abbrgrp><abbr bid="B67">67</abbr></abbrgrp> and EDAs that use <it>k</it>-order Markov models (MK-EDA<sub><it>k</it></sub>) <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>).</p>
            <p>The results achieved outperform other evolutionary algorithms. For example, the configuration shown in Figure <figr fid="F3">3</figr> is the optimal solution found by MK-EDA<sub>2</sub>. Due to the particular topology of this instance, other evolutionary algorithms consistently fail to find the optimal solution <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>.</p>
            <p>Side chain placement problems are dealt with using UMDA with discrete representation in <abbrgrp><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>. The approach is based on the use of rotamer libraries that can represent the side chain configurations using their rotamer angles. For these problems, EDAs have achieved very good results in situations where other methods fail <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>. Results are better when EDAs are combined with local optimization methods as in <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>, where variable neighborhood search <abbrgrp><abbr bid="B68">68</abbr></abbrgrp> is applied to the best solutions found by UMDA.</p>
            <p>Belda <it>et al</it>. <abbrgrp><abbr bid="B61">61</abbr></abbrgrp> use different EDAs to generate potential peptide ligands of a given protein by minimizing the docking energy between the candidate peptide ligand and a user-defined area of the target protein surface. The results of the population-based incremental learning algorithm (PBIL) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and the Bayesian optimization algorithm (BOA) <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> are compared with two different types of genetic algorithms. The results show that some of the ligands designed using the computational methods had better docking energies than peptides designed using a purely chemical knowledge-based approach <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>.</p>
            <p>In <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>, three different EDAs are applied to solve a protein design problem by minimizing contact potentials: UMDA, Tree-EDA and Tree-EDA<sup><it>r </it></sup>(the structure of the tree is deduced from the known protein structure, and the tree parameters are learned from data). Combining probabilistic models able to represent probabilistic dependencies with information about residue interactions in the protein contact graph is shown to improve the search efficiency for the evaluated problems. In <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>, EDAs that use loopy probabilistic models are combined with inference-based optimization algorithms to deal with the same problems. For several protein instances, this approach manages to improve the results obtained with tree-based EDAs.</p>
            <p>The alphabet reduction problem is addressed in <abbrgrp><abbr bid="B57">57</abbr></abbrgrp> using the extended compact genetic algorithm (EcGA) <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>. The problem is to reduce the 20-letter amino acid (AA) alphabet into a lower cardinality alphabet. A genetics-based machine learning technique uses the reduced alphabet to induce rules for protein structure prediction features. The results showed that it is possible to reduce the size of the alphabet used for prediction from twenty to just three letters, resulting in more compact rules.</p>
            <p>Results of using EDAs and the HP model to simulate the protein folding process are presented in <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>. Some of the features exhibited by the EDA model that mimic the behaviour of the protein folding process are investigated. The features considered include the correlation between the EDA success rate and the contact order of the protein models, and the relationship between the generation convergence of EDAs for the HP model and the contact order of the optimal solution. Other issues analyzed are the differences in the rate of formation of native contacts during EDA evolution, and how these differences are associated with the contact separation of the protein instance.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In this paper, we have reviewed the state of the art of EDA applications in bioinformatics. As soon as researchers realized the need to apply a randomized, population-based, heuristic search, EDAs emerged as a natural alternative to commonly used genetic algorithms. Since the possible solution space is huge for most of the addressed problems, researchers have made use of efficient EDA implementations.</p>
         <p>A group of interesting papers demonstrate the efficiency and the competitive accuracy of this novel search paradigm in a set of challenging NP-hard genomic and proteomic bioinformatic tasks. As the number of EDA application papers in bioinformatics is modest and the number and variety of problems is constantly growing, there is room for new EDA applications in the field.</p>
         <p>An interesting opportunity for future research is the adaptation and application of multivariate EDA models that can efficiently deal with the huge dimensionality of current bioinformatic problems. Going further than simple univariate models, bio-experts could explicitly inspect the probabilistic relationships among problem variables for each generation of the evolutionary process. This would create opportunities for improved accuracy. These probabilistic relationships induced from the evolutionary model are an attractive way of proposing novel biological hypotheses to be further tested by bio-experts.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>RA, II, and PL conceived of the manuscript. II, YS, JLF, RB, VR and CB participated in writing the genomics section. The proteomics section was designed and written by RS and JAL. The introduction to EDAs was carried out by RA, RS and YS. RA was in charge of the writing and coordination process. II, YVP and PL helped to write and correct the manuscript draft. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work has been partially supported by the 2007&#8211;2012 Etortek, Saiotek and Research Group (IT-242-07) programs (Basque Government), TIN2005-03824 and Consolider Ingenio 2010-CSD2007-00018 projects (Spanish Ministry of Education and Science) and the COMBIOMED network in computational biomedicine (Carlos III Health Institute).</p>
            <p>R. Arma&#241;anzas is supported by Basque Government grant AE-BFI-05/430. Y. Saeys would like to thank the Fund for Scientific Research Flanders (FWO) for funding his research.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Linkage information processing in distribution estimation algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Bosman</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Thierens</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proceedings of the Genetic and Evolutionary Computation Conference GECCO-1999</source>
            <publisher>Orlando, FL: Morgan Kaufmann Publishers, San Francisco, CA</publisher>
            <editor>Banzhaf W, Daida J, Eiben AE, Garzon MH, Honavar V, Jakiela M, Smith RE</editor>
            <pubdate>1999</pubdate>
            <volume>I</volume>
            <fpage>60</fpage>
            <lpage>67</lpage>
         </bibl>
         <bibl id="B2">
            <aug>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <cnm>Eds</cnm>
               </au>
            </aug>
            <source>Estimation of Distribution Algorithms. A New Tool for Evolutionary Computation</source>
            <publisher>Kluwer Academic Publishers</publisher>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B3">
            <aug>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Inza</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bengoetxea</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <cnm>Eds</cnm>
               </au>
            </aug>
            <source>Towards a New Evolutionary Computation: Advances on Estimation of Distribution Algorithms</source>
            <publisher>Springer-Verlag</publisher>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B4">
            <title>
               <p>From recombination of genes to the estimation of distributions I. Binary parameters</p>
            </title>
            <aug>
               <au>
                  <snm>M&#252;hlenbein</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Paa&#223;</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Lecture Notes in Computer Science 1411: Parallel Problem Solving from Nature, PPSN IV</source>
            <pubdate>1996</pubdate>
            <fpage>178</fpage>
            <lpage>187</lpage>
         </bibl>
         <bibl id="B5">
            <aug>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Hierarchical Bayesian Optimization Algorithm: Toward a New Generation of Evolutionary Algorithms. Studies in Fuzziness and Soft Computing</source>
            <publisher>Springer</publisher>
            <pubdate>2005</pubdate>
            <volume>170</volume>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A review on estimation of distribution algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Estimation of Distribution Algorithms. A New Tool for Evolutionary Computation</source>
            <publisher>Kluwer Academic Publishers</publisher>
            <pubdate>2002</pubdate>
            <fpage>55</fpage>
            <lpage>98</lpage>
         </bibl>
         <bibl id="B7">
            <aug>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>DE</fnm>
               </au>
            </aug>
            <source>The Design of Innovation: Lessons from and for Competent Genetic Algorithms</source>
            <publisher>Kluwer Academic</publisher>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B8">
            <title>
               <p>A survey of optimization by building and using probabilistic models</p>
            </title>
            <aug>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Lobo</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Computational Optimization and Applications</source>
            <pubdate>2002</pubdate>
            <volume>21</volume>
            <fpage>5</fpage>
            <lpage>20</lpage>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning</p>
            </title>
            <aug>
               <au>
                  <snm>Baluja</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <publisher>Tech Rep CMU-CS-94-163, Carnegie Mellon University, Pittsburgh, PA</publisher>
            <pubdate>1994</pubdate>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The compact genetic algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Harik</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Lobo</snm>
                  <fnm>FG</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>DE</fnm>
               </au>
            </aug>
            <source>IEEE Transactions on Evolutionary Computation</source>
            <pubdate>1999</pubdate>
            <volume>3</volume>
            <issue>4</issue>
            <fpage>287</fpage>
            <lpage>297</lpage>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Extending population-based incremental learning to continuous search spaces</p>
            </title>
            <aug>
               <au>
                  <snm>Sebag</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ducoulombier</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Parallel Problem Solving from Nature &#8211; PPSN V</source>
            <pubdate>1998</pubdate>
            <fpage>418</fpage>
            <lpage>427</lpage>
         </bibl>
         <bibl id="B12">
            <title>
               <p>MIMIC: Finding optima by estimating probability densities</p>
            </title>
            <aug>
               <au>
                  <snm>De Bonet</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Isbell</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Viola</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Advances in Neural Information Processing Systems</source>
            <publisher>The MIT Press</publisher>
            <editor>Mozer MC, Jordan MI, Petsche T</editor>
            <pubdate>1997</pubdate>
            <volume>9</volume>
            <fpage>424</fpage>
            <lpage>430</lpage>
         </bibl>
         <bibl id="B13">
            <title>
               <p>The bivariate marginal distribution algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>M&#252;hlenbein</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Advances in Soft Computing &#8211; Engineering Design and Manufacturing</source>
            <publisher>London: Springer-Verlag</publisher>
            <editor>Roy R, Furuhashi T, Chawdhry PK</editor>
            <pubdate>1999</pubdate>
            <fpage>521</fpage>
            <lpage>535</lpage>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Using optimal dependency-trees for combinatorial optimization: Learning the structure of the search space</p>
            </title>
            <aug>
               <au>
                  <snm>Baluja</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proceedings of the 14th International Conference on Machine Learning</source>
            <pubdate>1997</pubdate>
            <fpage>30</fpage>
            <lpage>38</lpage>
         </bibl>
         <bibl id="B15">
            <title>
               <p>The edge incident model</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ponce de Le&#243;n</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ochoa</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proceedings of the Second Symposium on Artificial Intelligence (CIMAF-99)</source>
            <pubdate>1999</pubdate>
            <fpage>352</fpage>
            <lpage>359</lpage>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Schemata, distributions and graphical models in evolutionary optimization</p>
            </title>
            <aug>
               <au>
                  <snm>M&#252;hlenbein</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Mahnig</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ochoa</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Journal of Heuristics</source>
            <pubdate>1999</pubdate>
            <volume>5</volume>
            <issue>2</issue>
            <fpage>213</fpage>
            <lpage>247</lpage>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Global optimization using Bayesian networks</p>
            </title>
            <aug>
               <au>
                  <snm>Etxeberria</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Proceedings of the Second Symposium on Artificial Intelligence (CIMAF-99)</source>
            <pubdate>1999</pubdate>
            <fpage>151</fpage>
            <lpage>173</lpage>
         </bibl>
         <bibl id="B18">
            <title>
               <p>BOA: The Bayesian optimization algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Cant&#250;-Paz</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Evolutionary Computation</source>
            <pubdate>2000</pubdate>
            <volume>8</volume>
            <issue>3</issue>
            <fpage>311</fpage>
            <lpage>340</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11001554</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>MARLEDA: Effective Distribution Estimation Through Markov Random Fields</p>
            </title>
            <aug>
               <au>
                  <snm>Alden</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>PhD thesis</source>
            <publisher>Faculty of the Graduate School, University of Texas at Austin, USA</publisher>
            <pubdate>2007</pubdate>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Optimization by estimation of distribution with DEUM framework based on Markov random fields</p>
            </title>
            <aug>
               <au>
                  <snm>Shakya</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>McCall</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>International Journal of Automation and Computing</source>
            <pubdate>2007</pubdate>
            <volume>4</volume>
            <issue>3</issue>
            <fpage>262</fpage>
            <lpage>272</lpage>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Estimation of distribution algorithms with Kikuchi approximations</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Evolutionary Computation</source>
            <pubdate>2005</pubdate>
            <volume>13</volume>
            <fpage>67</fpage>
            <lpage>97</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15901427</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>EDNA: Estimation of dependency networks algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>G&#225;mez</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Mateo</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Puerta</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Bio-inspired Modeling of Cognitive Tasks, Second International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC. Lecture Notes in Computer Science</source>
            <editor>Mira J, Alvarez JR</editor>
            <pubdate>2007</pubdate>
            <volume>4527</volume>
            <fpage>427</fpage>
            <lpage>436</lpage>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Stochastic relaxation, Gibbs distributions, and Bayesian restoration of images</p>
            </title>
            <aug>
               <au>
                  <snm>Geman</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Geman</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
            <pubdate>1984</pubdate>
            <issue>6</issue>
            <fpage>721</fpage>
            <lpage>741</lpage>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Evolutionary synthesis of Bayesian networks for optimization</p>
            </title>
            <aug>
               <au>
                  <snm>M&#252;hlenbein</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Mahnig</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Advances in Evolutionary Synthesis of Intelligent Agents</source>
            <publisher>MIT Press</publisher>
            <editor>Patel M, Honavar V, Balakrishnan K</editor>
            <pubdate>2001</pubdate>
            <fpage>429</fpage>
            <lpage>455</lpage>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Factorized distribution algorithms using Bayesian networks of bounded complexity</p>
            </title>
            <aug>
               <au>
                  <snm>Ochoa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>M&#252;hlenbein</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Soto</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proceedings of the Genetic and Evolutionary Computation Conference GECCO-2000</source>
            <pubdate>2000</pubdate>
            <fpage>212</fpage>
            <lpage>215</lpage>
         </bibl>
         <bibl id="B26">
            <title>
               <p>A factorized distribution algorithm using single connected Bayesian networks</p>
            </title>
            <aug>
               <au>
                  <snm>Ochoa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>M&#252;hlenbein</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Soto</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Parallel Problem Solving from Nature &#8211; PPSN VI 6th International Conference</source>
            <publisher>Springer Verlag</publisher>
            <editor>Schoenauer M, Deb K, Rudolph G, Yao X, Lutton E, Merelo JJ, Schwefel H</editor>
            <pubdate>2000</pubdate>
            <fpage>787</fpage>
            <lpage>796</lpage>
         </bibl>
         <bibl id="B27">
            <aug>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sastry</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cant&#250;-Paz</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <cnm>Eds</cnm>
               </au>
            </aug>
            <source>Scalable Optimization via Probabilistic Modeling: From Algorithms to Applications</source>
            <publisher>Studies in Computational Intelligence, Springer</publisher>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Inexact Graph Matching Using Estimation of Distribution Algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Bengoetxea</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>PhD thesis</source>
            <publisher>Ecole Nationale Sup&#233;rieure des T&#233;l&#233;communications</publisher>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Analyzing probabilistic models in hierarchical BOA on traps and spin glasses</p>
            </title>
            <aug>
               <au>
                  <snm>Hauschild</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lima</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Sastry</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Proceedings of the Genetic and Evolutionary Computation Conference GECCO-2007</source>
            <pubdate>2007</pubdate>
            <volume>I</volume>
            <fpage>523</fpage>
            <lpage>530</lpage>
         </bibl>
         <bibl id="B30">
            <title>
               <p>The impact of probabilistic learning algorithms in EDAs based on Bayesian networks</p>
            </title>
            <aug>
               <au>
                  <snm>Echegoyen</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Linkage in Evolutionary Computation</source>
            <publisher>Studies in Computational Intelligence</publisher>
            <pubdate>2008</pubdate>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Using previous models to bias structural learning in the hierarchical BOA</p>
            </title>
            <aug>
               <au>
                  <snm>Hauschild</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pelikan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sastry</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>DE</fnm>
               </au>
            </aug>
            <publisher>MEDAL Report No. 2008003, Missouri Estimation of Distribution Algorithms Laboratory (MEDAL)</publisher>
            <pubdate>2008</pubdate>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Current methods of gene prediction, their strengths and weaknesses</p>
            </title>
            <aug>
               <au>
                  <snm>Math&#233;</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Sagot</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schiex</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Rouz&#233;</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>19</issue>
            <fpage>4103</fpage>
            <lpage>4117</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">140543</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364589</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <aug>
               <au>
                  <snm>Majoros</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Methods for Computational Gene Prediction</source>
            <publisher>Cambridge University Press</publisher>
            <pubdate>2007</pubdate>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Toward integrating feature selection algorithms for classification and clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>IEEE Transactions on Knowledge and Data Engineering</source>
            <pubdate>2005</pubdate>
            <volume>17</volume>
            <issue>4</issue>
            <fpage>491</fpage>
            <lpage>502</lpage>
         </bibl>
         <bibl id="B35">
            <title>
               <p>A review of feature selection techniques in bioinformatics</p>
            </title>
            <aug>
               <au>
                  <snm>Saeys</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Inza</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <issue>19</issue>
            <fpage>2507</fpage>
            <lpage>2517</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17720704</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Feature subset selection by Bayesian networks based optimization</p>
            </title>
            <aug>
               <au>
                  <snm>Inza</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Etxeberria</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sierra</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Artificial Intelligence</source>
            <pubdate>1999</pubdate>
            <volume>27</volume>
            <fpage>143</fpage>
            <lpage>164</lpage>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Feature subset selection by genetic algorithms and estimation of distribution algorithms &#8211; A case study in the survival of cirrhotic patients treated with TIPS</p>
            </title>
            <aug>
               <au>
                  <snm>Inza</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Merino</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Quiroga</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sierra</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Girala</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Artificial Intelligence in Medicine</source>
            <pubdate>2001</pubdate>
            <volume>23</volume>
            <issue>2</issue>
            <fpage>187</fpage>
            <lpage>205</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11583925</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Fast feature selection using a simple estimation of distribution algorithm: A case study on splice site prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Saeys</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Degroeve</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Aeyels</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Peer</snm>
                  <mnm>Van de</mnm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Rouz&#233;</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 2</issue>
            <fpage>179</fpage>
            <lpage>188</lpage>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Feature ranking using an EDA-based wrapper approach</p>
            </title>
            <aug>
               <au>
                  <snm>Saeys</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Degroeve</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Peer</snm>
                  <mnm>Van de</mnm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Towards a New Evolutionary Computation: Advances on Estimation of Distribution Algorithms</source>
            <publisher>Springer</publisher>
            <pubdate>2006</pubdate>
            <fpage>243</fpage>
            <lpage>257</lpage>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Feature Selection for Classification of Nucleic Acid Sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Saeys</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>PhD thesis</source>
            <publisher>Ghent University, Belgium</publisher>
            <pubdate>2004</pubdate>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Feature selection for splice site prediction: A new method using EDA-based feature ranking</p>
            </title>
            <aug>
               <au>
                  <snm>Saeys</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Degroeve</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Aeyels</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Rouz&#233;</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Peer</snm>
                  <mnm>Van de</mnm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>64</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">421631</pubid>
                  <pubid idtype="pmpid" link="fulltext">15154966</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring</p>
            </title>
            <aug>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gaasenbeek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Coller</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Loh</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Downing</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Caligiuri</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Bloomfield</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>531</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10521349</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Gene selection for cancer classification using wrapper approaches</p>
            </title>
            <aug>
               <au>
                  <snm>Blanco</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Inza</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Sierra</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>International Journal of Pattern Recognition and Artificial Intelligence</source>
            <pubdate>2004</pubdate>
            <volume>18</volume>
            <issue>8</issue>
            <fpage>1373</fpage>
            <lpage>1390</lpage>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Identification of informative genes for molecular classification using probabilistic model building genetic algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Paul</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Iba</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proceedings of the Genetic and Evolutionary Computation Conference GECCO-2004. Lecture Notes in Computer Science 3102</source>
            <pubdate>2004</pubdate>
            <fpage>414</fpage>
            <lpage>425</lpage>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Gene selection for classification of cancers using probabilistic model building genetic algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Paul</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Iba</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>BioSystems</source>
            <pubdate>2005</pubdate>
            <volume>82</volume>
            <issue>3</issue>
            <fpage>208</fpage>
            <lpage>225</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16112804</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Estimation of distribution algorithms as logistic regression regularizers of microarray classifiers</p>
            </title>
            <aug>
               <au>
                  <snm>Bielza</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Robles</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Methods of Information in Medicine</source>
            <pubdate>2008</pubdate>
            <inpress/>
         </bibl>
         <bibl id="B47">
            <aug>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>The Elements of Statistical Learning: Data Mining, Inference, and Prediction</source>
            <publisher>Springer-Verlag</publisher>
            <pubdate>2001</pubdate>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Clustering gene expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Ben-Dor</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shamir</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Yakhini</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Journal of Computational Biology</source>
            <pubdate>1999</pubdate>
            <volume>6</volume>
            <issue>3/4</issue>
            <fpage>281</fpage>
            <lpage>297</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10582567</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Unsupervised learning of Bayesian networks via estimation of distribution algorithms: an application to gene expression data clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Pe&#241;a</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems</source>
            <pubdate>2004</pubdate>
            <volume>12</volume>
            <fpage>63</fpage>
            <lpage>82</lpage>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Evolutionary algorithms for finding interpretable patterns in gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Cano</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Blanco</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Garc&#237;a</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>L&#243;pez</snm>
                  <fnm>FJ</fnm>
               </au>
            </aug>
            <source>International Journal on Computer Science and Information Systems</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <issue>2</issue>
            <fpage>88</fpage>
            <lpage>99</lpage>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Problems in the analysis of survey data, and a proposal</p>
            </title>
            <aug>
               <au>
                  <snm>Morgan</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sonquist</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Journal of the American Statistical Association</source>
            <pubdate>1963</pubdate>
            <volume>58</volume>
            <fpage>415</fpage>
            <lpage>434</lpage>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Biclustering of expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Cheng</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology</source>
            <publisher>AAAI Press</publisher>
            <pubdate>2000</pubdate>
            <fpage>93</fpage>
            <lpage>103</lpage>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Obtaining biclusters in microarrays with population-based heuristics</p>
            </title>
            <aug>
               <au>
                  <snm>Palacios</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Pelta</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Blanco</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Evo Workshops, Springer</source>
            <pubdate>2006</pubdate>
            <fpage>115</fpage>
            <lpage>126</lpage>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Detecting reliable gene interactions by a hierarchy of Bayesian network classifiers</p>
            </title>
            <aug>
               <au>
                  <snm>Arma&#241;anzas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Inza</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Comput Methods Programs Biomed</source>
            <pubdate>2008</pubdate>
            <volume>91</volume>
            <issue>2</issue>
            <fpage>110</fpage>
            <lpage>121</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18433926</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Inducing pairwise gene interactions from time series data by EDA based Bayesian network</p>
            </title>
            <aug>
               <au>
                  <snm>Dai</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Conf Proc IEEE Eng Med Biol Soc</source>
            <pubdate>2005</pubdate>
            <volume>7</volume>
            <fpage>7746</fpage>
            <lpage>7749</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17282077</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Protein design concepts</p>
            </title>
            <aug>
               <au>
                  <snm>Steipe</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>The Encyclopedia of Computational Chemistry</source>
            <publisher>Chichester: John Wiley &amp; Sons</publisher>
            <editor>Schleyer PVR, Allinger NL, Clark T, Gasteiger J, Kollman PA, Schaefer III HF, Schreiner PR</editor>
            <pubdate>1998</pubdate>
            <fpage>2168</fpage>
            <lpage>2185</lpage>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Automated alphabet reduction method with evolutionary algorithms for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Bacardit</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stout</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hirst</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Sastry</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Llor&#224;</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Krasnogor</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Proceedings of the Genetic and Evolutionary Computation Conference GECCO-2007</source>
            <pubdate>2007</pubdate>
            <volume>I</volume>
            <fpage>346</fpage>
            <lpage>353</lpage>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Protein folding in 2-dimensional lattices with estimation of distribution algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Proceedings of the First International Symposium on Biological and Medical Data Analysis, Lecture Notes in Computer Science</source>
            <publisher>Barcelona: Springer Verlag</publisher>
            <pubdate>2004</pubdate>
            <volume>3337</volume>
            <fpage>388</fpage>
            <lpage>398</lpage>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Advances in Probabilistic Graphical Models for Optimization and Learning Applications in Protein Modelling</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>PhD thesis</source>
            <publisher>University of the Basque Country</publisher>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Protein folding in simplified models with estimation of distribution algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>IEEE Transactions on Evolutionary Computation</source>
            <pubdate>2008</pubdate>
            <volume>12</volume>
            <issue>4</issue>
            <fpage>418</fpage>
            <lpage>438</lpage>
         </bibl>
         <bibl id="B61">
            <title>
               <p>ENPDA: An evolutionary structure-based de novo peptide design algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Belda</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Madurga</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Llor&#224;</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Martinell</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tarrag&#243;</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Piqueras</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nicol&#225;s</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Giralt</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Journal of Computer-Aided Molecular Design</source>
            <pubdate>2005</pubdate>
            <volume>19</volume>
            <issue>8</issue>
            <fpage>585</fpage>
            <lpage>601</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16267689</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Side chain placement using estimation of distribution algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Artificial Intelligence in Medicine</source>
            <pubdate>2007</pubdate>
            <volume>39</volume>
            <fpage>49</fpage>
            <lpage>63</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16854574</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Combining variable neighborhood search and estimation of distribution algorithms in the protein side chain placement problem</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Journal of Heuristics</source>
            <pubdate>2007</pubdate>
            <inpress/>
         </bibl>
         <bibl id="B64">
            <title>
               <p>The role of a priori information in the minimization of contact potentials by means of estimation of distribution algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Larra&#241;aga</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lozano</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Proceedings of the Fifth European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics, Lecture Notes in Computer Science</source>
            <editor>Marchiori E, Moore JH, Rajapakse JC</editor>
            <pubdate>2007</pubdate>
            <volume>4447</volume>
            <fpage>247</fpage>
            <lpage>257</lpage>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Theory for the folding and stability of globular proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Dill</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1985</pubdate>
            <volume>24</volume>
            <issue>6</issue>
            <fpage>1501</fpage>
            <lpage>1509</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3986190</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>The evolutionary landscape of functional model proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Hirst</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>Protein Engineering</source>
            <pubdate>1999</pubdate>
            <volume>12</volume>
            <fpage>721</fpage>
            <lpage>726</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10506281</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>The mixture of trees factorized distribution algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Santana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ochoa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Soto</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Proceedings of the Genetic and Evolutionary Computation Conference GECCO-2001</source>
            <publisher>San Francisco, CA: Morgan Kaufmann Publishers</publisher>
            <editor>Spector L, Goodman E, Wu A, Langdon W, Voigt H, Gen M, Sen S, Dorigo M, Pezeshk S, Garzon M, Burke E</editor>
            <pubdate>2001</pubdate>
            <fpage>543</fpage>
            <lpage>550</lpage>
         </bibl>
         <bibl id="B68">
            <title>
               <p>A variable neighborhood algorithm &#8211; a new metaheuristic for combinatorial optimization</p>
            </title>
            <aug>
               <au>
                  <snm>Mladenovi&#263;</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Abstracts of Papers Presented at Optimization Days. Montr&#233;al</source>
            <pubdate>1995</pubdate>
            <fpage>112</fpage>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Linkage learning via probabilistic modeling in the EcGA</p>
            </title>
            <aug>
               <au>
                  <snm>Harik</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Lobo</snm>
                  <fnm>FG</fnm>
               </au>
               <au>
                  <snm>Sastry</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Scalable Optimization via Probabilistic Modeling: From Algorithms to Applications, Studies in Computational Intelligence</source>
            <publisher>Springer-Verlag</publisher>
            <editor>Pelikan M, Sastry K, Cant&#250;-Paz E</editor>
            <pubdate>2006</pubdate>
            <fpage>39</fpage>
            <lpage>62</lpage>
         </bibl>
      </refgrp>
   </bm>
</art>

