<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1756-0381-1-4</ui>
   <ji>1756-0381</ji>
   <fm>
      <dochead>Methodology</dochead>
      <bibl>
         <title>
            <p>Uncovering mechanisms of transcriptional regulations by systematic mining of cis regulatory elements with gene expression profiles</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Ma</snm>
               <fnm>Qicheng</fnm>
               <insr iid="I1"/>
               <email>Qicheng.Ma@novartis.com</email>
            </au>
            <au id="A2">
               <snm>Chirn</snm>
               <fnm>Gung-Wei</fnm>
               <insr iid="I1"/>
               <email>Gung-Wei.Chirn@novartis.com</email>
            </au>
            <au id="A3">
               <snm>Szustakowski</snm>
               <mi>D</mi>
               <fnm>Joseph</fnm>
               <insr iid="I1"/>
               <email>Joseph.Szustakowski@novartis.com</email>
            </au>
            <au id="A4">
               <snm>Bakhtiarova</snm>
               <fnm>Adel</fnm>
               <insr iid="I2"/>
               <email>Adel.Bakhtiarova@novartis.com</email>
            </au>
            <au id="A5">
               <snm>Kosinski</snm>
               <mi>A</mi>
               <fnm>Penelope</fnm>
               <insr iid="I2"/>
               <email>Penny.Kosinski@novartis.com</email>
            </au>
            <au id="A6">
               <snm>Kemp</snm>
               <fnm>Daniel</fnm>
               <insr iid="I2"/>
               <email>Daniel.Kemp@novartis.com</email>
            </au>
            <au id="A7">
               <snm>Nirmala</snm>
               <fnm>Nanguneri</fnm>
               <insr iid="I1"/>
               <email>Nanguneri.Nirmala@novartis.com</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Developmental and Molecular Pathways, Novartis Institutes For Biomedical Research Inc, 250 Massachusetts Avenue, Cambridge, MA 02139, USA</p>
            </ins>
            <ins id="I2">
               <p>Cardiovascular and Metabolism Disease Area, Novartis Institutes For Biomedical Research Inc, 250 Massachusetts Avenue, Cambridge, MA 02139, USA</p>
            </ins>
         </insg>
         <source>BioData Mining</source>
         <issn>1756-0381</issn>
         <pubdate>2008</pubdate>
         <volume>1</volume>
         <issue>1</issue>
         <fpage>4</fpage>
         <url>http://www.biodatamining.org/content/1/1/4</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18822150</pubid>
               <pubid idtype="doi">10.1186/1756-0381-1-4</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>11</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>17</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>17</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Ma et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Contrary to the traditional biology approach, where the expression patterns of a handful of genes are studied at a time, microarray experiments enable biologists to study the expression patterns of many genes simultaneously from gene expression profile data and decipher the underlying hidden biological mechanism from the observed gene expression changes. While the statistical significance of the gene expression data can be deduced by various methods, the biological interpretation of the data presents a challenge.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>A method, called CisTransMine, is proposed to help infer the underlying biological mechanisms for the observed gene expression changes in microarray experiments. Specifically, this method predicts potential cis-regulatory elements in promoter regions which could regulate gene expression changes. This approach builds on the MotifADE method published in 2004 and extends it with two modifications: up-regulated genes and down-regulated genes are tested separately, and tests have been implemented to identify combinations of transcription factors that work synergistically. The method has been applied to a genome-wide expression dataset intended to study myogenesis in a mouse C2C12 cell differentiation model. The results shown here both confirm prior biological knowledge and facilitate the discovery of new biological insights.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The results indicate that CisTransMine is a robust method for uncovering hidden mechanisms of transcriptional regulation from gene expression data.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>High-throughput microarray experiments have modernized biological experiments by enabling measurements of expression levels for genes on the genome scale under different conditions. Hundreds or thousands of genes may be differentially expressed between conditions due to the effects of a variety of transcriptional factors or their co-factors. Interpreting these changes in a biological context is challenging. Understanding the transcription regulation mechanisms between transcriptional factors and their target genes is one of the key ways to formulate hypotheses about the root causes of the observed changes.</p>
         <p>Unveiling mechanisms of transcription regulation is an active bioinformatics research area, and several approaches have been proposed. Bayesian network approaches have been applied <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> to integrate motif discovery in promoters with the analysis of gene expression data. Some approaches <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> split motifs and gene expression values of regulators to build a decision tree based on the combination of expression ratios of transcription factors and presence/absence of the motifs. Yet other approaches <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp> fit gene expression data to a linear model using weights depending on whether a transcriptional factor is an inducer or repressor. Mootha <it>et al</it>. <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> use a two-tailed non-parametric Mann-Whitney (Wilcoxon) rank sum test to determine the significance of motifs in promoter regions. The MotifADE method <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> assumes that if up-regulated or down-regulated genes which contain certain transcriptional factor binding sites are co-ordinately regulated, changes in their expression levels could be explained by those transcriptional factors. On the other hand, if genes which contain the same transcriptional factor binding sites are not co-ordinately regulated, there may not be any association between genes and transcriptional factors.
In particular, the MotifADE algorithm works in three steps: (1) rank genes in descending order of differential expression between two conditions, using the signal-to-noise ratio as the difference metric (the signal-to-noise ratio is used instead of the fold change in expression level because it also takes the standard deviation into account); (2) for each motif, identify the group of genes whose promoter regions contain the motif; and finally (3) apply the two-tailed non-parametric Mann-Whitney rank sum test to determine whether these genes tend to be enriched toward the top or bottom of the ranked list (indicating association) or tend to be randomly distributed on the list (indicating no association).</p>
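         <p>The three MotifADE steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the published implementation: the function names are hypothetical, the signal-to-noise ratio is assumed to be the difference of the condition means divided by the sum of their standard deviations, and the rank sum p-value uses the normal approximation without tie or continuity corrections.</p>
         <p><![CDATA[
```python
import math
from statistics import mean, stdev


def signal_to_noise(cond_a, cond_b):
    """Step 1 metric (assumed form): (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(cond_a) - mean(cond_b)) / (stdev(cond_a) + stdev(cond_b))


def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_p_two_sided(scores, has_motif):
    """Steps 2 and 3: two-sided Mann-Whitney p-value for motif genes vs. the rest.

    scores    -- one differential expression score per gene
    has_motif -- one boolean per gene, True if its promoter contains the motif
    Ranking in ascending or descending order gives the same two-sided p-value.
    """
    ranks = average_ranks(scores)
    n1 = sum(has_motif)
    n2 = len(scores) - n1
    rank_sum = sum(r for r, m in zip(ranks, has_motif) if m)
    u1 = rank_sum - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))
```
]]></p>
         <p>A small p-value means that the motif-containing genes cluster toward one end of the ranked list; a p-value near 1 means that they are spread evenly through it.</p>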
         <p>In our hands, the two-tailed non-parametric Mann-Whitney rank sum test used by the MotifADE method cannot detect the significance of transcriptional factors that induce the transcription of some genes and repress the transcription of other genes at the same time (see discussion). We have therefore extended the MotifADE method to investigate up-regulated and down-regulated genes separately, since a transcriptional factor may simultaneously enhance the transcription of certain genes and inhibit the transcription of other genes. We have also introduced a method to identify the synergistic effects between pairs of transcriptional factors. The CisTransMine method is applied to a mouse C2C12 differentiation dataset <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, where it implicates several known myogenic and cell cycle factors as well as a novel transcriptional factor binding site which regulates known target genes. These results demonstrate that the CisTransMine method is an important tool for discovering unknown transcription regulation mechanisms, thus helping to extend biological knowledge.</p>
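         <p>A toy calculation illustrates why the combined two-tailed test can miss a factor that acts in both directions, and how splitting by direction recovers the signal. The sketch below is an assumption-laden illustration, not the CisTransMine code: the function names are hypothetical, the sign of the score is taken to define up- and down-regulation, each direction is tested with a one-sided rank sum test on the magnitude of the change, and p-values come from the normal approximation.</p>
         <p><![CDATA[
```python
import math


def u_z_score(group, rest):
    """z-score of the Mann-Whitney U statistic for `group` versus `rest`."""
    # U counts the (group, rest) pairs where the group value is larger;
    # tied pairs count one half.
    u = sum((g > r) + 0.5 * (g == r) for g in group for r in rest)
    n1, n2 = len(group), len(rest)
    return (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)


def two_sided_p(z):
    """Two-sided p-value under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


def upper_tail_p(z):
    """One-sided p-value for the group tending to be larger than the rest."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def split_test(scores, has_motif):
    """Test up- and down-regulated genes separately (assumed design).

    Within each direction, ask whether motif genes change more strongly
    than the other genes changing in the same direction.
    """
    result = {}
    for name, keep in (("up", lambda s: s > 0), ("down", lambda s: s < 0)):
        group = [abs(s) for s, m in zip(scores, has_motif) if m and keep(s)]
        rest = [abs(s) for s, m in zip(scores, has_motif) if not m and keep(s)]
        if not group or not rest:
            result[name] = None  # nothing to compare in this direction
        else:
            result[name] = upper_tail_p(u_z_score(group, rest))
    return result


# Toy data: 100 genes with scores -50..-1 and 1..50. The motif sits in the
# five most up-regulated and the five most down-regulated genes.
scores = [float(s) for s in range(-50, 51) if s != 0]
has_motif = [abs(s) >= 46 for s in scores]

motif_scores = [s for s, m in zip(scores, has_motif) if m]
other_scores = [s for s, m in zip(scores, has_motif) if not m]
combined_p = two_sided_p(u_z_score(motif_scores, other_scores))
split_p = split_test(scores, has_motif)
```
]]></p>
         <p>In this symmetric toy example the motif genes at the top and bottom of the list cancel each other in the combined test, so <it>combined_p</it> carries no evidence of association, whereas both entries of <it>split_p</it> are small.</p>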
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Results for known transcriptional factors</p>
            </st>
            <p>We use the mouse C2C12 cell differentiation dataset as a test case <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. In this experiment, mouse C2C12 myoblast cells were induced to differentiate from myoblasts to myotubes in order to model late stage myogenesis. Cells were cultured in 6-well plates. Differentiation of the C2C12 myoblasts was induced at Day 0, when cells were confluent, by reducing the serum concentration in the wells to 3% v/v. Upon induction of differentiation these mononucleate cells exited the cell cycle and fused to form myotubes. Cells were lysed for RNA preparation. Expression levels were measured at eight time points (days -1, 0, 0.25, 1, 2, 3, 4 and 5 relative to induction), with three replicates per time point. The goal is to identify genes involved in myogenesis. Figure <figr fid="F1">1</figr> shows gene expression profiles across all time points. It can be observed that the major switch in the expression profiles occurs between Day 1 and Day 2.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Gene expression profiles across all time points</p>
               </caption>
               <text>
                  <p><b>Gene expression profiles across all time points</b>. Gene expression profiles across eight time points. It can be observed that major changes occur from Day 1 to Day 2.</p>
               </text>
               <graphic file="1756-0381-1-4-1"/>
            </fig>
            <p>The CisTransMine algorithm was run on this dataset comparing expression profiles between different time points. Table <tblr tid="T1">1</tblr> shows the top 15 transcriptional factors (TFs) among up-regulated genes in muscle differentiation between the Day 1 and Day 2 time points. The top TF among the up-regulated genes is E12, also called E47, which forms heterodimers with MYOD, the second top TF among the up-regulated genes, and is pivotal in controlling muscle transcription <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. Figure <figr fid="F2">2</figr> shows the distribution of moderated t-values in up-regulated genes with the MYOD binding elements in their promoter regions. SRF (serum response factor) is required for skeletal muscle growth and maturation <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. The transcriptional factor C/EBP, which forms heterodimers with C-Jun denoted by CREBP1/CJUN, can activate differentiation-specific genes <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. MEF2, which is implicated in the muscle contraction process <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, is also enriched since the muscle contraction pathway is up-regulated <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Several other top ranked TFs have not been previously linked to muscle and may warrant further investigation into their roles in myogenesis.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Significant transcriptional factors in up-regulated genes from Day 1 to Day 2</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Motif</p>
                     </c>
                     <c ca="center">
                        <p>Occurrence Number</p>
                     </c>
                     <c ca="center">
                        <p>p-value</p>
                     </c>
                     <c ca="center">
                        <p>q-value</p>
                     </c>
                     <c ca="left">
                        <p>Transcription Factors</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RRCAGGTGNCV</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>3.87E-05</p>
                     </c>
                     <c ca="center">
                        <p>0.00274</p>
                     </c>
                     <c ca="left">
                        <p>E12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SRACAGGTGKYG</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>0.000125</p>
                     </c>
                     <c ca="center">
                        <p>0.00399</p>
                     </c>
                     <c ca="left">
                        <p>MYOD</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RSTGACTNMNW</p>
                     </c>
                     <c ca="center">
                        <p>65</p>
                     </c>
                     <c ca="center">
                        <p>0.000253</p>
                     </c>
                     <c ca="center">
                        <p>0.00399</p>
                     </c>
                     <c ca="left">
                        <p>AP1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GGTACAANNTGTYCTK</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>0.000282</p>
                     </c>
                     <c ca="center">
                        <p>0.00399</p>
                     </c>
                     <c ca="left">
                        <p>GRE</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GGACATGCCCGGGCATGTCY</p>
                     </c>
                     <c ca="center">
                        <p>170</p>
                     </c>
                     <c ca="center">
                        <p>0.000306</p>
                     </c>
                     <c ca="center">
                        <p>0.00399</p>
                     </c>
                     <c ca="left">
                        <p>P53</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GGGGCGGGGT</p>
                     </c>
                     <c ca="center">
                        <p>245</p>
                     </c>
                     <c ca="center">
                        <p>0.000338</p>
                     </c>
                     <c ca="center">
                        <p>0.00399</p>
                     </c>
                     <c ca="left">
                        <p>SP1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NNRYCACGTGRYNN</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>0.000422</p>
                     </c>
                     <c ca="center">
                        <p>0.00426</p>
                     </c>
                     <c ca="left">
                        <p>USF</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CTCTAAAAATAACYCY</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.000485</p>
                     </c>
                     <c ca="center">
                        <p>0.00429</p>
                     </c>
                     <c ca="left">
                        <p>MEF2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ATGCCCATATATGGWNNT</p>
                     </c>
                     <c ca="center">
                        <p>67</p>
                     </c>
                     <c ca="center">
                        <p>0.000605</p>
                     </c>
                     <c ca="center">
                        <p>0.00475</p>
                     </c>
                     <c ca="left">
                        <p>SRF</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GAAAAGYGAAASY</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.00178</p>
                     </c>
                     <c ca="center">
                        <p>0.0126</p>
                     </c>
                     <c ca="left">
                        <p>IRF2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AGATADMAGGGA</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>0.0029</p>
                     </c>
                     <c ca="center">
                        <p>0.018</p>
                     </c>
                     <c ca="left">
                        <p>GATA4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CKSNYTAAAAAWRMCY</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.00305</p>
                     </c>
                     <c ca="center">
                        <p>0.018</p>
                     </c>
                     <c ca="left">
                        <p>MMEF2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TGACGTYA</p>
                     </c>
                     <c ca="center">
                        <p>49</p>
                     </c>
                     <c ca="center">
                        <p>0.00389</p>
                     </c>
                     <c ca="center">
                        <p>0.0192</p>
                     </c>
                     <c ca="left">
                        <p>CREBP1/CJUN</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RGCAGSTG</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>0.00398</p>
                     </c>
                     <c ca="center">
                        <p>0.0192</p>
                     </c>
                     <c ca="left">
                        <p>MYOGENIN</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GGGRATTTCC</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>0.0041</p>
                     </c>
                     <c ca="center">
                        <p>0.0192</p>
                     </c>
                     <c ca="left">
                        <p>NFKAPPAB65</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Significant transcriptional factors in up-regulated genes from Day 1 to Day 2. The Motif column shows the consensus binding site sequence for the transcriptional factor. The second column lists the total number of genes containing that transcriptional factor binding site in their promoter regions. The p-value column gives the Mann-Whitney rank sum p-value. The q-value column shows the FDR q-value after multiple testing correction. The Transcription Factors column lists the name of the transcriptional factor which is known to bind to that motif. The total number of up-regulated RefSeq genes with raw expression levels of at least 100 on Day 2 is 3338.</p>
               </tblfn>
            </tbl>
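            <p>The q-value column is described only as a multiple-testing-corrected FDR value; the correction procedure is not named in this section. As an illustration of how such q-values can be derived from a list of p-values, the sketch below uses the Benjamini-Hochberg step-up procedure. The choice of procedure and the function name are assumptions made for the example.</p>
            <p><![CDATA[
```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order.

    Assumed procedure: the i-th smallest of m p-values is scaled by m / i,
    then a running minimum from the largest p-value downwards keeps the
    q-values monotone in the p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q
```
]]></p>
            <p>Because of the running minimum, several motifs with different p-values can share one q-value, a pattern that also appears in the q-value column above.</p>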
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The distribution of moderated t-values for up-regulated genes containing Myod binding elements in the promoter regions</p>
               </caption>
               <text>
                  <p><b>The distribution of moderated t-values for up-regulated genes containing Myod binding elements in the promoter regions</b>. The top histogram shows the distribution of moderated t-values for up-regulated MYOD target genes (also depicted as blue dots in the scatter plot), and the bottom histogram shows the distribution of moderated t-values for all other up-regulated genes (also depicted as grey dots in the scatter plot).</p>
               </text>
               <graphic file="1756-0381-1-4-2"/>
            </fig>
            <p>Table <tblr tid="T2">2</tblr> shows a list of statistically significant transcriptional factors in down-regulated genes from Day 1 to Day 2. It has been previously shown that myogenic differentiation in this model is accompanied by cell cycle arrest that is detectable at the transcript level <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The results described here implicate a number of TFs that might drive the exit from cell cycle. The transcriptional factors E2F1 and MYC, known regulators of the cell cycle process, are the top enriched transcriptional factors among the down-regulated genes, which implicates E2F1 and MYC as drivers of the previously described cell cycle arrest <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The cell cycle checkpoint gene P53 and several known mediators of P53 activity, E2F as well as NFY, are also among the top enriched TFs <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Foxm1, a gene critical for G1/S transition and essential for mitotic progression <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, is also identified by the method. Table <tblr tid="T3">3</tblr> lists significant synergistic transcriptional factors in down-regulated genes from Day 1 to Day 2. The top interacting pair of transcriptional factors is NFKAPPAB65 and MYC. NFKAPPAB subunits are known to interact with the promoter regions of several genes including MYC (identified here in synergy with NFKAPPAB), Cyclin D1, and SKP2. These interactions are dynamic and depend on the phosphorylation states of NFKAPPAB65 as well as the cell cycle phase <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Taken together, these results show that biologically relevant transcription factors involved in muscle differentiation also show statistical significance in the gene expression profiling experiment. Thus, one can use CisTransMine to tease out important regulatory processes that are in play under a given perturbation to a system.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Significant transcriptional factors in down-regulated genes from Day 1 to Day 2</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Motif</p>
                     </c>
                     <c ca="center">
                        <p>Occurrence Number</p>
                     </c>
                     <c ca="center">
                        <p>p-value</p>
                     </c>
                     <c ca="center">
                        <p>q-value</p>
                     </c>
                     <c ca="left">
                        <p>Transcription Factors</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NKTSSCGC</p>
                     </c>
                     <c ca="center">
                        <p>116</p>
                     </c>
                     <c ca="center">
                        <p>4.47E-11</p>
                     </c>
                     <c ca="center">
                        <p>3.97E-09</p>
                     </c>
                     <c ca="left">
                        <p>E2F1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RACCACGTGCTC</p>
                     </c>
                     <c ca="center">
                        <p>351</p>
                     </c>
                     <c ca="center">
                        <p>2.31E-07</p>
                     </c>
                     <c ca="center">
                        <p>1.03E-05</p>
                     </c>
                     <c ca="left">
                        <p>MYC/MAX</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GGGGCGGGGT</p>
                     </c>
                     <c ca="center">
                        <p>253</p>
                     </c>
                     <c ca="center">
                        <p>2.64E-06</p>
                     </c>
                     <c ca="center">
                        <p>7.82E-05</p>
                     </c>
                     <c ca="left">
                        <p>SP1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ARATKGAST</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>6.73E-06</p>
                     </c>
                     <c ca="center">
                        <p>0.000149</p>
                     </c>
                     <c ca="left">
                        <p>FOXM1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TRRCCAATSRN</p>
                     </c>
                     <c ca="center">
                        <p>95</p>
                     </c>
                     <c ca="center">
                        <p>1.56E-05</p>
                     </c>
                     <c ca="center">
                        <p>0.000278</p>
                     </c>
                     <c ca="left">
                        <p>NFY</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NNCCACGTGNNN</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.000292</p>
                     </c>
                     <c ca="center">
                        <p>0.00415</p>
                     </c>
                     <c ca="left">
                        <p>NMYC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GGACATGCCCGGGCATGTCY</p>
                     </c>
                     <c ca="center">
                        <p>205</p>
                     </c>
                     <c ca="center">
                        <p>0.000327</p>
                     </c>
                     <c ca="center">
                        <p>0.00415</p>
                     </c>
                     <c ca="left">
                        <p>P53</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TGACGTYA</p>
                     </c>
                     <c ca="center">
                        <p>65</p>
                     </c>
                     <c ca="center">
                        <p>0.000496</p>
                     </c>
                     <c ca="center">
                        <p>0.00551</p>
                     </c>
                     <c ca="left">
                        <p>CREBP1/CJUN</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NBTGGGTGGTCN</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.00142</p>
                     </c>
                     <c ca="center">
                        <p>0.014</p>
                     </c>
                     <c ca="left">
                        <p>GLI</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NNNNNCCATNTWNNNWN</p>
                     </c>
                     <c ca="center">
                        <p>64</p>
                     </c>
                     <c ca="center">
                        <p>0.00248</p>
                     </c>
                     <c ca="center">
                        <p>0.02</p>
                     </c>
                     <c ca="left">
                        <p>YY1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GCHCDAMCCAG</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.00916</p>
                     </c>
                     <c ca="center">
                        <p>0.0592</p>
                     </c>
                     <c ca="left">
                        <p>CP2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TGCTGAGTCAY</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.00945</p>
                     </c>
                     <c ca="center">
                        <p>0.0592</p>
                     </c>
                     <c ca="left">
                        <p>NFE2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TCATGTGN</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0.0124</p>
                     </c>
                     <c ca="center">
                        <p>0.0675</p>
                     </c>
                     <c ca="left">
                        <p>TFE</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TGACGTMA</p>
                     </c>
                     <c ca="center">
                        <p>90</p>
                     </c>
                     <c ca="center">
                        <p>0.0136</p>
                     </c>
                     <c ca="center">
                        <p>0.0675</p>
                     </c>
                     <c ca="left">
                        <p>CREB</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>TWSGCGCGAAAAYKR</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.0141</p>
                     </c>
                     <c ca="center">
                        <p>0.0675</p>
                     </c>
                     <c ca="left">
                        <p>E2F</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Significant transcriptional factors in down-regulated genes from Day 1 to Day 2. The Motif column shows the consensus binding site sequence for the transcriptional factor. The second column lists the total number of genes containing that motif in the promoter regions. The p-value column illustrates the Mann-Whitney rank sum p-value. The q-value column shows the multiple testing corrected FDR q-value. The last column lists the name of the transcription factor. The total number of down-regulated Refseq genes with raw expression levels at least 100 in Day 1 is 3728.</p>
               </tblfn>
            </tbl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Significant synergistic transcriptional factors in down-regulated genes from Day 1 to Day 2</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Motif</p>
                     </c>
                     <c ca="center">
                        <p>Occurrence Number</p>
                     </c>
                     <c ca="center">
                        <p>p-value</p>
                     </c>
                     <c ca="center">
                        <p>q-value</p>
                     </c>
                     <c ca="left">
                        <p>Transcription Factors</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RACCACGTGCTC_GGGRATTTCC</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>0.00105</p>
                     </c>
                     <c ca="center">
                        <p>0.0122</p>
                     </c>
                     <c ca="left">
                        <p>CMYC NFKAPPAB65</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>HWAAATCAATAW_TRRCCAATSRN</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.0028</p>
                     </c>
                     <c ca="center">
                        <p>0.0122</p>
                     </c>
                     <c ca="left">
                        <p>HNF6 NFY</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GCCNNNRGS_ACWTCCK</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.00304</p>
                     </c>
                     <c ca="center">
                        <p>0.0122</p>
                     </c>
                     <c ca="left">
                        <p>AP2ALPHA PEA3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AGWACATNWTGTTCT_SGGRNTTTCC</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.00523</p>
                     </c>
                     <c ca="center">
                        <p>0.0157</p>
                     </c>
                     <c ca="left">
                        <p>AR CREL</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AGACNBCNN_ASMCTTGGGSRGGG</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.00859</p>
                     </c>
                     <c ca="center">
                        <p>0.0189</p>
                     </c>
                     <c ca="left">
                        <p>SMAD SP3</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Significant synergistic transcriptional factors in down-regulated genes from Day 1 to Day 2. The Motif column shows the consensus binding site sequence for the transcriptional factor where two motifs are separated by an underscore. The second column lists the total occurrence number of genes containing that motif in the promoter regions. The p-value column illustrates the Mann-Whitney rank sum p-value. The q-value column shows the multiple testing corrected FDR q-value. The last column lists the name of the transcription factors. The total number of down-regulated Refseq genes with raw expression levels at least 100 in Day 1 is 3728.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Results for unknown transcriptional factors</p>
            </st>
            <p>This method was also used to discover novel regulatory elements from this experiment <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The elucidation of novel regulatory motifs in the context of a specific cellular function may reveal new pathways and targetable mechanisms related to disease settings. In this paper, the terms "motifs" and "transcriptional factor binding sites" are used interchangeably. Motifs that emerged as statistically significant potential regulatory elements were screened for functional relevance via luciferase assay. Specifically, in addition to statistical significance, motifs were selected for their occurrence in genes with a known role in myogenic differentiation and in regulated functional pathways such as contractility, cell cycle, and mRNA splicing. The 400 bp DNA sequences surrounding the chosen motifs were analyzed using Transfac for additional transcription factor binding sites whose factors could potentially influence, or form a complex with, the transcription factor that binds the novel motif. Table <tblr tid="T4">4</tblr> lists the details of the tested motifs and the known transcriptional factor binding sites within the 400 bp DNA sequences surrounding them.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Tested novel motifs with mutagenesis</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>Motif</p>
                     </c>
                     <c ca="center">
                        <p>Occurrence </p>
                        <p>number</p>
                     </c>
                     <c ca="center">
                        <p>p-value</p>
                     </c>
                     <c ca="center">
                        <p>Gene</p>
                        <p>symbol</p>
                     </c>
                     <c ca="center">
                        <p>Fold change</p>
                        <p>Ratio</p>
                     </c>
                     <c ca="center">
                        <p>Gene description</p>
                     </c>
                     <c ca="center">
                        <p>Known nearby Transcriptional</p>
                        <p>factor binding sites</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>gcggaggc</p>
                     </c>
                     <c ca="center">
                        <p>1238</p>
                     </c>
                     <c ca="center">
                        <p>2.57E-06</p>
                     </c>
                     <c ca="center">
                        <p>pck2</p>
                     </c>
                     <c ca="center">
                        <p>0.2</p>
                     </c>
                     <c ca="center">
                        <p>Phosphoenol-pyruvate</p>
                        <p>carboxykinase 2</p>
                        <p> (mitochondrial)</p>
                     </c>
                     <c ca="center">
                        <p>Oct-1, TFIIA</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>cgacccgt</p>
                     </c>
                     <c ca="center">
                        <p>95</p>
                     </c>
                     <c ca="center">
                        <p>3.60E-06</p>
                     </c>
                     <c ca="center">
                        <p>myog</p>
                     </c>
                     <c ca="center">
                        <p>5.2</p>
                     </c>
                     <c ca="center">
                        <p>myogenin</p>
                     </c>
                     <c ca="center">
                        <p>SREBP-1, MEF2, MEF3</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Novel motifs tested with mutagenesis and their surrounding known transcriptional factor binding sites.</p>
               </tblfn>
            </tbl>
            <p>To test the selected motifs for regulatory activity using a reporter gene assay, 400 bp sequences were generated by PCR using appropriate primers and cloned, via XhoI restriction sites, into the pGL3 promoter reporter vector to assay their transcriptional activity. This relatively large promoter sequence was used because motif function may require surrounding contextual elements. A 400 bp fragment of the pck2 promoter surrounding the motif GCGGAGGC was cloned into the pGL3 promoter firefly luciferase vector and used to transfect C2C12 myoblasts, along with the pGL4.75 Renilla luciferase vector to control for transfection efficiency. The cells were then split into two plates: cells on one plate were induced to differentiate, and the other plate was maintained as undifferentiated myoblasts. Cells transfected with the pGL3 promoter vector without the construct (control) expressed some reporter gene activity, and reporter activity increased eightfold over the control in cells transfected with the same vector containing the 400 bp pck2 promoter fragment with the motif GCGGAGGC (Figure <figr fid="F3">3A</figr>). To assess the activity specifically mediated by the motif, the sequence was mutated by random nucleotide substitution, generating two mutant sequences, mutant1 (acgctatc) and mutant2 (ctgcacgc). These mutations increased reporter activity beyond that of the wild-type motif/promoter, up to twelvefold compared to control. The potential function of this motif as a negative regulator of gene expression is consistent with the expression pattern of the pck2 gene within the myogenic program.
In contrast, reporter gene activity in C2C12 cells transfected with the pGL4.15 basic vector containing 400 bp of the myogenin promoter with the motif CGACCCGT did not change after mutations were introduced (Figure <figr fid="F3">3B</figr>). Thus, this particular motif was deemed to have no functional role in the myogenin promoter.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Luciferase reporter assay results</p>
               </caption>
               <text>
                  <p><b>Luciferase reporter assay results</b>. Reporter gene assay of the pck2 400 bp fragment containing the GCGGAGGC motif (A) and the myogenin fragment containing the CGACCCGT motif (B). Reporter activity changes upon mutagenesis in the pck2 construct but not in the myogenin construct. In the pck2 assay (A), data were normalized to the corresponding myoblasts or myotubes transfected with the pGL3 promoter vector. Myogenin data (B) were normalized to myoblasts transfected with pGL4.15 containing the myogenin construct, because pGL4.15 alone does not have any basal activity. Data represent at least three replicates &#177; s.e.m. (*, p &lt; 0.05, t-test).</p>
               </text>
               <graphic file="1756-0381-1-4-3"/>
            </fig>
            <p>This experiment demonstrated the potential of this method to successfully identify novel functional motifs. Such an approach may be extended to differential gene expression within a variety of disease-related settings and cell types, with potential relevance to disease pathway discovery.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>In the post-genomics era, there is a sea of biological data, including microarray experimental data. This provides an unprecedented opportunity, and a challenge, to fully decipher the underlying biological system. One aspect of this analysis is to identify significantly enriched pathways in which coordinated but sometimes subtle expression changes are observed among genes <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Though pathway analysis provides a way to see "forests, not individual trees", it cannot address the transcription regulation mechanisms which govern the observed gene expression level changes. Thus, deciphering transcription regulation mechanisms helps characterize the underlying biological process. Different approaches have been proposed to help decipher transcription regulation mechanisms, including Bayesian networks, decision trees, and regression models. In this paper, the CisTransMine method has been implemented to identify transcriptional factors involved in biological processes through the analysis of microarray data.</p>
         <p>The CisTransMine method not only confirms some known biological knowledge but also reveals potentially novel biological insights. Compared to the results generated by the two-tailed non-parametric Mann-Whitney rank sum test, as used by the MotifADE method and shown in Table <tblr tid="T5">5</tblr>, the CisTransMine method additionally identifies the transcriptional factors MYOD, AP1, P53, SP1, USF, IRF2, CREBP1/CJUN, and NFKAPPAB65 from up-regulated genes from Day 1 to Day 2 and the transcriptional factors SP1, P53, CREBP1/CJUN, YY1, CP2, NFE2, and TFE from down-regulated genes. Among these transcriptional factors, P53, SP1, and CREBP1/CJUN are significant in both up-regulated and down-regulated genes from Day 1 to Day 2 and were missed by the two-tailed non-parametric Mann-Whitney rank sum tests. CisTransMine also identifies additional enriched transcriptional factors which are not currently linked to myogenesis (<it>e.g</it>. NMYC). CisTransMine did not identify several transcriptional factors identified by MotifADE, including HNF4ALPHA and EVI1, and also missed the interaction between E12 and MYOD among up-regulated genes from Day 1 to Day 2. Moreover, only 7066 genes were included in these calculations. As additional transcriptional factors and their target genes are discovered, coverage of transcriptional regulation relationships will increase, which should make the predictions more comprehensive.</p>
         <tbl id="T5">
            <title>
               <p>Table 5</p>
            </title>
            <caption>
               <p>Significant transcriptional factors identified by the two-tailed non-parametric Mann-Whitney rank sum tests from Day 1 to Day 2</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="left">
                     <p>Motif</p>
                  </c>
                  <c ca="center">
                     <p>Occurrence Number</p>
                  </c>
                  <c ca="center">
                     <p>p-value</p>
                  </c>
                  <c ca="center">
                     <p>q-value</p>
                  </c>
                  <c ca="left">
                     <p>Transcription Factors</p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NKTSSCGC</p>
                  </c>
                  <c ca="center">
                     <p>161</p>
                  </c>
                  <c ca="center">
                     <p>4.28E-12</p>
                  </c>
                  <c ca="center">
                     <p>7.49E-10</p>
                  </c>
                  <c ca="left">
                     <p>E2F1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RACCACGTGCTC</p>
                  </c>
                  <c ca="center">
                     <p>575</p>
                  </c>
                  <c ca="center">
                     <p>3.92E-08</p>
                  </c>
                  <c ca="center">
                     <p>3.43E-06</p>
                  </c>
                  <c ca="left">
                     <p>MYC/MAX</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ARATKGAST</p>
                  </c>
                  <c ca="center">
                     <p>15</p>
                  </c>
                  <c ca="center">
                     <p>1.97E-06</p>
                  </c>
                  <c ca="center">
                     <p>0.000115</p>
                  </c>
                  <c ca="left">
                     <p>FOXM1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RGCAGSTG</p>
                  </c>
                  <c ca="center">
                     <p>15</p>
                  </c>
                  <c ca="center">
                     <p>4.79E-06</p>
                  </c>
                  <c ca="center">
                     <p>0.00021</p>
                  </c>
                  <c ca="left">
                     <p>MYOGENIN</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>VTGAACTTTGMMB</p>
                  </c>
                  <c ca="center">
                     <p>1217</p>
                  </c>
                  <c ca="center">
                     <p>4.24E-05</p>
                  </c>
                  <c ca="center">
                     <p>0.00149</p>
                  </c>
                  <c ca="left">
                     <p>HNF4ALPHA</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>AGATADMAGGGA</p>
                  </c>
                  <c ca="center">
                     <p>19</p>
                  </c>
                  <c ca="center">
                     <p>0.000158</p>
                  </c>
                  <c ca="center">
                     <p>0.00462</p>
                  </c>
                  <c ca="left">
                     <p>GATA4</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ATGCCCATATATGGWNNT</p>
                  </c>
                  <c ca="center">
                     <p>111</p>
                  </c>
                  <c ca="center">
                     <p>0.000203</p>
                  </c>
                  <c ca="center">
                     <p>0.00507</p>
                  </c>
                  <c ca="left">
                     <p>SRF</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TWSGCGCGAAAAYKR</p>
                  </c>
                  <c ca="center">
                     <p>10</p>
                  </c>
                  <c ca="center">
                     <p>0.000247</p>
                  </c>
                  <c ca="center">
                     <p>0.00514</p>
                  </c>
                  <c ca="left">
                     <p>E2F</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>TRRCCAATSRN</p>
                  </c>
                  <c ca="center">
                     <p>159</p>
                  </c>
                  <c ca="center">
                     <p>0.000264</p>
                  </c>
                  <c ca="center">
                     <p>0.00514</p>
                  </c>
                  <c ca="left">
                     <p>NFY</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NNCCACGTGNNN</p>
                  </c>
                  <c ca="center">
                     <p>15</p>
                  </c>
                  <c ca="center">
                     <p>0.000495</p>
                  </c>
                  <c ca="center">
                     <p>0.00867</p>
                  </c>
                  <c ca="left">
                     <p>NMYC</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>CTCTAAAAATAACYCY</p>
                  </c>
                  <c ca="center">
                     <p>14</p>
                  </c>
                  <c ca="center">
                     <p>0.000618</p>
                  </c>
                  <c ca="center">
                     <p>0.00984</p>
                  </c>
                  <c ca="left">
                     <p>MEF2</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>GGTACAANNTGTYCTK</p>
                  </c>
                  <c ca="center">
                     <p>55</p>
                  </c>
                  <c ca="center">
                     <p>7.00E-04</p>
                  </c>
                  <c ca="center">
                     <p>0.0102</p>
                  </c>
                  <c ca="left">
                     <p>GRE</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>NBTGGGTGGTCN</p>
                  </c>
                  <c ca="center">
                     <p>15</p>
                  </c>
                  <c ca="center">
                     <p>0.00191</p>
                  </c>
                  <c ca="center">
                     <p>0.023</p>
                  </c>
                  <c ca="left">
                     <p>GLI</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RRCAGGTGNCV</p>
                  </c>
                  <c ca="center">
                     <p>27</p>
                  </c>
                  <c ca="center">
                     <p>0.00197</p>
                  </c>
                  <c ca="center">
                     <p>0.023</p>
                  </c>
                  <c ca="left">
                     <p>E12</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>ACAAGATAA</p>
                  </c>
                  <c ca="center">
                     <p>7</p>
                  </c>
                  <c ca="center">
                     <p>0.00269</p>
                  </c>
                  <c ca="center">
                     <p>0.0288</p>
                  </c>
                  <c ca="left">
                     <p>EVI1</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Significant transcriptional factors identified by the two-tailed non-parametric Mann-Whitney rank sum tests from Day 1 to Day 2. The Motif column shows the consensus binding site sequence for the transcriptional factor. The second column lists the total number of genes containing that transcriptional factor binding site in the promoter regions. The p-value column shows the two-tailed Mann-Whitney rank sum p-value. The q-value column shows the multiple testing corrected FDR q-value. The transcriptional factor column lists the name of the transcriptional factor which is known to bind to that motif. The total number of Refseq genes with raw expression levels of at least 100 in either Day 1 or Day 2 is 7066.</p>
            </tblfn>
         </tbl>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In summary, preliminary results identified the relevant transcriptional factors involved in a mouse C2C12 cell model of myogenesis, demonstrating the potential of this method to identify the transcriptional regulatory mechanisms in profiling experiments. We expect that the application of this method to other systems will yield similar results and lead to novel hypotheses regarding the roles of various transcription factors in specific biological systems.</p>
         <p>The CisTransMine method was implemented in R, Perl, and C++ and is available upon request. It was applied to a gene expression profiling experiment of mouse C2C12 skeletal muscle myoblast differentiation to myotubes. The dataset is available from the NCBI GEO database <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Preparation of promoter sequences</p>
            </st>
            <p>The human, mouse and rat promoter sequences were extracted from the genome assembly as of January 2008. The location of the transcriptional start site was approximated by the first nucleotide in the RefSeq mRNA transcript sequence. For each gene, promoter sequences were extracted according to the coordinates of the first exon of each corresponding transcript. For each transcript, the region from -2000 bp to +300 bp with respect to the transcriptional start site was extracted. A gene may have several different transcripts and therefore several promoters.</p>
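            <p>As an illustration of the coordinate arithmetic described above, the following Python sketch computes the promoter window for a single transcript. It is not the authors' implementation: the function name promoter_window, the use of 1-based inclusive coordinates, and the treatment of the transcriptional start site as offset 0 are assumptions made for this example.</p>
            <p>
```python
def promoter_window(tss, strand, upstream=2000, downstream=300):
    """Return (start, end) genomic coordinates of a promoter window.

    Assumptions (not taken from the paper): coordinates are 1-based and
    inclusive, and the transcriptional start site (TSS) is offset 0.
    """
    if strand == "+":
        # Upstream lies at smaller coordinates on the forward strand.
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the reverse strand, upstream lies at larger coordinates.
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the start of the chromosome.
    return max(start, 1), end


if __name__ == "__main__":
    print(promoter_window(10000, "+"))
    print(promoter_window(10000, "-"))
```
            </p>
            <p>A gene with several transcripts would simply call this function once per transcript, yielding one window per promoter.</p>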
            <p>The promoter sequences were masked against repetitive sequences, e.g., LINEs and SINEs, with the RepeatMasker program to avoid any Transfac version 11.4 <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> matrix search hits in those repetitive regions. Orthologous promoter sequences were then aligned with Wconsensus <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The orthologous relationships were taken from the NCBI Homologene database as of March 2008. For promoters with orthologous promoters in human, mouse and rat, a sliding window of 10 nucleotides was used, and windows in which fewer than 5 nucleotides were identical among the orthologous promoter sequences were masked out as non-conserved.</p>
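            <p>The conservation masking step can be sketched in Python as follows. This is one possible reading of the rule, not the authors' code: it assumes the three orthologous promoters are already aligned to equal length with "-" as the gap character, counts an alignment column as identical only when all sequences share the same non-gap base, keeps every position covered by at least one window with 5 or more identical columns, and writes "N" elsewhere. The function name mask_nonconserved and its parameters are hypothetical.</p>
            <p>
```python
def mask_nonconserved(aligned, window=10, min_identical=5, mask_char="N"):
    """Mask positions of the first aligned sequence outside conserved windows.

    aligned: list of equal-length aligned sequences (first one is returned).
    A column counts as identical when every sequence has the same non-gap base.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    # 1 for columns where all sequences agree on a non-gap base, else 0.
    identical = [
        1 if col[0] != "-" and all(base == col[0] for base in col) else 0
        for col in zip(*(seq.upper() for seq in aligned))
    ]

    keep = [False] * length
    # Slide the window one position at a time along the alignment.
    for start in range(0, max(length - window + 1, 1)):
        end = min(start + window, length)
        if sum(identical[start:end]) >= min_identical:
            for i in range(start, end):
                keep[i] = True

    reference = aligned[0]
    return "".join(b if keep[i] else mask_char for i, b in enumerate(reference))


if __name__ == "__main__":
    human = "ACGTACGTAC" + "TTTTTTTTTT"
    mouse = "ACGTACGTAC" + "CCCCCCCCCC"
    rat = "ACGTACGTAC" + "GGGGGGGGGG"
    print(mask_nonconserved([human, mouse, rat]))
```
            </p>
            <p>Under this reading, a few non-conserved positions next to a conserved block survive, because they share a window with enough identical columns. A stricter per-position rule would be an equally plausible interpretation.</p>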
         </sec>
         <sec>
            <st>
               <p>Annotation of promoter sequences</p>
            </st>
            <p>Human-curated transcriptional factor binding sites from the Transfac database were used to record each transcription factor and its regulated genes for human sequences. In addition, the GeneGo Metacore database version 4.6 <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> was used to identify each transcriptional factor and its regulated genes. The Metacore database also reports whether the transcriptional regulation is an activation or an inhibition. For example, the human P53 gene transcriptionally regulates 609 target genes: it activates 206 of them and inhibits 84. The nature of its interactions with the remaining 319 genes is not explicitly stated. In total, our collection contains 822 human transcriptional factors, 649 mouse transcriptional factors, and 386 rat transcriptional factors.</p>
         </sec>
         <sec>
            <st>
               <p>Extraction of unknown transcriptional factor binding sites</p>
            </st>
            <p>Promoter sequence regions that had been annotated as known transcriptional factor binding sites were masked out. The remaining regions contain potentially novel transcriptional factor binding sites. All possible non-degenerate conserved 8-mer and 9-mer motifs that have at least 5 identical nucleotides within a 10 nucleotide window among human, mouse and rat promoter sequences were enumerated. Their true significance would need to be evaluated in biological experiments.</p>
         </sec>
         <sec>
            <st>
               <p>Normalization of Affymetrix GeneChip arrays</p>
            </st>
            <p>Affymetrix mouse 430 version 2 microarrays were used to measure gene expression values. Normalization was carried out using the GC-RMA method <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Values were exponentiated (base 2) to return them to a linear scale and then scaled to a 2% trimmed mean of 150. Probe sets whose average raw values across replicates were less than 100 in both conditions were removed.</p>
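            <p>The rescaling and filtering steps can be illustrated with the sketch below. It assumes that GC-RMA output is on a log2 scale and that the 2% trim removes 2% of the values from each tail; the function names and the handling of a probe set as two lists of replicate values are hypothetical.</p>
            <p>
```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping `trim_fraction` of the values from each tail (assumed)."""
    ordered = sorted(values)
    cut = int(len(ordered) * trim_fraction)
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def rescale_log2_values(log2_values, target=150.0, trim_fraction=0.02):
    """Exponentiate log2 values and scale so the trimmed mean equals `target`."""
    linear = [2.0 ** value for value in log2_values]
    factor = target / trimmed_mean(linear, trim_fraction)
    return [value * factor for value in linear]


def passes_expression_filter(condition_a, condition_b, threshold=100.0):
    """Keep a probe set unless its replicate average is below threshold in both conditions."""
    mean_a = sum(condition_a) / len(condition_a)
    mean_b = sum(condition_b) / len(condition_b)
    return mean_a >= threshold or mean_b >= threshold


if __name__ == "__main__":
    scaled = rescale_log2_values([5.0, 6.0, 7.0, 8.0])
    print(scaled)
    print(passes_expression_filter([50.0, 60.0, 70.0], [120.0, 130.0, 140.0]))
```
            </p>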
         </sec>
         <sec>
            <st>
               <p>Calculation of the moderated t statistic for each probe set</p>
            </st>
            <p>The traditional Student's t-test statistic is often used to assess the significance of individual probe sets between two conditions, e.g., treatment group versus control group. However, there are usually only a few replicates (typically three) within each group. With such a small sample size, it is difficult to estimate the variance reliably, which makes the estimation of the t-statistic problematic. To address this problem, the moderated t-test <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> implemented in the limma package within the Bioconductor project <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> was adopted to evaluate the significance of individual probe sets between the two groups. The moderated t-test assumes the same distribution for the error variance of all genes and estimates the variance of an individual gene with an empirical Bayes method, using posterior residual standard deviations instead of traditional standard deviations, to accommodate the low number of replicates in each group <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Up-regulated genes and down-regulated genes have positive and negative moderated t-values, respectively. If a gene is represented by several probe sets, the moderated t-statistic with the highest absolute value is used to represent that gene.</p>
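            <p>The last step, collapsing several probe sets to one value per gene, can be sketched as follows. The moderated t-statistics are taken as given (they would come from limma); the function name, the dictionary layout and the probe set and gene identifiers are hypothetical.</p>
            <p>
```python
def gene_level_t(probe_t, probe_to_gene):
    """Keep, for each gene, the moderated t-statistic with the largest absolute value.

    `probe_t` maps probe set id to moderated t; `probe_to_gene` maps probe set
    id to gene symbol. Probe sets without a gene mapping are skipped.
    """
    best = {}
    for probe, t_value in probe_t.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue
        if gene not in best or abs(t_value) > abs(best[gene]):
            best[gene] = t_value
    return best


if __name__ == "__main__":
    t_values = {"probe_1": 2.1, "probe_2": -3.4, "probe_3": 0.5}
    mapping = {"probe_1": "GeneA", "probe_2": "GeneA", "probe_3": "GeneB"}
    print(gene_level_t(t_values, mapping))
```
            </p>
            <p>The sign of the retained statistic is preserved, so a gene remains classified as up- or down-regulated according to its most extreme probe set.</p>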
         </sec>
         <sec>
            <st>
               <p>Evaluation of the significance of a single motif</p>
            </st>
            <p>The CisTransMine method extends the MotifADE framework to identify significant transcriptional factor binding sites enriched between two microarray conditions. MotifADE uses a two-tailed non-parametric Mann-Whitney rank sum U statistic to evaluate the significance of a motif. Specifically, for each motif, t-statistics for all the genes are divided into two groups: one group containing t-statistics for genes having the motif of interest in their promoter region and the other group for genes not having the motif in their promoter regions. The null hypothesis is that there is no difference between the means of the ranks of these two sets of t-statistics; the alternative hypothesis is that the means of the ranks of these two sets are not equal, i.e., genes containing the motif are either up-regulated or down-regulated (Figure <figr fid="F4">4</figr>).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>MotifADE overview</p>
               </caption>
               <text>
                  <p><b>MotifADE overview</b>. Overview of the MotifADE method: genes are sorted by their moderated t-statistic values. Motifs in the promoter regions of these genes are identified. A two-tailed Mann-Whitney rank sum test is applied. In this schematic view, Motif 1 is significant in the up-regulated genes, Motif 2 is not significant in either the up-regulated or the down-regulated genes, and Motif 3 is significant in the down-regulated genes.</p>
               </text>
               <graphic file="1756-0381-1-4-4"/>
            </fig>
            <p>A transcriptional factor may enhance the transcription of certain genes and repress the transcription of other genes at the same time, and the two-tailed Mann-Whitney test can obscure such cases. For a given motif, the test calculates the rank sum over all genes having that motif, regardless of whether they are up-regulated, down-regulated, or non-regulated. If approximately equal numbers of up- and down-regulated genes carry a particular motif, the contribution of the up-regulated genes is more or less cancelled out by that of the down-regulated genes, and the motif is computed to be statistically insignificant. For example, in Figure <figr fid="F5">5</figr>, Motif 1 and Motif 3 would have the same p-values with the two-tailed Mann-Whitney test: only the t-value 0.9 is important, because all other t-values from Motif 1 or Motif 3 are symmetric with respect to 0 and contribute the same to the rank sum as a t-value of 0 does. Yet Motif 1 is more significant than Motif 3, since several genes containing Motif 1 are more highly down- or up-regulated than the most extreme genes containing Motif 3.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Problems with the two-tailed Mann-Whitney test</p>
               </caption>
               <text>
                  <p><b>Problems with the two-tailed Mann-Whitney test</b>. Motif 1 and Motif 3 would have the same p-values with the two-tailed Mann-Whitney test since only the t-value 0.9 is important and all other t-values from Motif 1 or Motif 3 are symmetric with respect to 0 contributing the same to the rank sum as does t-value 0 even though genes in Motif 1 show higher magnitude changes than genes in Motif 3.</p>
               </text>
               <graphic file="1756-0381-1-4-5"/>
            </fig>
            <p>An approach using absolute values was implemented to solve this problem <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, in which absolute enrichment can identify important gene sets that may not be identified by two-tailed methods. The CisTransMine method is proposed to test up-regulated genes and down-regulated genes separately for statistical significance by using the one-tailed non-parametric Mann-Whitney test. For up-regulated genes, the null hypothesis is that the mean of the ranks of the up-regulated genes containing the motif is equal to the mean of the ranks of the up-regulated genes not containing the motif; the alternative hypothesis is that the former is greater than the latter. For down-regulated genes, the null hypothesis is the corresponding equality among the down-regulated genes, and the alternative hypothesis is that the mean of the ranks of the down-regulated genes containing the motif is less than that of the down-regulated genes not containing the motif. Thus, significances for motifs in up-regulated genes and down-regulated genes are tested separately.</p>
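            <p>The separate one-tailed tests can be sketched as below. This is an illustration under stated assumptions rather than the CisTransMine code: the p-value uses a normal approximation with continuity correction and no tie correction, genes with t greater than 0 are treated as up-regulated, and the down-regulated test is run on negated t-values, which is equivalent to the "less than" alternative on the original values. All function names are hypothetical.</p>
            <p>
```python
import math


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while len(order) > i:
        j = i
        while len(order) > j + 1 and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_greater(with_motif, without_motif):
    """One-tailed p-value that `with_motif` tends to rank above `without_motif`."""
    n1, n2 = len(with_motif), len(without_motif)
    ranks = average_ranks(list(with_motif) + list(without_motif))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u - 0.5) / sd_u  # continuity correction
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability


def motif_pvalues(gene_t, genes_with_motif):
    """Test a motif separately among up-regulated and down-regulated genes."""
    up = {gene: t for gene, t in gene_t.items() if t > 0}
    # Negating t lets the same "greater" test serve the down-regulated genes.
    down = {gene: -t for gene, t in gene_t.items() if 0 > t}
    result = {}
    for label, subset in (("up", up), ("down", down)):
        inside = [t for gene, t in subset.items() if gene in genes_with_motif]
        outside = [t for gene, t in subset.items() if gene not in genes_with_motif]
        if inside and outside:
            result[label] = mann_whitney_greater(inside, outside)
        else:
            result[label] = None  # test undefined when either group is empty
    return result


if __name__ == "__main__":
    example = {"gene_%d" % i: float(i) for i in range(1, 11)}
    print(motif_pvalues(example, {"gene_8", "gene_9", "gene_10"}))
```
            </p>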
         </sec>
         <sec>
            <st>
               <p>Synergistic motifs</p>
            </st>
            <p>In eukaryotic genomes, a synergistic relationship is present when multiple transcriptional factors work in concert to regulate target genes, e.g., combinatorial activities of multiple transcriptional factors regulate B cell lineage commitment and differentiation <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. In the CisTransMine method, synergistic relationships between two transcriptional factors are detected in a two-step process. First, the genes containing both transcriptional factor A binding sites (TF<sub>A</sub>) and transcriptional factor B binding sites (TF<sub>B</sub>) in their promoter regions are denoted by TF<sub>A </sub>&#8745; TF<sub>B</sub>. The genes containing transcriptional factor A binding sites but not transcriptional factor B binding sites are denoted by TF<sub>A </sub>- TF<sub>B</sub>, and the genes containing transcriptional factor B binding sites but not transcriptional factor A binding sites are denoted by TF<sub>B </sub>- TF<sub>A</sub>. For up-regulated (and, respectively, down-regulated) genes, the necessary conditions for true synergy between two transcriptional factors are that (1) the one-tailed Mann-Whitney rank sum test P-value between the genes in TF<sub>A </sub>&#8745; TF<sub>B </sub>and the genes in TF<sub>A </sub>- TF<sub>B </sub>is less than 0.05, and (2) the one-tailed Mann-Whitney rank sum test P-value between the genes in TF<sub>A </sub>&#8745; TF<sub>B </sub>and the genes in TF<sub>B </sub>- TF<sub>A </sub>is less than 0.05. If these necessary conditions are satisfied, the algorithm proceeds to the second step, in which the significance of the synergistic relationship between the two transcriptional factors is tested with the same one-tailed Mann-Whitney rank sum test as for a single motif.</p>
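            <p>The first step of the synergy test amounts to set operations followed by two one-tailed comparisons. The sketch below shows that structure only; <it>synergy_precheck</it> and its arguments are hypothetical, and the one-tailed test is passed in as a caller-supplied function <it>pvalue_fn</it> so that no particular implementation is implied.</p>
            <p>
```python
def synergy_precheck(gene_scores, genes_a, genes_b, pvalue_fn, alpha=0.05):
    """Check the two necessary conditions for synergy between factors A and B.

    `gene_scores` maps gene to score for one direction only (for example the
    moderated t of up-regulated genes). `pvalue_fn(x, y)` is assumed to return
    a one-tailed p-value that scores in x tend to exceed scores in y.
    """
    both = set(genes_a).intersection(genes_b)    # genes with A and B sites
    only_a = set(genes_a).difference(genes_b)    # genes with A sites but not B
    only_b = set(genes_b).difference(genes_a)    # genes with B sites but not A

    def scores(genes):
        return [gene_scores[g] for g in sorted(genes) if g in gene_scores]

    s_both, s_a, s_b = scores(both), scores(only_a), scores(only_b)
    if not (s_both and s_a and s_b):
        return False  # a comparison group is empty, so the test is undefined
    return alpha > pvalue_fn(s_both, s_a) and alpha > pvalue_fn(s_both, s_b)


if __name__ == "__main__":
    def toy_pvalue(x, y):
        # Placeholder for a real one-tailed Mann-Whitney test.
        return 0.01 if min(x) > max(y) else 0.5

    scores_up = {"g1": 5.0, "g2": 6.0, "g3": 1.0, "g4": 2.0, "g5": 1.5}
    print(synergy_precheck(scores_up, {"g1", "g2", "g3"},
                           {"g1", "g2", "g4", "g5"}, toy_pvalue))
```
            </p>
            <p>Only motif pairs that pass this check would proceed to the second step, where the gene set with both binding sites is tested like a single motif.</p>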
         </sec>
         <sec>
            <st>
               <p>Multiple testing correction</p>
            </st>
            <p>To reduce the false positive rate, a multiple testing correction method must be applied to account for the thousands of null hypotheses that are tested at the same time. The multiple testing correction method we adopt is the False Discovery Rate (FDR) q-value <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. The FDR q-value is a measure of the rate of false discovery estimated from the distribution of p-values. The FDR q-value method was chosen because it balances specificity and sensitivity without an <it>a priori </it>p-value cutoff (see the reference for details).</p>
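            <p>As a rough illustration of FDR control over many motif p-values, the sketch below computes Benjamini-Hochberg adjusted p-values. This is a stand-in chosen for brevity, not necessarily the q-value procedure of the cited reference, which may additionally estimate the proportion of true null hypotheses; the function name is hypothetical.</p>
            <p>
```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```
            </p>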
         </sec>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>QM carried out the design and implementation of the algorithm and wrote the manuscript. GWC provided the mapping of the Affymetrix probe sets to the NCBI RefSeq sequences. JDS did the quality control of the Affymetrix chips. AB, PAK, and DK did the wet lab work. NRN directed and participated in the project. All authors were involved in reviewing and revising the manuscript and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Liam O'Connor and Richard Cai for reviewing the manuscript and Leah Martell for statistical advice.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Genome-wide Discovery of Transcriptional Modules from DNA Sequence and Gene Expression</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Yelensky</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>Suppl 1</issue>
            <fpage>i273</fpage>
            <lpage>282</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg1038</pubid>
                  <pubid idtype="pmpid" link="fulltext">12855470</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Predicting Genetic Regulatory Response Using Classification</p>
            </title>
            <aug>
               <au>
                  <snm>Middendorf</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kundaje</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wiggins</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Freund</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Leslie</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>Suppl 1</issue>
            <fpage>i232</fpage>
            <lpage>240</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15262804</pubid>
                  <pubid idtype="doi">10.1093/bioinformatics/bth923</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>TY</fnm>
               </au>
               <au>
                  <snm>Tseng</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>CYF</fnm>
               </au>
               <au>
                  <snm>Kao</snm>
                  <fnm>CY</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>12</issue>
            <fpage>2883</fpage>
            <lpage>2890</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15802287</pubid>
                  <pubid idtype="doi">10.1093/bioinformatics/bti415</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Reverse engineering gene networks using singular value decomposition and robust regression</p>
            </title>
            <aug>
               <au>
                  <snm>Stephen Yeung</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Tegn&#233;r</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>JJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>9</issue>
            <fpage>6163</fpage>
            <lpage>6168</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11983907</pubid>
                  <pubid idtype="doi">10.1073/pnas.092576199</pubid>
                  <pubid idtype="pmcid">122920</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Erralpha and Gabpa/b specify PGC-1alpha-dependent oxidative phosphorylation gene expression that is altered in diabetic muscle</p>
            </title>
            <aug>
               <au>
                  <snm>Mootha</snm>
                  <fnm>VK</fnm>
               </au>
               <au>
                  <snm>Handschin</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Arlow</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>St Pierre</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sihag</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Altshuler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Puigserver</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Patterson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Willy</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Schulman</snm>
                  <fnm>IG</fnm>
               </au>
               <au>
                  <snm>Heyman</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Spiegelman</snm>
                  <fnm>BM</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>17</issue>
            <fpage>6570</fpage>
            <lpage>6575</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15100410</pubid>
                  <pubid idtype="doi">10.1073/pnas.0401401101</pubid>
                  <pubid idtype="pmcid">404086</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Identification of novel pathway regulation during myogenic differentiation</p>
            </title>
            <aug>
               <au>
                  <snm>Szustakowski</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Marrese</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Kosinski</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Nirmala</snm>
                  <fnm>NR</fnm>
               </au>
               <au>
                  <snm>Kemp</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2006</pubdate>
            <volume>87</volume>
            <issue>1</issue>
            <fpage>129</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16300922</pubid>
                  <pubid idtype="doi">10.1016/j.ygeno.2005.08.009</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>E47 phosphorylation by p38 MAPK promotes MyoD/E47 association and muscle-specific gene transcription</p>
            </title>
            <aug>
               <au>
                  <snm>Llu&#237;s</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ballestar</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Suelves</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Esteller</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mu&#241;oz-C&#225;noves</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2005</pubdate>
            <volume>24</volume>
            <issue>5</issue>
            <fpage>974</fpage>
            <lpage>84</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15719023</pubid>
                  <pubid idtype="doi">10.1038/sj.emboj.7600528</pubid>
                  <pubid idtype="pmcid">554117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Requirement for serum response factor for skeletal muscle growth and maturation revealed by tissue-specific gene deletion in mice</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Czubryt</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>McAnally</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bassel-Duby</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Richardson</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Wiebel</snm>
                  <fnm>FF</fnm>
               </au>
               <au>
                  <snm>Nordheim</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Olson</snm>
                  <fnm>EN</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>4</issue>
            <fpage>1082</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15647354</pubid>
                  <pubid idtype="doi">10.1073/pnas.0409103102</pubid>
                  <pubid idtype="pmcid">545866</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Molecular stop signs: regulation of cell-cycle arrest by C/EBP transcription factors</p>
            </title>
            <aug>
               <au>
                  <snm>Johnson</snm>
                  <fnm>PF</fnm>
               </au>
            </aug>
            <source>J Cell Sci</source>
            <pubdate>2005</pubdate>
            <volume>118</volume>
            <issue>12</issue>
            <fpage>2545</fpage>
            <lpage>55</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15944395</pubid>
                  <pubid idtype="doi">10.1242/jcs.02459</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>NF-kappaB, MEF2A, MEF2D and HIF1-a involvement on insulin- and contraction-induced regulation of GLUT4 gene expression in soleus muscle</p>
            </title>
            <aug>
               <au>
                  <snm>Silva</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Giannocco</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Furuya</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Lima</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Moraes</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Nachef</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bordin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Britto</snm>
                  <fnm>LR</fnm>
               </au>
               <au>
                  <snm>Nunes</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Machado</snm>
                  <fnm>UF</fnm>
               </au>
            </aug>
            <source>Mol Cell Endocrinol</source>
            <pubdate>2005</pubdate>
            <volume>240</volume>
            <issue>1&#8211;2</issue>
            <fpage>82</fpage>
            <lpage>93</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16024167</pubid>
                  <pubid idtype="doi">10.1016/j.mce.2005.05.006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The promoters of human cell cycle genes integrate signals from two tumor suppressive pathways during cellular transformation</p>
            </title>
            <aug>
               <au>
                  <snm>Tabach</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Milyavsky</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shats</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Brosh</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Zuk</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Yitzhaky</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mantovani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Domany</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rotter</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pilpel</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Mol Syst Biol</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <fpage>0022</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16729057</pubid>
                  <pubid idtype="doi">10.1038/msb4100030</pubid>
                  <pubid idtype="pmcid">1681464</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Forkhead box M1 regulates the transcriptional network of genes essential for mitotic progression and genes encoding the SCF (Skp2-Cks1) ubiquitin ligase</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>IC</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>YJ</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Petrovic</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Major</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ackerson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Costa</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2005</pubdate>
            <volume>25</volume>
            <issue>24</issue>
            <fpage>10875</fpage>
            <lpage>94</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16314512</pubid>
                  <pubid idtype="doi">10.1128/MCB.25.24.10875-10894.2005</pubid>
                  <pubid idtype="pmcid">1316960</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A cell cycle regulatory network controlling NF-kappaB subunit activity and function</p>
            </title>
            <aug>
               <au>
                  <snm>Barre</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Perkins</snm>
                  <fnm>ND</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2007</pubdate>
            <volume>26</volume>
            <issue>23</issue>
            <fpage>4841</fpage>
            <lpage>55</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17962807</pubid>
                  <pubid idtype="doi">10.1038/sj.emboj.7601899</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Pathways to the analysis of microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Curtis</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Oresic</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vidal-Puig</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Biotechnol</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <issue>8</issue>
            <fpage>429</fpage>
            <lpage>35</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15950303</pubid>
                  <pubid idtype="doi">10.1016/j.tibtech.2005.05.011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Mouse C2C12 differentiation time course dataset</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11415</url>
         </bibl>
         <bibl id="B16">
            <title>
               <p>TRANSFAC: transcriptional regulation, from patterns to profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Matys</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Fricke</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Geffers</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>G&#246;&#223;ling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Haubrock</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hornischer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Karas</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Kloos</snm>
                  <fnm>DU</fnm>
               </au>
               <au>
                  <snm>Land</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lewicki-Potapov</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Michael</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>M&#252;nch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Rotert</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Saxel</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Scheer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thiele</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>374</fpage>
            <lpage>378</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12520026</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg108</pubid>
                  <pubid idtype="pmcid">165555</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Identifying DNA and protein patterns with statistically significant alignments of multiple sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Hertz</snm>
                  <fnm>GZ</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <issue>7&#8211;8</issue>
            <fpage>563</fpage>
            <lpage>77</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">10487864</pubid>
                  <pubid idtype="doi">10.1093/bioinformatics/15.7.563</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Pathway mapping tools for analysis of high content data</p>
            </title>
            <aug>
               <au>
                  <snm>Ekins</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nikolsky</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Bugrim</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kirillov</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Nikolskaya</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Methods Mol Biol</source>
            <pubdate>2007</pubdate>
            <volume>356</volume>
            <fpage>319</fpage>
            <lpage>50</lpage>
            <xrefbib>
               <pubid idtype="pmpid">16988414</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Comparison of Affymetrix GeneChip expression measures</p>
            </title>
            <aug>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>JZ</fnm>
               </au>
               <au>
                  <snm>Jaffee</snm>
                  <fnm>HA</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>7</issue>
            <fpage>789</fpage>
            <lpage>794</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/bioinformatics/btk046</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Smyth</snm>
                  <fnm>GK</fnm>
               </au>
            </aug>
            <source>Statistical Applications in Genetics and Molecular Biology</source>
            <pubdate>2004</pubdate>
            <volume>3</volume>
            <issue>1</issue>
            <fpage>Article 3</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.2202/1544-6115.1027</pubid>
                  <pubid idtype="pmpid">16646809</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Bioconductor: open software development for computational biology and bioinformatics</p>
            </title>
            <aug>
               <au>
                  <snm>Gentleman</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Carey</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Bates</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dettling</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ellis</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gautier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gentry</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hornik</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hothorn</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Iacus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Leisch</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Maechler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rossini</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Sawitzki</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Smyth</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tierney</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>JY</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>10</issue>
            <fpage>R80</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15461798</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-10-r80</pubid>
                  <pubid idtype="pmcid">545600</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Absolute enrichment: gene set enrichment analysis for homeostatic systems</p>
            </title>
            <aug>
               <au>
                  <snm>Saxena</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Orgill</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kohane</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>22</issue>
            <fpage>e151</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17130162</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl766</pubid>
                  <pubid idtype="pmcid">1702493</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The transcriptional regulation of B cell lineage commitment</p>
            </title>
            <aug>
               <au>
                  <snm>Nutt</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Kee</snm>
                  <fnm>BL</fnm>
               </au>
            </aug>
            <source>Immunity</source>
            <pubdate>2007</pubdate>
            <volume>26</volume>
            <issue>6</issue>
            <fpage>715</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17582344</pubid>
                  <pubid idtype="doi">10.1016/j.immuni.2007.05.010</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Statistical significance for genomewide studies</p>
            </title>
            <aug>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <issue>16</issue>
            <fpage>9440</fpage>
            <lpage>9445</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12883005</pubid>
                  <pubid idtype="doi">10.1073/pnas.1530509100</pubid>
                  <pubid idtype="pmcid">170937</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

