Open Access Research

Large scale analysis of positional effects of single-base mismatches on microarray gene expression data

Fenghai Duan1*, Mark A Pauley2, Eliot R Spindel3, Li Zhang4 and Robert B Norgren5*

Author Affiliations

1 Center for Statistical Sciences, Brown University, Providence, RI, USA

2 College of Information Science & Technology, University of Nebraska at Omaha, Omaha, NE, USA

3 Division of Neuroscience, Oregon National Primate Research Center, Oregon Health & Science University, Beaverton, OR, USA

4 Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX, USA

5 Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA

BioData Mining 2010, 3:2  doi:10.1186/1756-0381-3-2

Published: 29 April 2010



Affymetrix GeneChips use 25-mer oligonucleotide probes linked to a silica surface to detect targets in solution. Mismatches due to single nucleotide polymorphisms (SNPs) can affect the hybridization between probes and targets. Previous research has indicated that binding between probes and targets depends strongly on the positions of these mismatches. However, the reported effect of mismatch type has varied substantially across studies.


By taking advantage of naturally occurring mismatches between rhesus macaque transcripts and human probes from the Affymetrix U133 Plus 2 GeneChip, we assembled the largest dataset of 25-mer probes with single-base mismatches at each of the 25 probe positions ever used in this type of analysis.
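As a rough illustration of how such a dataset could be assembled, the sketch below compares a probe with an equally long, already aligned target segment and reports the mismatch only when exactly one position differs. The function name, the 1-based position numbering, the assumption that both sequences are written in the same orientation, and the toy sequences are all illustrative choices, not the authors' pipeline.

```python
from typing import Optional, Tuple

PROBE_LENGTH = 25  # Affymetrix GeneChip probes are 25-mers


def single_mismatch(probe: str, target: str) -> Optional[Tuple[int, str, str]]:
    """Return (position, probe_base, target_base) if exactly one base differs.

    Assumptions (hypothetical, for illustration): both sequences are written
    in the same orientation, are aligned without gaps, and positions are
    counted from 1. Returns None for a perfect match or for more than one
    mismatch.
    """
    probe, target = probe.upper(), target.upper()
    if len(probe) != PROBE_LENGTH or len(target) != PROBE_LENGTH:
        raise ValueError("both sequences must be 25 bases long")
    # Collect every position where the two sequences disagree.
    diffs = [(i + 1, p, t) for i, (p, t) in enumerate(zip(probe, target)) if p != t]
    return diffs[0] if len(diffs) == 1 else None


# Toy example: introduce one substitution at position 13 of a made-up probe.
probe = "ACGTACGTACGTACGTACGTACGTA"
target = probe[:12] + "T" + probe[13:]
print(single_mismatch(probe, target))  # (13, 'A', 'T')
```

Grouping the returned positions across many probe and transcript pairs would give one set of probes per mismatch position, which is the kind of per-position breakdown described above.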


A mismatch at the center of a probe led to a greater loss in signal intensity than a mismatch at the ends of the probe, regardless of the mismatch type. There was a slight asymmetry between the ends of a probe: effects of mismatches at the 3' end of a probe were greater than those at the 5' end. A cross-study comparison of the effect of mismatch types revealed that results from different reports were not in good agreement. However, when the mismatch types were consolidated into purine or pyrimidine mismatches, cross-study conclusions could be drawn.
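One possible reading of the consolidation step is to relabel each mismatched base by its chemical class (A and G are purines; C, T and U are pyrimidines) so that the many specific mismatch types collapse into a few categories. The sketch below shows that relabeling only; the function names and the choice to keep the probe and target classes as a pair are assumptions, and the grouping actually used in the study may differ.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def base_class(base: str) -> str:
    """Map a nucleotide letter to 'purine' or 'pyrimidine'."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unexpected base: {base!r}")


def consolidate(probe_base: str, target_base: str) -> tuple:
    """Collapse a specific mismatch into a pair of chemical classes.

    Hypothetical grouping for illustration: the pair keeps the class of the
    probe base first and the class of the target base second.
    """
    return (base_class(probe_base), base_class(target_base))


print(consolidate("A", "T"))  # ('purine', 'pyrimidine')
print(consolidate("G", "A"))  # ('purine', 'purine')
```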


The comprehensive assessment of the effects of single-base mismatches on microarrays provided in this report can be useful for improving future versions of microarray platform design and the corresponding data analysis algorithms.