Open Access Research

Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration

Rex T Nelson1, Shulamit Avraham2, Randy C Shoemaker1, Gregory D May3, Doreen Ware2,4 and Damian DG Gessler5*

Author Affiliations

1 USDA-ARS, CICGR, 100 Osborne Dr. Rm. 1575, Ames, IA, 50011-1010 USA

2 Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA

3 National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM 87505, USA

4 USDA-ARS, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA

5 University of Arizona, 1657 E. Helen St., Tucson, AZ 85721, USA

BioData Mining 2010, 3:3  doi:10.1186/1756-0381-3-3

Published: 4 June 2010

Abstract

Background

Scientific data integration and computational service discovery are challenges for the bioinformatics community. Biological databases are constructed separately and independently, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap"), offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS]) used SSWAP to semantically describe selected data and web services.

Methods

We selected high-priority Quantitative Trait Locus (QTL), genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in Web Ontology Language (OWL), Resource Description Framework (RDF) and eXtensible Markup Language (XML) documents, which are suitable for semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info.
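The mapping step described above can be illustrated with a minimal sketch. It is not the SSWAP implementation: the namespace, resource labels, field names, and term names below are all hypothetical and are not taken from Gramene, SoyBase, LIS, or any SSWAP ontology. The sketch only shows the general idea of pointing differently named local fields at one shared term and emitting simple N-Triples-style RDF statements.

```python
# Hypothetical namespace and term names, for illustration only.
EXAMPLE_NS = "http://example.org/ontology/qtl#"

# Each resource keeps its own local field names; the mapping points
# them at the same shared (hypothetical) ontology terms.
FIELD_TO_TERM = {
    "resource_a": {"qtl_nm": "name", "trait": "traitName", "lg": "linkageGroup"},
    "resource_b": {"QTLName": "name", "TraitDesc": "traitName", "Chromosome": "linkageGroup"},
}


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for a quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(resource: str, subject_uri: str, record: dict) -> list:
    """Translate one local record into sorted N-Triples-style statements.

    Local fields with no entry in the mapping are skipped.
    """
    mapping = FIELD_TO_TERM[resource]
    triples = []
    for field, value in record.items():
        term = mapping.get(field)
        if term is None:
            continue  # unmapped local field
        literal = escape_literal(str(value))
        triples.append(f'<{subject_uri}> <{EXAMPLE_NS}{term}> "{literal}" .')
    return sorted(triples)


if __name__ == "__main__":
    row = {"qtl_nm": "Seed weight 1-1", "lg": "A1", "internal_id": 7}
    for line in record_to_ntriples("resource_a", "http://example.org/qtl/1", row):
        print(line)
```

Under these assumptions, two records that use different local field names but carry the same values yield identical statements, which is what makes them comparable across resources.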

Results

A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS, we constructed three services: two allow the retrieval of DNA and RNA FASTA sequences, and the third provides nucleic acid sequence comparison capability (BLAST).

Conclusions

The need for semantic integration technologies has preceded available solutions. We report the feasibility of mapping high-priority data from local, independent, idiosyncratic data schemas to common shared concepts as implemented in web-accessible ontologies. These mappings are then amenable to use in semantic web services. Our implementation of approximately two dozen services means that biological data at three large information resources (Gramene, SoyBase, and LIS) is available for programmatic access, semantic searching, and enhanced interaction between the separate missions of these resources.