Data Portal Architecture

From GGBN Wiki
Revision as of 16:55, 26 August 2011 by WikiSysop (talk | contribs) (Level 3)

Level 1

Within the DNA Bank Network, specimen data are retrieved through the same wrappers that GBIF uses.

The data architecture of the DNA Bank Network is based on the GBIF infrastructure. The basic principle of both GBIF and the DNA Bank Network is to record each data set only once. Stored in a single place, the data can then serve as a linked reference for different applications.

Because the many institutions that joined GBIF each use a different database structure, installing wrappers has become the standard way to combine different sources and integrate their data into networks. Three main wrapper software packages are available: BioCASe, DiGIR and TAPIR. All of them use an XML schema for data transfer: BioCASe uses ABCD, while DiGIR and TAPIR use DarwinCore (DwC).
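The pairing of wrappers and schemas described above can be written down as a small lookup table. This is only an illustrative sketch in Python; the names WRAPPER_SCHEMAS and schema_for are hypothetical and are not part of any wrapper software.

```python
# Illustrative only: wrapper-to-schema pairing as stated in the text above.
WRAPPER_SCHEMAS = {
    "BioCASe": "ABCD",
    "DiGIR": "DarwinCore",
    "TAPIR": "DarwinCore",
}


def schema_for(wrapper: str) -> str:
    """Return the XML schema a wrapper uses, ignoring case in the name."""
    for name, schema in WRAPPER_SCHEMAS.items():
        if name.lower() == wrapper.lower():
            return schema
    raise KeyError(f"unknown wrapper: {wrapper}")
```

A caller that knows which wrapper a partner runs can use schema_for to decide how to parse the response it receives.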

Level 2

For DNA data management, an open source software package, the "DNA Module", was developed at the BGBM. Alternatively, a partner can use its own database system.

The DNA Module is one of the key components of the network's database system. To find the specimen data related to a DNA sample, the module sends a query to the respective specimen database via BioCASe or DiGIR. A copy containing a few specimen attributes is also stored in the DNA cache to speed up queries. Because the module follows the BioCASe and DiGIR protocols, it can connect to any GBIF-compliant specimen database worldwide.
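The cache-first lookup described above might be sketched as follows. This is a minimal illustration and not the DNA Module's actual code: the class SpecimenCache, the attribute names in CACHED_ATTRIBUTES, and the fetch_remote callable (which stands in for a BioCASe or DiGIR query) are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

# Hypothetical record type: specimen attributes keyed by name.
SpecimenRecord = Dict[str, str]

# Attributes copied into the cache; this selection is an assumption.
CACHED_ATTRIBUTES = ("unit_id", "scientific_name", "collector")


class SpecimenCache:
    """Cache-first lookup of specimen data for a DNA sample."""

    def __init__(self, fetch_remote: Callable[[str, str], SpecimenRecord]):
        # fetch_remote(provider_url, unit_id) stands in for the wrapper query.
        self._fetch_remote = fetch_remote
        self._cache: Dict[Tuple[str, str], SpecimenRecord] = {}

    def lookup(self, provider_url: str, unit_id: str) -> SpecimenRecord:
        """Return the cached attributes, querying the provider on a miss."""
        key = (provider_url, unit_id)
        if key not in self._cache:
            full = self._fetch_remote(provider_url, unit_id)
            # Keep only the few attributes needed for fast queries.
            self._cache[key] = {
                name: full[name] for name in CACHED_ATTRIBUTES if name in full
            }
        return self._cache[key]
```

Only the first lookup for a given specimen reaches the remote database; later lookups are answered from the local copy, which is the speed-up the cache is meant to provide.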

The DNA Module is currently used by three of the four project partners, each in combination with its own specimen database. The DSMZ in Braunschweig uses its own system for DNA data input.

Level 3

To transfer DNA data into the web portal of the DNA Bank Network, a DNA extension for ABCD was developed. Because ABCD is the schema used by BioCASe, the BioCASe Provider Software is required.

An additional BioCASe wrapper using the new DNA extension for ABCD has been installed on each of the three DNA Modules and, separately, on the database in Braunschweig, so that all DNA samples and their related specimen data are available on the central web portal.
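The way the central web portal gathers DNA samples from the four wrappers can be pictured with a short sketch. It is an assumption-laden illustration, not the portal's implementation: collect_dna_samples, the providers mapping, and the harvest callable (standing in for a BioCASe request that returns ABCD records with the DNA extension) are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical record type: one DNA sample with its attributes.
DnaRecord = Dict[str, str]


def collect_dna_samples(
    providers: Dict[str, str],
    harvest: Callable[[str], Iterable[DnaRecord]],
) -> List[DnaRecord]:
    """Gather DNA sample records from every provider into one list.

    providers maps a partner label to its wrapper URL; harvest(url) is a
    placeholder for the request sent to that partner's wrapper.
    """
    combined: List[DnaRecord] = []
    for partner, url in providers.items():
        for record in harvest(url):
            # Tag each record with its source so it can be traced back.
            combined.append({**record, "source": partner})
    return combined
```

Each partner keeps its own database; the portal only holds the combined view, which is consistent with the principle of recording every data set once.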

The source code of the DNA Bank Network's web portal is available under the Mozilla Public License Version 1.1 at http://ww2.biocase.org/svn/dnabank/DNA_Bank_Network/webportal/

Dataflow-Grafik.jpg

General data architecture of the DNA Bank Network. Specimen and DNA sample databases (top and middle) are operated by the Network partners. Their data content is structured and transferred to the shared web portal (black and green arrows) by wrappers (BioCASe, DiGIR; grey boxes). Publications and DNA sequence data accessible online (blue arrows) can be linked to the related DNA sample. The Catalogue of Life checklist is used as the search backbone in the web portal (red arrow).