Difference between revisions of "ABCDDNA"

(4. Mapping the DNA database)
 
(15 intermediate revisions by one other user not shown)
Line 1: Line 1:
 
=DNA extension for ABCD - General information & installation manual=
 
=DNA extension for ABCD - General information & installation manual=
Another BioCASE wrapper on all connected databases had to be installed mandatory to offer DNA samples and its related specimen data on the central webportal (http://www.dnabank-network.org). Although the [http://wiki.tdwg.org/twiki/bin/view/ABCD/ ABCD 2.06 schema] (part of the BioCASE wrapper) is currently preferred its existing part for DNA ‘Sequences’ lacks important features. So, it shouldn’t be used any longer. The whole ABCD 2.06 schema is available [http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/ABCD_2.06.XSD here]. If you want to learn more about the meaning of BioCASe and ABCD please follow the links.
+
Currently only BioCASe and ABCD can be used to provide data via the DNA Bank Network.
 +
Additionally to the mapping of the specimens a second mapping with the BioCASE Provider Software (wrapper) on all connected databases had to be set up mandatory to offer DNA samples and its related specimen data on the central webportal (http://www.dnabank-network.org). Although the [http://wiki.tdwg.org/twiki/bin/view/ABCD/ ABCD 2.06 schema] (part of the BioCASE wrapper) is currently preferred its existing part for DNA ‘Sequences’ lacks important features. So, it shouldn’t be used any longer. The whole ABCD 2.06 schema is available [http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/ABCD_2.06.XSD here]. If you want to learn more about the meaning of BioCASE and ABCD please follow the links.
  
 
ABCD offers two options to add supplementary contents: [http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/HTML/ABCD_2.06.html#complexType_MeasurementOrFact_Link03076C80 ‘MeasurementsOrFacts’] and [http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/HTML/ABCD_2.06.html#element_UnitExtension_Link031A1C90 ‘UnitExtensions’]. Since the hierarchical structure of DNA specific features is too complex for ‘MeasurementsOrFacts’ we decided to use ‘UnitExtensions’ to integrate an xml schema definition for DNA data similar to the [http://www.geocase.eu/ ABCDEFG extension] for geosciences. That new DNA extension for ABCD 2.06 is called ABCDDNA.
 
ABCD offers two options to add supplementary contents: [http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/HTML/ABCD_2.06.html#complexType_MeasurementOrFact_Link03076C80 ‘MeasurementsOrFacts’] and [http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/HTML/ABCD_2.06.html#element_UnitExtension_Link031A1C90 ‘UnitExtensions’]. Since the hierarchical structure of DNA specific features is too complex for ‘MeasurementsOrFacts’ we decided to use ‘UnitExtensions’ to integrate an xml schema definition for DNA data similar to the [http://www.geocase.eu/ ABCDEFG extension] for geosciences. That new DNA extension for ABCD 2.06 is called ABCDDNA.
  
The DNA Sample means ABCD Unit and the identifier (Triple ID) for the related specimen is defined by ‘UnitAssociation’. Please read the following manual for using the schema within the BioCASe provider software and have a look at a [http://www.dnabank-network.org/Mapping.php mapping example].
+
The DNA Sample means ABCD Unit and the identifier (Triple ID) for the related specimen is defined by ‘UnitAssociation’. Please read the following manual for using the schema within the BioCASE provider software and have a look at a [http://www.dnabank-network.org/Mapping.php mapping example].
 +
 
 +
=TDWG standard=
 +
ABCDDNA has been proposed as a new standard to the TDWG committee (09/01/2010): http://www.tdwg.org/standards/640/
  
 
=Using ABCDDNA=
 
=Using ABCDDNA=
Line 10: Line 14:
  
  
-> View ABCDDNA schema: [http://www.dnabank-network.org/schemas/ABCDDNA/ABCDDNA.html html] or [http://www.dnabank-network.org/schemas/ABCDDNA/ABCDDNA.xml xml].
+
View ABCDDNA schema: [http://www.dnabank-network.org/schemas/ABCDDNA/ABCDDNA.html html] or [http://www.dnabank-network.org/schemas/ABCDDNA/ABCDDNA.xml xml].
-> View DNA part only: [http://www.dnabank-network.org/schemas/ABCDDNA/DNA.html html] or [http://www.dnabank-network.org/schemas/ABCDDNA/DNA.xml xml].
+
 
 +
View DNA part only: [http://www.dnabank-network.org/schemas/ABCDDNA/DNA.html html] or [http://www.dnabank-network.org/schemas/ABCDDNA/DNA.xml xml].
  
 
The following requirements have to be met to use the schema:
 
The following requirements have to be met to use the schema:
 
   
 
   
==1. Installation of BioCASe Provider Software (BPS)==
+
==1. Installation of BioCASE Provider Software (BPS)==
 
The installation is documented at the BPS Wiki: http://wiki.bgbm.org/bps/index.php/Main_Page.
 
The installation is documented at the BPS Wiki: http://wiki.bgbm.org/bps/index.php/Main_Page.
 
<div id="wikinote><span style="color:red;">'''Note:'''</span> ''If the software package DiGIR is already implemented on your specimen database it is possible to run both installations in parallel!''</div>
 
<div id="wikinote><span style="color:red;">'''Note:'''</span> ''If the software package DiGIR is already implemented on your specimen database it is possible to run both installations in parallel!''</div>
  
 
==2. ABCDDNA-Template==
 
==2. ABCDDNA-Template==
The current BioCASe Provider Software package (version 2.6.1, see [http://www.biocase.org/products/provider_software BPS download section]) includes the ABCDDNA template! If you are using an older version please upgrade your BPS!
+
The current BioCASE Provider Software package (version 2.6.1, see [http://www.biocase.org/products/provider_software BPS download section]) includes the ABCDDNA template! If you are using an older version please upgrade your BPS!
  
 
==3. Creating a new datasource==
 
==3. Creating a new datasource==
If you want to use the ABCDDNA schema for mapping you have to create a new datasource connection. You can use the [http://wiki.bgbm.org/bps/index.php/Main_Page BioCASe documentation] for assistance or [[Special:Contact | contact the DNA Bank Network team]].
+
If you want to use the ABCDDNA schema for mapping you have to create a new datasource connection. You can use the [http://wiki.bgbm.org/bps/index.php/Main_Page BioCASE documentation] for assistance or [[Special:Contact | contact the DNA Bank Network team]].
  
 
After you have created a new datasource you have to declare the database connection and the database structure.
 
After you have created a new datasource you have to declare the database connection and the database structure.
Line 41: Line 46:
 
<div id="wikinote"><span style="color:red;">'''Note:'''</span> Please don’t map DNA data only. To verify the related specimen it is absolutely '''essential''' to map both the '''identification part''' (Taxon Name) and the '''UnitAssociation'''. It is furthermore advantageous to map few gathering attributes such as CountryName or ISO-Code.</div>
 
<div id="wikinote"><span style="color:red;">'''Note:'''</span> Please don’t map DNA data only. To verify the related specimen it is absolutely '''essential''' to map both the '''identification part''' (Taxon Name) and the '''UnitAssociation'''. It is furthermore advantageous to map few gathering attributes such as CountryName or ISO-Code.</div>
  
'''UnitAssociation (Triple Identifier of the related specimen):'''
+
===UnitAssociation (Triple Identifier of the related specimen)===
 
[[File:UnitAssociation.png]]
 
[[File:UnitAssociation.png]]
  
 
'''Fig. 4. UnitAssociation in ABCD schema. All attributes are required. It has to be mapped similar to the Triple Identifier of the DNA sample.
 
'''Fig. 4. UnitAssociation in ABCD schema. All attributes are required. It has to be mapped similar to the Triple Identifier of the DNA sample.
 
'''
 
'''
 +
 
'''AssociatedUnitSourceInstitutionCode''' → Code or abbreviation of the institution the specimen/voucher of the DNA sample is deposited.
 
'''AssociatedUnitSourceInstitutionCode''' → Code or abbreviation of the institution the specimen/voucher of the DNA sample is deposited.
  
Line 55: Line 61:
  
 
These three listed attributes of the "UnitAssociation" and the Triple Identifier used in the original specimen database should have the same values. They can than be used as GUIDs ([http://en.wikipedia.org/wiki/Globally_Unique_Identifier Globally Unique Identifier]) since both values describe the same specimen.
 
These three listed attributes of the "UnitAssociation" and the Triple Identifier used in the original specimen database should have the same values. They can than be used as GUIDs ([http://en.wikipedia.org/wiki/Globally_Unique_Identifier Globally Unique Identifier]) since both values describe the same specimen.
 +
 +
<div id="frame">
 +
'''Example:'''
 +
 +
DNA sample taken from Ballota itegrifolia deposited in the "Herbarium Berolinense" at the Botanic Garden and Botanical Museum Berlin-Dahlem with Barcode number "B 10 0140204".
 +
 +
AssociatedUnitSourceInstitutionCode → BGBM
 +
 +
AssociatedUnitSourceName → Herbarium Berolinense
 +
 +
AssociatedUnitID → B 10 0140204
 +
 +
Comment → http://ww3.bgbm.org/biocase/pywrapper.cgi?dsa=Herbar
 +
 +
That Triple Identifier equals to the Triple Identifier used in the original specimen database ([http://search.biocase.org/europe/search/units/details/getDetails?institutionID=BGBM&collectionID=Herbarium+Berolinense&unitID=B+10+0140204&resourceKey=1095&bioKey=144840874 verify with BioCASE portal]).</div>
 +
 +
<div id="wikinote"><span style="color:red;">'''Note:'''</span> It is essential to map '''"AssociationType"''' describing the relation between a DNA sample and its voucher/specimen, e.g. "DNA from specimen", "DNA and specimen from the same population", "DNA from cultivated offspring of the specimen" or the like.</div>
 +
 +
<div id="wikinote"><span style="color:red;">'''Note:'''</span> Please pay attention to '''special characters''' like µ, ä, ö, ü, ß! They have to be translated into Unicode. E.g. a common mistake is to complete the field Unit@Concentration with “ng/µl” when '''“ng/ &#181;l”''' should be used instead.</div>
 +
 +
It is no problem if special characters are used in your database. But applied in the mapping scheme these characters constantly produce “fatal python errors”! To fix such a fatal mapping mistake you have to close the browser window and open it again. So, you will loose your changes if they are not saved. There’s no way to annul that error by going backwards in your browser!
 +
 +
'''Please have a look at a [[Mapping example]].'''
 +
 +
Don't forget to set the '''"Root table alias"''' and the '''"Static table alias"'''. The static table mostly equals the table for metadata and contains only one row.
 +
 +
Please save your mapping to finish the process!
 +
 +
==5. Test Mapping==
 +
Please press "Test Mapping!" to execute a capability test as well as a scan and search test for ABCD 2.06. If you get the message. You can start a test search if you receive the message "No errors found!" during all three tests. Therefore, click on "QueryForms" and select "ABCD2 search". The following code should than be visible:
 +
 +
<code>
 +
<filter>
 +
<like path='/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString'>A*</like>
 +
</filter>
 +
</code>
 +
 +
Please press "Submit" (on the top of the page) to send the query to your database. You will hopefully see some results in xml. Otherwise you should check the search key (A* searches for taxa beginning with A.) or change the debugging level for more detailed error messages.
 +
 +
 +
[[Category:Terms documentations]]

Latest revision as of 12:21, 9 January 2023

DNA extension for ABCD - General information & installation manual

Currently only BioCASe and ABCD can be used to provide data via the DNA Bank Network. In addition to the mapping of the specimens, a second mapping with the BioCASE Provider Software (wrapper) must be set up on every connected database in order to offer DNA samples and their related specimen data on the central web portal (http://www.dnabank-network.org). Although the ABCD 2.06 schema (http://wiki.tdwg.org/twiki/bin/view/ABCD/), which is part of the BioCASE wrapper, is currently preferred, its existing part for DNA ‘Sequences’ lacks important features and should no longer be used. The whole ABCD 2.06 schema is available at http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/ABCD_2.06.XSD. If you want to learn more about BioCASE and ABCD, please follow the links.

ABCD offers two options for adding supplementary content: ‘MeasurementsOrFacts’ and ‘UnitExtensions’. Since the hierarchical structure of DNA-specific features is too complex for ‘MeasurementsOrFacts’, we decided to use ‘UnitExtensions’ to integrate an XML schema definition for DNA data, similar to the ABCDEFG extension for geosciences (http://www.geocase.eu/). This new DNA extension for ABCD 2.06 is called ABCDDNA.

Each DNA sample corresponds to one ABCD Unit, and the identifier (Triple ID) of the related specimen is defined by ‘UnitAssociation’. Please read the following manual for using the schema within the BioCASE provider software and have a look at the mapping example (http://www.dnabank-network.org/Mapping.php).

TDWG standard

ABCDDNA has been proposed as a new standard to the TDWG committee (09/01/2010): http://www.tdwg.org/standards/640/

Using ABCDDNA

The basic ABCD 2.06 version remains unmodified. The newly created DNA schema covers more than 76 elements, such as “ExtractionDate” and “ExtractionMethod” for DNA extraction, as well as an “Amplification-Container” for Sequences, GenBankNumbers, CloneStrain etc.
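To illustrate how DNA data can sit inside an otherwise unmodified ABCD Unit, the following sketch builds a Unit with a UnitExtension in Python. It is only an illustration: the DNA namespace URI, the wrapper element name `DNASample`, the nesting of the DNA elements and the sample values are assumptions made for this example, not taken from the ABCDDNA schema. Please consult the schema files linked below for the real structure.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for ABCD 2.06; check it against the XSD before relying on it.
ABCD_NS = "http://www.tdwg.org/schemas/abcd/2.06"
# Placeholder namespace for the DNA extension (hypothetical, for illustration only).
DNA_NS = "urn:example:abcddna"


def build_unit_with_dna(unit_id, extraction_date, extraction_method):
    """Return an ABCD Unit element whose UnitExtension carries DNA data."""
    unit = ET.Element(f"{{{ABCD_NS}}}Unit")
    ET.SubElement(unit, f"{{{ABCD_NS}}}UnitID").text = unit_id

    # The core ABCD part stays untouched; DNA data is attached via UnitExtension.
    extension = ET.SubElement(unit, f"{{{ABCD_NS}}}UnitExtension")

    # "DNASample" is a hypothetical wrapper name; the element names below it
    # are the ones mentioned in the text above.
    dna = ET.SubElement(extension, f"{{{DNA_NS}}}DNASample")
    ET.SubElement(dna, f"{{{DNA_NS}}}ExtractionDate").text = extraction_date
    ET.SubElement(dna, f"{{{DNA_NS}}}ExtractionMethod").text = extraction_method
    return unit


# Invented sample values, not taken from any real database.
unit = build_unit_with_dna("DB 1234", "2010-01-09", "CTAB")
print(ET.tostring(unit, encoding="unicode"))
```

The point of the sketch is the layering: everything outside UnitExtension is plain ABCD 2.06, so existing ABCD consumers can still read the Unit.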


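→ View ABCDDNA schema: html (http://www.dnabank-network.org/schemas/ABCDDNA/ABCDDNA.html) or xml (http://www.dnabank-network.org/schemas/ABCDDNA/ABCDDNA.xml).

→ View DNA part only: html (http://www.dnabank-network.org/schemas/ABCDDNA/DNA.html) or xml (http://www.dnabank-network.org/schemas/ABCDDNA/DNA.xml).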

The following requirements have to be met to use the schema:

1. Installation of BioCASE Provider Software (BPS)

The installation is documented at the BPS Wiki: http://wiki.bgbm.org/bps/index.php/Main_Page.

Note: If the software package DiGIR is already installed for your specimen database, both installations can be run in parallel!

2. ABCDDNA-Template

The current BioCASE Provider Software package (version 2.6.1, see the BPS download section at http://www.biocase.org/products/provider_software) includes the ABCDDNA template. If you are using an older version, please upgrade your BPS!

3. Creating a new datasource

If you want to use the ABCDDNA schema for mapping you have to create a new datasource connection. You can use the BioCASE documentation for assistance or contact the DNA Bank Network team.

After you have created a new datasource you have to declare the database connection and the database structure. Then you can select a schema (ABCDDNA) and click on “Create”.

4. Mapping the DNA database

The first time you open the mapping page, only a few fields are visible. Select “Show all concepts” and press “Refresh” to see all available features.

Since the DNA sample corresponds to the ABCD Unit, the Triple Identifier (composed of UnitID, SourceInstitutionID and SourceID) has to be mapped as follows:

UnitID → DNA Bank Number (unique DNA Number in your database but NOT the ID in the DNA table)

SourceInstitutionID → Code or abbreviation of the institution where the DNA bank is located, e.g. BGBM for Botanic Garden and Botanical Museum Berlin-Dahlem

SourceID → Designation of the collection, e.g. DNA bank
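The mapping above can be sketched in Python as follows. This is a minimal illustration, assuming a hypothetical DNA table with an internal key column `id` and a separate `dna_bank_number` column; both column names and the sample row are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TripleID:
    """Triple Identifier of a DNA sample as mapped to the ABCD Unit."""
    unit_id: str
    source_institution_id: str
    source_id: str


def triple_id_from_row(row, institution="BGBM", collection="DNA bank"):
    """Build the Triple Identifier from one row of the DNA table.

    UnitID is taken from the DNA bank number, NOT from the internal
    primary key of the table (column names here are hypothetical).
    """
    return TripleID(
        unit_id=row["dna_bank_number"],
        source_institution_id=institution,
        source_id=collection,
    )


# Invented sample row: "id" is the internal key, "dna_bank_number" the public number.
row = {"id": 17, "dna_bank_number": "DB 1234"}
sample = triple_id_from_row(row)
print(sample)
```

Keeping the internal key out of UnitID means the identifier stays stable even if the table is rebuilt or re-imported.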

Note: Please don’t map DNA data only. To verify the related specimen it is absolutely essential to map both the identification part (Taxon Name) and the UnitAssociation. It is furthermore advantageous to map a few gathering attributes such as CountryName or ISO-Code.

UnitAssociation (Triple Identifier of the related specimen)

UnitAssociation.png

Fig. 4. UnitAssociation in ABCD schema. All attributes are required. It has to be mapped similarly to the Triple Identifier of the DNA sample.

AssociatedUnitSourceInstitutionCode → Code or abbreviation of the institution in which the specimen/voucher of the DNA sample is deposited.

AssociatedUnitSourceName → Designation of the collection in which the specimen/voucher of the DNA sample is deposited.

AssociatedUnitID → Unique identifier/number of the specimen/voucher of the DNA sample within the specimen database.

Comment → WrapperURL of the specimen database.

The first three attributes of the "UnitAssociation" listed above and the Triple Identifier used in the original specimen database should have the same values. They can then be used as GUIDs (Globally Unique Identifier, http://en.wikipedia.org/wiki/Globally_Unique_Identifier) since both sets of values describe the same specimen.
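As a sketch of this consistency check, the following Python function compares a UnitAssociation with the Triple Identifier of a specimen record. The dictionary layout is an assumption made for this illustration. The values are those of the Herbarium Berolinense example in this section.

```python
# Pairs of (UnitAssociation attribute, Triple Identifier attribute of the specimen).
FIELD_PAIRS = [
    ("AssociatedUnitSourceInstitutionCode", "SourceInstitutionID"),
    ("AssociatedUnitSourceName", "SourceID"),
    ("AssociatedUnitID", "UnitID"),
]


def association_matches(association, specimen):
    """Return True if all three identifier parts are equal after trimming whitespace."""
    return all(
        association[assoc_key].strip() == specimen[spec_key].strip()
        for assoc_key, spec_key in FIELD_PAIRS
    )


association = {
    "AssociatedUnitSourceInstitutionCode": "BGBM",
    "AssociatedUnitSourceName": "Herbarium Berolinense",
    "AssociatedUnitID": "B 10 0140204",
}
specimen = {
    "SourceInstitutionID": "BGBM",
    "SourceID": "Herbarium Berolinense",
    "UnitID": "B 10 0140204",
}
print(association_matches(association, specimen))
```

The comparison is deliberately strict apart from surrounding whitespace: a differing abbreviation or barcode format would break the link between DNA sample and voucher.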

Example:

DNA sample taken from Ballota integrifolia deposited in the "Herbarium Berolinense" at the Botanic Garden and Botanical Museum Berlin-Dahlem with barcode number "B 10 0140204".

AssociatedUnitSourceInstitutionCode → BGBM

AssociatedUnitSourceName → Herbarium Berolinense

AssociatedUnitID → B 10 0140204

Comment → http://ww3.bgbm.org/biocase/pywrapper.cgi?dsa=Herbar

That Triple Identifier is equal to the Triple Identifier used in the original specimen database (verify with BioCASE portal).

Note: It is essential to map "AssociationType" describing the relation between a DNA sample and its voucher/specimen, e.g. "DNA from specimen", "DNA and specimen from the same population", "DNA from cultivated offspring of the specimen" or the like.

Note: Please pay attention to special characters like µ, ä, ö, ü, ß! They have to be translated into Unicode character references. E.g. a common mistake is to complete the field Unit@Concentration with “ng/µl” when “ng/ &#181;l” should be used instead.

It is no problem if special characters are used in your database. But when they are entered in the mapping itself, these characters consistently produce “fatal python errors”! To recover from such a fatal mapping mistake you have to close the browser window and open it again, so you will lose any changes that have not been saved. There is no way to undo that error by going back in your browser!
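If you have many literal values to enter, a small Python helper can produce the numeric character references for you. This is a convenience sketch using only the standard library, not part of the BioCASE provider software. The function name is made up for this example.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character by its numeric character reference."""
    # "xmlcharrefreplace" is a standard Python error handler for encoding:
    # characters that cannot be encoded as ASCII become &#NNN; references.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# \u00b5 is the micro sign, \u00e4 is a-umlaut, \u00df is sharp s.
print(to_char_refs("ng/\u00b5l"))
print(to_char_refs("\u00e4 \u00f6 \u00fc \u00df"))
```

Plain ASCII text passes through unchanged, so the helper can be applied to every literal before it is typed into the mapping.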

Please have a look at a Mapping example.

Don't forget to set the "Root table alias" and the "Static table alias". The static table mostly equals the table for metadata and contains only one row.

Please save your mapping to finish the process!

5. Test Mapping

Please press "Test Mapping!" to execute a capability test as well as a scan and a search test for ABCD 2.06. If you receive the message "No errors found!" for all three tests, you can start a test search: click on "QueryForms" and select "ABCD2 search". The following code should then be visible:

<filter>
  <like path='/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString'>A*</like>
</filter>

Please press "Submit" (at the top of the page) to send the query to your database. If the mapping works, you will see some results in XML. Otherwise you should check the search key (A* searches for taxa beginning with A) or raise the debugging level for more detailed error messages.
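To try other search keys without editing the XML by hand, the filter shown in the query form can be generated with a few lines of Python. This sketch only builds the filter fragment. It does not send anything to the wrapper, and the function name and the search key "Ballota*" are chosen freely for this example.

```python
import xml.etree.ElementTree as ET

# Concept path used by the default test query of the query form.
NAME_PATH = (
    "/DataSets/DataSet/Units/Unit/Identifications/Identification/"
    "Result/TaxonIdentified/ScientificName/FullScientificNameString"
)


def build_like_filter(pattern, path=NAME_PATH):
    """Return a <filter> fragment with a single <like> condition as a string."""
    flt = ET.Element("filter")
    like = ET.SubElement(flt, "like", {"path": path})
    like.text = pattern  # e.g. "A*" for all taxa beginning with A
    return ET.tostring(flt, encoding="unicode")


filter_xml = build_like_filter("Ballota*")
print(filter_xml)
```

The resulting fragment can be pasted over the filter in the query form before pressing "Submit".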