
What is Darwin Core?

Darwin Core can best be thought of as a set of terms for describing biodiversity data. These terms are given special significance so that their meaning is comparable between different data sets. Every data set in the ecological sciences is unique, and each will have features that require some explanation when the data are provided to other parties. The Darwin Core standard is a way of describing the features of your data set using a controlled vocabulary, so that when you refer to feature ‘x’ in your data set, ‘x’ can be understood to mean a particular well-defined and standardised thing. Darwin Core is not the only standard used for the description of biological data. There are plenty of others, and many, including Darwin Core itself, are maintained by working groups associated with Biodiversity Information Standards (TDWG), a voluntary organization formerly called the Taxonomic Databases Working Group. Darwin Core is the standard that forms the basis for the Global Biodiversity Information Facility (GBIF) and, for many researchers, GBIF will be the most common way to make their data findable and sharable with the wider community.

Darwin Core Archive Files

The Darwin Core standard itself should not be confused with Darwin Core archive files. A Darwin Core archive file is a file format that bundles together data and the metadata describing those data into one package. The data in these archive files are described according to the Darwin Core standard (hence the name) and, as a result, the archive is a convenient standards-compliant format for sharing biodiversity data according to FAIR principles. Darwin Core archive files are the preferred method of publishing data in GBIF, and a full how-to guide to producing Darwin Core archive files can be found on the GBIF Integrated Publishing Toolkit website. This document, however, demonstrates an alternative solution provided by the Living Norway R package, which can be used both to generate your own Darwin Core archive files from an existing data set and to import data from an existing Darwin Core archive file.

A Darwin Core archive file is simply a ZIP file with a specific set of files inside:

  • Metafile The metafile is a file that describes which files exist in the Darwin Core archive file and how the columns in the data files map to Darwin Core terms. This file follows an XML schema described in the Darwin Core text guide. Whilst it is possible to create this file yourself, there are a number of tools that can help users construct it. The most commonly used program is the Integrated Publishing Toolkit (IPT). Another way to generate the metafile automatically is to use the Living Norway package for the R statistical platform when processing your data sets. Later in this document we will guide you through the process of creating a Darwin Core archive file from a set of data tables.

  • Resource metadata This file contains the metadata about the data set, structured according to known standards. For example, it may include a description of the purpose of the study, the sampling methodology used in the study, and the license under which the data can be shared. The format of this file is a flavour of XML, known as Ecological Metadata Language (EML), developed specifically for the handling of metadata in ecological data sets. As with the metafile, there are a number of packages that can be used to help you produce syntactically valid EML. When publishing data through the IPT, a number of user-friendly prompts collect the information needed to create the resource metadata. Later in this document we will demonstrate how to generate valid EML from a markdown document that can also serve as a data paper to document the data set.

  • Data files These are a collection of files that contain the data. The data are expected to be in a text-based delimited format such as comma-separated values (CSV) or tab-separated values. Information such as the encoding of the text and the type of delimiter used in the data files is often stored in the metafile. The first line of the data files may contain column headings.

    • Core data file Every Darwin Core archive needs exactly one core data file. This is the main data table of the data set and serves as the definition of the reference sampling unit for the data set. One column of the core data file must be an ID column, with each row of that column containing an ID code for the record that is unique within the data set. This ID code serves as a reference code for any extension data files included in the archive. Data being published to GBIF can have core data files that belong to one of the following types:

      • Sample event data corresponds to data that have been collected according to a defined protocol or experimental design at particular times or locations. Here the data set is built around a series of sampling events at which measurements are recorded.
      • Occurrence data corresponds to sightings or records of species (or other taxon level) that can be assigned to a location or time period. This data type is best used when the data set is a series of sightings or records from a collection strategy that is not systematic or is unknown. Observations from citizen science projects or from digitized museum records often fit this data type.
      • Species checklist data corresponds to a list of named species (or other taxon level) that represents a catalogue for some purpose. This could, for example, be a list of species of particular conservation concern for a particular area or a list of potential invasive species.
    • Extension data files A Darwin Core archive file may optionally have any number of data tables that contain additional information to extend the information in the core data file. For example, in a Darwin Core archive with an event-based core data file, we might also have an accompanying data table that describes the number of individuals of each species found at each of the sampling events. Like the core data file, each extension data table must contain one column that represents an ID column. Unlike in the core data table, however, the extension ID column refers to the IDs in the core data table that the extension data table rows are providing information about. The values in the ID column of an extension data table do not therefore need to be unique, because multiple rows in an extension table may refer to a single entry in the core data table. In the example described above, multiple species may have been recorded at the same sampling event, and the extension data table will therefore contain multiple rows (one for each of the species recorded), each containing the ID for the relevant sampling event in the ID column of the extension data table. For data being submitted for publication to GBIF, extension data tables must belong to one of the supported extension types.
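
The relationship between core and extension ID columns described above can be sketched in base R. This is only an illustration under stated assumptions: the table contents, the object names ('eventCore', 'occurrenceExt') and the ID values are invented for the example and are not taken from any real archive or from the Living Norway package.

```r
# Hypothetical event core table: one row per sampling event, IDs must be unique
eventCore <- data.frame(
  id = c("EV1", "EV2"),
  eventDate = c("2021-06-01", "2021-06-02"),
  stringsAsFactors = FALSE
)
# Hypothetical occurrence extension table: several rows may share one core ID
occurrenceExt <- data.frame(
  id = c("EV1", "EV1", "EV2"),
  scientificName = c("Gavia stellata", "Parus major", "Parus major"),
  individualCount = c(1, 4, 2),
  stringsAsFactors = FALSE
)
# Core IDs must be unique within the data set
coreIDsUnique <- anyDuplicated(eventCore$id) == 0
# Every extension ID must refer to an ID that exists in the core table
extIDsValid <- all(occurrenceExt$id %in% eventCore$id)
# Number of extension rows attached to each core record
rowsPerEvent <- table(occurrenceExt$id)
# Combine the event information with each occurrence row via the shared ID
joined <- merge(eventCore, occurrenceExt, by = "id")
```

Here the extension table repeats the ID "EV1" because two species were recorded at that event, while the core table lists each event only once.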

To make it easier to both create and manipulate the contents of Darwin Core archive files, the Living Norway package provides a number of functions and tools for data managers and researchers. The aim of the package and this documentation is to provide a more approachable interface for producing Darwin Core archives without requiring extensive knowledge of either the Darwin Core standard or EML, so that the dissemination of data to meet FAIR standards can be more easily integrated into a researcher’s workflow.

Structure of a Darwin Core archive file

The Living Norway R Package

Before we can use any of the functions contained within the Living Norway package we must first install it. At the current time the easiest way to install the Living Norway package is to use the “install_github” function from the devtools package to install it directly from the project’s GitHub repository. We hope to distribute future releases of the Living Norway package over CRAN and, once this is achieved, installation of the Living Norway package will follow the standard R package installation procedure. In the meantime, the following code will install the necessary packages:

# Install the Living Norway package from the GitHub repository
# (uncomment the line below the first time you run this code)
# devtools::install_github("https://github.com/LivingNorway/LivingNorwayR")
# Import the tools into R
library(LivingNorwayR)

Once the Living Norway package is installed and loaded, a number of classes are added to R that allow for the easier manipulation of Darwin Core archive data and the terms associated with them. What do we mean by ‘classes’? All variables in R belong to a particular class of object. You will already be familiar with some of R’s base classes that can be used for the handling of information, such as data frames, lists, and vectors. The Living Norway package simply contains more class definitions that allow for easier manipulation of the information in Darwin Core archive files:

  • DwCTerm Is a class that contains information about terms used in the Darwin Core standard. Nearly all users of the Living Norway package will never need to create their own term objects and will, for the most part, only interact with term objects that have been pre-defined from their description in the Darwin Core standard.
  • DwCGeneric Is a class that contains the information in the Darwin Core archive data tables and also defines the link between columns in the data tables and registered Darwin Core terms. Most users will not use this class directly but will use one of the derived classes that are specialised for particular data table types supported by GBIF. The naming convention for these classes is ‘GBIF’ followed by the GBIF class name (for example, the GBIF implementation of the Event core data table class type is provided by the ‘GBIFEvent’ class).
  • DwCMetadata Is a class for handling metadata relating to the data set. It allows for the easier import and export of EML files and also allows for the creation of EML files from other file formats such as R markdown files. This is described in greater detail later in this document.
  • DwCArchive Is a class that contains an object of a DwCGeneric-derived class for each of the data tables in a Darwin Core archive (one core file plus any number of extension tables), in addition to a DwCMetadata object containing the metadata describing the data tables. The class contains a number of helper functions for importing and exporting data from Darwin Core archive files.
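
As a reminder of what a ‘class’ means in R, the short sketch below inspects the classes of some familiar base objects. It uses only base R functions; the object names are invented for the example, and it makes no assumption about how the Living Norway classes themselves are implemented.

```r
# Some familiar base R objects (names are hypothetical)
exampleTable <- data.frame(x = 1:3)
exampleList <- list(a = 1, b = "two")
exampleVector <- 1:3
# class() reports the class that an object belongs to
tableClass <- class(exampleTable)    # "data.frame"
listClass <- class(exampleList)      # "list"
vectorClass <- class(exampleVector)  # "integer"
# inherits() tests whether an object belongs to a given class
isTable <- inherits(exampleTable, "data.frame")  # TRUE
```

The same two functions can be applied to any object, so they offer a quick way to check which class an object returned by a package function belongs to.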

The event core table type is supported by the ‘GBIFEvent’ class in the Living Norway package. Similarly, the occurrence core table type is supported by the ‘GBIFOccurrence’ class and the species checklist core table type is supported by the ‘GBIFTaxon’ class. The ‘getGBIFCoreClasses()’ function returns a full list of the GBIF core classes handled by the Living Norway R package along with their definition information (represented as ‘DwCTerm’ objects):

## $GBIFEvent
## http://rs.tdwg.org/dwc/terms/Event - Event
## An action that occurs at some location during some time.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://rs.tdwg.org/dwc/terms/Event
##  Version IRI: http://rs.tdwg.org/dwc/terms/version/Event-2018-09-06
##  Type: Class
##  Date modified: 2018-09-06
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-10-26_15
##  Examples:
##      A specimen collection process. A camera trap image capture.  A marine trawl.
##  Miscellaneous information:
##      DataSets/DataSet/Units/Unit/Gathering
## 
## $GBIFOccurrence
## http://rs.tdwg.org/dwc/terms/Occurrence - Occurrence
## An existence of an Organism (sensu http://rs.tdwg.org/dwc/terms/Organism) at a particular place at a particular time.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://rs.tdwg.org/dwc/terms/Occurrence
##  Version IRI: http://rs.tdwg.org/dwc/terms/version/Occurrence-2020-08-20
##  Type: Class
##  Date modified: 2020-08-20
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-10-26_15
##  Examples:
##      A wolf pack on the shore of Kluane Lake in 1988. A virus in a plant leaf in the New York Botanical Garden at 15:29 on 2014-10-23. A fungus in Central Park in the summer of 1929.
##  Miscellaneous information:
##      DataSets/DataSet/Units/Unit
## 
## $GBIFTaxon
## http://rs.tdwg.org/dwc/terms/Taxon - Taxon
## A group of organisms (sensu http://purl.obolibrary.org/obo/OBI_0100026) considered by taxonomists to form a homogeneous unit.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://rs.tdwg.org/dwc/terms/Taxon
##  Version IRI: http://rs.tdwg.org/dwc/terms/version/Taxon-2018-09-06
##  Type: Class
##  Date modified: 2018-09-06
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-10-26_15
##  Examples:
##      The genus Truncorotaloides as published by Brönnimann et al. in 1953 in the Journal of Paleontology Vol. 27(6) p. 817-820.
##  Miscellaneous information:
##      no simple equivalent in ABCD

The Living Norway package also supports a large number of classes to handle the broad types of data tables that can be used as extension tables in Darwin Core archive files submitted to GBIF. The full list of extensions that can be used in GBIF-compliant Darwin Core archive files can be found on the Darwin Core Archive Validator website. The names of the Living Norway classes that can handle these extensions can be found by calling the ‘getGBIFExtensionClasses()’ function.

##  [1] "GBIFMultimedia"                "GBIFChronometricAge"          
##  [3] "GBIFIdentification"            "GBIFMeasurementOrFact"        
##  [5] "GBIFResourceRelationship"      "GBIFEOLMediaExtension"        
##  [7] "GBIFEOLReferencesExtension"    "GBIFDescription"              
##  [9] "GBIFDistribution"              "GBIFdnaDerivedData"           
## [11] "GBIFIdentifier"                "GBIFImage"                    
## [13] "GBIFMultimedia"                "GBIFReference"                
## [15] "GBIFReleve"                    "GBIFSpeciesProfile"           
## [17] "GBIFTypesAndSpecimen"          "GBIFVernacularName"           
## [19] "GBIFGermplasmAccession"        "GBIFMeasurementScore"         
## [21] "GBIFMeasurementTrait"          "GBIFMeasurementTrial"         
## [23] "GBIFAmplification"             "GBIFCloning"                  
## [25] "GBIFGelImage"                  "GBIFLoan"                     
## [27] "GBIFMaterialSample"            "GBIFPermit"                   
## [29] "GBIFPreparation"               "GBIFPreservation"             
## [31] "GBIFGermplasmSample"           "GBIFExtendedMeasurementOrFact"
## [33] "GBIFChronometricAge"           "GBIFChronometricDate"

Importing a Darwin Core Archive File

In order to import a Darwin Core archive file we first need to get hold of one for the data set that we wish to import. Typically these files can be downloaded manually from biodiversity data indexing facilities such as GBIF. For this example we will use a data set from the “Extensive monitoring of breeding birds” program (TOV-E), for which the Darwin Core archive file is housed on the IPT server at the Norwegian Institute for Nature Research. The overview for this data set can be found on the GBIF dataset portal. To minimise the number of files that need to be distributed as part of this exercise, we can use the R code below to download the Darwin Core archive file and store it in a temporary location.

# Create a temporary directory to store intermediate files used in this workshop
tempDirLoc <- tempdir()
# The URL where the Darwin Core file for the TOV-E bird survey data is housed
datasetURL <- "https://ipt.nina.no/archive.do?r=tove_birdsampling"
# Download the Darwin Core file to the temporary directory
localDataLoc <- file.path(tempDirLoc, "TOVEData.zip")
download.file(datasetURL, localDataLoc, mode = "wb")
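
Because a Darwin Core archive is a ZIP file, its contents can be listed without extracting them. The sketch below uses the base R function ‘unzip’ with ‘list = TRUE’. The helper name ‘listArchiveFiles’ is hypothetical and is not part of the Living Norway package, and the listing is only attempted if the download step above has already been run in the current session.

```r
# Hypothetical helper: return the names of the files stored inside a ZIP archive
listArchiveFiles <- function(archivePath) {
  # list = TRUE returns a data frame describing the contents without extracting
  utils::unzip(archivePath, list = TRUE)$Name
}
# Only inspect the archive if the file from the download step is present
if (exists("localDataLoc") && file.exists(localDataLoc)) {
  print(listArchiveFiles(localDataLoc))
}
```

For an archive of the kind described earlier, the listing would be expected to show a metafile, a resource metadata file and one or more data files, although the exact file names depend on how the archive was produced.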

Now that we have the Darwin Core archive file stored locally, we can import it using the ‘initializeDwCArchive’ function in the Living Norway package. This function can be called in one of two ways. Its first argument can be the location of a Darwin Core archive file (with an optional second argument giving a default file encoding for importing the data), which is the easiest way to import data that already exist as a Darwin Core archive. Alternatively, it can be called with Darwin Core tables constructed using the Living Norway package, for the times when you are using the package to construct your own Darwin Core archives. This latter way of calling the function is covered in the later section on archive file creation.

# Create a DwCArchive object from the downloaded Darwin Core archive file
TOVEArchive <- initializeDwCArchive(localDataLoc, "UTF-8")

In the code block above we specify UTF-8 as the file encoding. By default the encoding is set to that of your system and, in most cases, will not need to be changed. In this example, however, the metadata files in the archive contain a number of Norwegian characters that may be imported incorrectly if the system default is used.
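
To see why the encoding matters, the following base R sketch shows what happens when UTF-8 text containing Norwegian characters is read as if it were Latin-1. It is only an illustration of the general problem: the variable names are invented for the example, and it does not reproduce how the Living Norway package reads files internally.

```r
# A surname containing Norwegian characters, written with Unicode escapes
creatorName <- "K\u00e5l\u00e5s"
# The bytes that would be stored in a UTF-8 encoded file
utf8Bytes <- charToRaw(enc2utf8(creatorName))
# Reading those bytes as if they were Latin-1 garbles the non-ASCII characters
misread <- iconv(rawToChar(utf8Bytes), from = "latin1", to = "UTF-8")
# Declaring the correct encoding recovers the original text
correct <- rawToChar(utf8Bytes)
Encoding(correct) <- "UTF-8"
# The original name has 5 characters; the misread version has 7,
# because each two-byte character has been split into two separate characters
nameLength <- nchar(creatorName)
misreadLength <- nchar(misread)
```

Specifying the encoding explicitly when importing the archive avoids this kind of silent corruption on systems whose default encoding is not UTF-8.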

Now that the archive has been imported into a DwCArchive object, we can have a look at a summary of its contents:

# Print a summary of the data in the archive object
TOVEArchive
## METADATA
## 
## Title: TOV-E Bird monitoring sampling data
## Creators: John Atle Kålås (Norwegian Institute for Nature Research), Ingar Jostein Øien (Birdlife Norway), Bård Stokke (Norsk institutt for naturforskning), Roald Vang (Norwegian Institute for Nature Research)
## Abstract: Data from the project "Extensive monitoring of breeding birds (TOV-E)" from 2006 up until today. The project is carried out in cooperation between BirdLife Norway, Norwegian Institute for Nature Research (NINA) and the Norwegian Environment Agency, and is the most important project for monitoring population trends for Norwegian bird species on land.
## 
## 
## CORE TABLE
## 
## Table name: event | ID column: 1 - "id" | Table class: Event
##                                                            columnIndex
## http://purl.org/dc/terms/type                                        2
## http://purl.org/dc/terms/modified                                    3
## http://rs.tdwg.org/dwc/terms/datasetName                             4
## http://rs.tdwg.org/dwc/terms/ownerInstitutionCode                    5
## http://rs.tdwg.org/dwc/terms/informationWithheld                     6
## http://rs.tdwg.org/dwc/terms/dataGeneralizations                     7
## http://rs.tdwg.org/dwc/terms/eventID                                 8
## http://rs.tdwg.org/dwc/terms/samplingProtocol                        9
## http://rs.tdwg.org/dwc/terms/sampleSizeValue                        10
## http://rs.tdwg.org/dwc/terms/sampleSizeUnit                         11
## http://rs.tdwg.org/dwc/terms/samplingEffort                         12
## http://rs.tdwg.org/dwc/terms/eventDate                              13
## http://rs.tdwg.org/dwc/terms/eventTime                              14
## http://rs.tdwg.org/dwc/terms/year                                   15
## http://rs.tdwg.org/dwc/terms/month                                  16
## http://rs.tdwg.org/dwc/terms/day                                    17
## http://rs.tdwg.org/dwc/terms/locationID                             18
## http://rs.tdwg.org/dwc/terms/country                                19
## http://rs.tdwg.org/dwc/terms/countryCode                            20
## http://rs.tdwg.org/dwc/terms/stateProvince                          21
## http://rs.tdwg.org/dwc/terms/municipality                           22
## http://rs.tdwg.org/dwc/terms/locality                               23
## http://rs.tdwg.org/dwc/terms/minimumElevationInMeters               24
## http://rs.tdwg.org/dwc/terms/maximumElevationInMeters               25
## http://rs.tdwg.org/dwc/terms/decimalLatitude                        26
## http://rs.tdwg.org/dwc/terms/decimalLongitude                       27
## http://rs.tdwg.org/dwc/terms/geodeticDatum                          28
## http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters          29
##                                                                               columnName
## http://purl.org/dc/terms/type                                                       type
## http://purl.org/dc/terms/modified                                               modified
## http://rs.tdwg.org/dwc/terms/datasetName                                     datasetName
## http://rs.tdwg.org/dwc/terms/ownerInstitutionCode                   ownerInstitutionCode
## http://rs.tdwg.org/dwc/terms/informationWithheld                     informationWithheld
## http://rs.tdwg.org/dwc/terms/dataGeneralizations                     dataGeneralizations
## http://rs.tdwg.org/dwc/terms/eventID                                             eventID
## http://rs.tdwg.org/dwc/terms/samplingProtocol                           samplingProtocol
## http://rs.tdwg.org/dwc/terms/sampleSizeValue                             sampleSizeValue
## http://rs.tdwg.org/dwc/terms/sampleSizeUnit                               sampleSizeUnit
## http://rs.tdwg.org/dwc/terms/samplingEffort                               samplingEffort
## http://rs.tdwg.org/dwc/terms/eventDate                                         eventDate
## http://rs.tdwg.org/dwc/terms/eventTime                                         eventTime
## http://rs.tdwg.org/dwc/terms/year                                                   year
## http://rs.tdwg.org/dwc/terms/month                                                 month
## http://rs.tdwg.org/dwc/terms/day                                                     day
## http://rs.tdwg.org/dwc/terms/locationID                                       locationID
## http://rs.tdwg.org/dwc/terms/country                                             country
## http://rs.tdwg.org/dwc/terms/countryCode                                     countryCode
## http://rs.tdwg.org/dwc/terms/stateProvince                                 stateProvince
## http://rs.tdwg.org/dwc/terms/municipality                                   municipality
## http://rs.tdwg.org/dwc/terms/locality                                           locality
## http://rs.tdwg.org/dwc/terms/minimumElevationInMeters           minimumElevationInMeters
## http://rs.tdwg.org/dwc/terms/maximumElevationInMeters           maximumElevationInMeters
## http://rs.tdwg.org/dwc/terms/decimalLatitude                             decimalLatitude
## http://rs.tdwg.org/dwc/terms/decimalLongitude                           decimalLongitude
## http://rs.tdwg.org/dwc/terms/geodeticDatum                                 geodeticDatum
## http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters coordinateUncertaintyInMeters
##                                     id  type                modified
## 1 7B888854-72DF-41AF-A09D-4AC963D2B7D2 Event 2012-04-12T15:03:57.710
## 2 E8E6C2D0-CD3C-435E-BB0A-EE521FDEED73 Event     2008-05-30T00:00:00
## 3 C936B129-AEDF-4F15-B394-A67A048AC12B Event 2010-11-11T21:43:04.873
## 4 46775141-2D2F-4F8D-91FD-7D8328EC0D55 Event 2011-05-22T11:02:16.047
## 5 86CB0FA8-23DC-44E2-8130-834EC7EAABAA Event 2013-01-11T16:54:48.147
## 6 AA006F91-EA3C-4382-A30C-56575245A1C2 Event 2013-08-23T09:02:49.753
##             datasetName ownerInstitutionCode
## 1 TOV-E Bird monitoring             NINA/NOF
## 2 TOV-E Bird monitoring             NINA/NOF
## 3 TOV-E Bird monitoring             NINA/NOF
## 4 TOV-E Bird monitoring             NINA/NOF
## 5 TOV-E Bird monitoring             NINA/NOF
## 6 TOV-E Bird monitoring             NINA/NOF
##                                                                                        informationWithheld
## 1 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 2 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 3 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 4 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 5 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 6 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
##                                       dataGeneralizations
## 1 Location laid out randomly within 5km from centre point
## 2 Location laid out randomly within 5km from centre point
## 3 Location laid out randomly within 5km from centre point
## 4 Location laid out randomly within 5km from centre point
## 5 Location laid out randomly within 5km from centre point
## 6 Location laid out randomly within 5km from centre point
##                                eventID
## 1 7B888854-72DF-41AF-A09D-4AC963D2B7D2
## 2 E8E6C2D0-CD3C-435E-BB0A-EE521FDEED73
## 3 C936B129-AEDF-4F15-B394-A67A048AC12B
## 4 46775141-2D2F-4F8D-91FD-7D8328EC0D55
## 5 86CB0FA8-23DC-44E2-8130-834EC7EAABAA
## 6 AA006F91-EA3C-4382-A30C-56575245A1C2
##                                                   samplingProtocol
## 1 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 2 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 3 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 4 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 5 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 6 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
##   sampleSizeValue sampleSizeUnit                 samplingEffort  eventDate
## 1            5366          metre 255 minutes/17 sampling points 2007-06-03
## 2            5366          metre 245 minutes/17 sampling points 2008-05-30
## 3            5366          metre 240 minutes/17 sampling points 2010-05-29
## 4            5366          metre 270 minutes/17 sampling points 2011-05-22
## 5            5366          metre 330 minutes/17 sampling points 2012-06-02
## 6            5366          metre 225 minutes/17 sampling points 2013-06-02
##     eventTime year month day locationID country countryCode stateProvince
## 1  3:55/08:10 2007     6   3        101  Norway          NO         Viken
## 2 04:05/08:10 2008     5  30        101  Norway          NO         Viken
## 3 05:00/09:00 2010     5  29        101  Norway          NO         Viken
## 4 04:00/08:30 2011     5  22        101  Norway          NO         Viken
## 5  4:30/10:00 2012     6   2        101  Norway          NO         Viken
## 6 04:15/08:00 2013     6   2        101  Norway          NO         Viken
##   municipality    locality minimumElevationInMeters maximumElevationInMeters
## 1       Halden Skogskroken                      190                      220
## 2       Halden Skogskroken                      190                      220
## 3       Halden Skogskroken                      190                      220
## 4       Halden Skogskroken                      190                      220
## 5       Halden Skogskroken                      190                      220
## 6       Halden Skogskroken                      190                      220
##   decimalLatitude decimalLongitude geodeticDatum coordinateUncertaintyInMeters
## 1        58.91206         11.62238     EPSG:4326                          5000
## 2        58.91206         11.62238     EPSG:4326                          5000
## 3        58.91206         11.62238     EPSG:4326                          5000
## 4        58.91206         11.62238     EPSG:4326                          5000
## 5        58.91206         11.62238     EPSG:4326                          5000
## 6        58.91206         11.62238     EPSG:4326                          5000
## 
## 
## EXTENSION TABLES
## 
## Table name: occurrence | ID column: 1 - "id" | Table class: Occurrence
##                                                   columnIndex
## http://purl.org/dc/terms/type                               2
## http://purl.org/dc/terms/modified                           3
## http://rs.tdwg.org/dwc/terms/collectionCode                 4
## http://rs.tdwg.org/dwc/terms/basisOfRecord                  5
## http://rs.tdwg.org/dwc/terms/occurrenceID                   6
## http://rs.tdwg.org/dwc/terms/catalogNumber                  7
## http://rs.tdwg.org/dwc/terms/occurrenceRemarks              8
## http://rs.tdwg.org/dwc/terms/organismQuantity               9
## http://rs.tdwg.org/dwc/terms/organismQuantityType          10
## http://rs.tdwg.org/dwc/terms/occurrenceStatus              11
## http://rs.tdwg.org/dwc/terms/eventID                       12
## http://rs.tdwg.org/dwc/terms/scientificName                13
## http://rs.tdwg.org/dwc/terms/kingdom                       14
## http://rs.tdwg.org/dwc/terms/phylum                        15
## http://rs.tdwg.org/dwc/terms/class                         16
## http://rs.tdwg.org/dwc/terms/order                         17
## http://rs.tdwg.org/dwc/terms/family                        18
## http://rs.tdwg.org/dwc/terms/genus                         19
## http://rs.tdwg.org/dwc/terms/specificEpithet               20
## http://rs.tdwg.org/dwc/terms/vernacularName                21
##                                                             columnName
## http://purl.org/dc/terms/type                                     type
## http://purl.org/dc/terms/modified                             modified
## http://rs.tdwg.org/dwc/terms/collectionCode             collectionCode
## http://rs.tdwg.org/dwc/terms/basisOfRecord               basisOfRecord
## http://rs.tdwg.org/dwc/terms/occurrenceID                 occurrenceID
## http://rs.tdwg.org/dwc/terms/catalogNumber               catalogNumber
## http://rs.tdwg.org/dwc/terms/occurrenceRemarks       occurrenceRemarks
## http://rs.tdwg.org/dwc/terms/organismQuantity         organismQuantity
## http://rs.tdwg.org/dwc/terms/organismQuantityType organismQuantityType
## http://rs.tdwg.org/dwc/terms/occurrenceStatus         occurrenceStatus
## http://rs.tdwg.org/dwc/terms/eventID                           eventID
## http://rs.tdwg.org/dwc/terms/scientificName             scientificName
## http://rs.tdwg.org/dwc/terms/kingdom                           kingdom
## http://rs.tdwg.org/dwc/terms/phylum                             phylum
## http://rs.tdwg.org/dwc/terms/class                               class
## http://rs.tdwg.org/dwc/terms/order                               order
## http://rs.tdwg.org/dwc/terms/family                             family
## http://rs.tdwg.org/dwc/terms/genus                               genus
## http://rs.tdwg.org/dwc/terms/specificEpithet           specificEpithet
## http://rs.tdwg.org/dwc/terms/vernacularName             vernacularName
##                                     id       type                modified
## 1 B5538DE6-437D-4B61-9CB0-8AE7CAF4D5A4 Occurrence 2012-06-19T23:33:30.053
## 2 BC7AB7CC-B9D9-433B-95E3-1609CD305E35 Occurrence 2013-07-08T22:06:26.513
## 3 29E73B38-950D-4EE1-A0B6-8443E7C219D2 Occurrence 2018-06-05T17:52:53.957
## 4 87416FAD-8400-48BF-B523-0BD0DF22ACDC Occurrence 2019-08-02T10:49:23.403
## 5 8640104C-3DA0-45BB-B6FE-C9522D58CEAA Occurrence 2021-10-20T22:45:15.517
## 6 860EEAB7-71E5-455B-9A7D-C332AD49E093 Occurrence 2021-08-16T15:50:41.977
##                                 collectionCode    basisOfRecord
## 1 TOV - Extensive monitoring of bird in Norway HumanObservation
## 2 TOV - Extensive monitoring of bird in Norway HumanObservation
## 3 TOV - Extensive monitoring of bird in Norway HumanObservation
## 4 TOV - Extensive monitoring of bird in Norway HumanObservation
## 5 TOV - Extensive monitoring of bird in Norway HumanObservation
## 6 TOV - Extensive monitoring of bird in Norway HumanObservation
##                        occurrenceID catalogNumber
## 1 urn:catalog:NINA:TOV-E:2012180410    2012180410
## 2 urn:catalog:NINA:TOV-E:2013180410    2013180410
## 3 urn:catalog:NINA:TOV-E:2018180410    2018180410
## 4 urn:catalog:NINA:TOV-E:2019180410    2019180410
## 5 urn:catalog:NINA:TOV-E:2020180410    2020180410
## 6 urn:catalog:NINA:TOV-E:2021180410    2021180410
##                                                                occurrenceRemarks
## 1 aggregated count for 15 sampling points situated less than 5000m from location
## 2 aggregated count for 15 sampling points situated less than 5000m from location
## 3 aggregated count for 15 sampling points situated less than 5000m from location
## 4 aggregated count for 15 sampling points situated less than 5000m from location
## 5 aggregated count for 15 sampling points situated less than 5000m from location
## 6 aggregated count for 15 sampling points situated less than 5000m from location
##   organismQuantity organismQuantityType occurrenceStatus
## 1                1                pairs          PRESENT
## 2                1                pairs          PRESENT
## 3                2                pairs          PRESENT
## 4                2                pairs          PRESENT
## 5                3                pairs          PRESENT
## 6                1                pairs          PRESENT
##                                eventID scientificName  kingdom   phylum class
## 1 B5538DE6-437D-4B61-9CB0-8AE7CAF4D5A4 Gavia stellata Animalia Chordata  Aves
## 2 BC7AB7CC-B9D9-433B-95E3-1609CD305E35 Gavia stellata Animalia Chordata  Aves
## 3 29E73B38-950D-4EE1-A0B6-8443E7C219D2 Gavia stellata Animalia Chordata  Aves
## 4 87416FAD-8400-48BF-B523-0BD0DF22ACDC Gavia stellata Animalia Chordata  Aves
## 5 8640104C-3DA0-45BB-B6FE-C9522D58CEAA Gavia stellata Animalia Chordata  Aves
## 6 860EEAB7-71E5-455B-9A7D-C332AD49E093 Gavia stellata Animalia Chordata  Aves
##         order   family genus specificEpithet vernacularName
## 1 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 2 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 3 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 4 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 5 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 6 Gaviiformes Gaviidae Gavia        stellata         Smålom

The top section of the output gives a brief summary of the metadata associated with the project. The metadata is followed by a summary of each data table in the archive: first the core data table and then each extension data table. The first line of each table summary gives the name of the file in the archive (in this case the core file is called ‘event’), followed by the column that serves as the ID column for the data table, and finally the class of the data table (in this example the core data table is of type ‘Event’, which corresponds to the GBIF event core category of archive files). Below this line is a list of the columns in the data table that have been linked to a standardised definition such as a Darwin Core term. These are columns whose meaning corresponds to a definition set by a standardisation committee and for which the definition is publicly accessible. Each row of this definition summary starts with the location where the term definition can be found (in most cases an Internationalized Resource Identifier), followed by the column number and, if the data table has column names, the name of the corresponding column. Below the definition list is a snippet of the data contained in the data table (usually just the first six rows).

We can select the core data table from the Darwin Core archive and retrieve it as a data frame. Information is extracted from objects through the R6 ‘method functions’ defined for their classes. Under this object model, a method function is called by writing the object, then the ‘$’ symbol, then the name of the method function. For example, the DwCArchive class has the method function ‘getCoreTable’, which extracts the core table from the archive object. Once a data table has been extracted, the method functions defined by the DwCGeneric class allow further manipulation of the individual data table.

# Retrieve the core table from the archive object
TOVEEventTable <- TOVEArchive$getCoreTable()
# TOVEEventTable is an object of type GBIFEvent, which is derived from DwCGeneric
class(TOVEEventTable)
## [1] "GBIFEvent"  "DwCGeneric" "R6"
# Export the contents of the event table to a data frame
TOVEEventTableDF <- TOVEEventTable$exportAsDataFrame()
# Let's look at the top few rows of the data frame extracted from the core data table
head(TOVEEventTableDF)
##                                     id  type                modified
## 1 7B888854-72DF-41AF-A09D-4AC963D2B7D2 Event 2012-04-12T15:03:57.710
## 2 E8E6C2D0-CD3C-435E-BB0A-EE521FDEED73 Event     2008-05-30T00:00:00
## 3 C936B129-AEDF-4F15-B394-A67A048AC12B Event 2010-11-11T21:43:04.873
## 4 46775141-2D2F-4F8D-91FD-7D8328EC0D55 Event 2011-05-22T11:02:16.047
## 5 86CB0FA8-23DC-44E2-8130-834EC7EAABAA Event 2013-01-11T16:54:48.147
## 6 AA006F91-EA3C-4382-A30C-56575245A1C2 Event 2013-08-23T09:02:49.753
##             datasetName ownerInstitutionCode
## 1 TOV-E Bird monitoring             NINA/NOF
## 2 TOV-E Bird monitoring             NINA/NOF
## 3 TOV-E Bird monitoring             NINA/NOF
## 4 TOV-E Bird monitoring             NINA/NOF
## 5 TOV-E Bird monitoring             NINA/NOF
## 6 TOV-E Bird monitoring             NINA/NOF
##                                                                                        informationWithheld
## 1 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 2 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 3 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 4 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 5 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
## 6 Exact location withheld to protect long-term study. Observers identity withheld due to GDPR restrictions
##                                       dataGeneralizations
## 1 Location laid out randomly within 5km from centre point
## 2 Location laid out randomly within 5km from centre point
## 3 Location laid out randomly within 5km from centre point
## 4 Location laid out randomly within 5km from centre point
## 5 Location laid out randomly within 5km from centre point
## 6 Location laid out randomly within 5km from centre point
##                                eventID
## 1 7B888854-72DF-41AF-A09D-4AC963D2B7D2
## 2 E8E6C2D0-CD3C-435E-BB0A-EE521FDEED73
## 3 C936B129-AEDF-4F15-B394-A67A048AC12B
## 4 46775141-2D2F-4F8D-91FD-7D8328EC0D55
## 5 86CB0FA8-23DC-44E2-8130-834EC7EAABAA
## 6 AA006F91-EA3C-4382-A30C-56575245A1C2
##                                                   samplingProtocol
## 1 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 2 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 3 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 4 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 5 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
## 6 https://tov-e.nina.no/Fugl/public/papirskjema/MethodologyEng.pdf
##   sampleSizeValue sampleSizeUnit                 samplingEffort  eventDate
## 1            5366          metre 255 minutes/17 sampling points 2007-06-03
## 2            5366          metre 245 minutes/17 sampling points 2008-05-30
## 3            5366          metre 240 minutes/17 sampling points 2010-05-29
## 4            5366          metre 270 minutes/17 sampling points 2011-05-22
## 5            5366          metre 330 minutes/17 sampling points 2012-06-02
## 6            5366          metre 225 minutes/17 sampling points 2013-06-02
##     eventTime year month day locationID country countryCode stateProvince
## 1  3:55/08:10 2007     6   3        101  Norway          NO         Viken
## 2 04:05/08:10 2008     5  30        101  Norway          NO         Viken
## 3 05:00/09:00 2010     5  29        101  Norway          NO         Viken
## 4 04:00/08:30 2011     5  22        101  Norway          NO         Viken
## 5  4:30/10:00 2012     6   2        101  Norway          NO         Viken
## 6 04:15/08:00 2013     6   2        101  Norway          NO         Viken
##   municipality    locality minimumElevationInMeters maximumElevationInMeters
## 1       Halden Skogskroken                      190                      220
## 2       Halden Skogskroken                      190                      220
## 3       Halden Skogskroken                      190                      220
## 4       Halden Skogskroken                      190                      220
## 5       Halden Skogskroken                      190                      220
## 6       Halden Skogskroken                      190                      220
##   decimalLatitude decimalLongitude geodeticDatum coordinateUncertaintyInMeters
## 1        58.91206         11.62238     EPSG:4326                          5000
## 2        58.91206         11.62238     EPSG:4326                          5000
## 3        58.91206         11.62238     EPSG:4326                          5000
## 4        58.91206         11.62238     EPSG:4326                          5000
## 5        58.91206         11.62238     EPSG:4326                          5000
## 6        58.91206         11.62238     EPSG:4326                          5000
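
The ‘$’ calling convention described above can be tried out on a small stand-alone class. The following is a minimal sketch only: ‘ToyTable’ and its methods are hypothetical and are not part of the Living Norway package; they simply mimic the pattern of calling a method function on an R6 object. It assumes the R6 package is installed.

```r
library(R6)

# Toy class (hypothetical, not part of the Living Norway package) that wraps
# a data frame and exposes it through method functions
ToyTable <- R6Class("ToyTable",
  public = list(
    tableData = NULL,
    # Store the data frame when the object is created
    initialize = function(tableData) {
      self$tableData <- tableData
    },
    # Return the wrapped data frame
    exportAsDataFrame = function() {
      self$tableData
    },
    # Return the number of rows in the wrapped data frame
    getNumRows = function() {
      nrow(self$tableData)
    }
  )
)

# Create an object and call its method functions with the '$' notation
toyTable <- ToyTable$new(data.frame(id = c("a", "b", "c"), year = c(2007, 2008, 2010)))
toyTableDF <- toyTable$exportAsDataFrame()
toyNumRows <- toyTable$getNumRows()
# The class vector lists the class of the object followed by "R6"
class(toyTable)
```

As with the archive objects above, the object carries both its data and the functions that operate on that data, so no separate function taking the table as an argument is needed.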

Similarly, extension tables can be extracted from the archive by using the ‘getExtensionTables’ method function of the DwCArchive class. This function has one argument, which is either an integer vector containing the indices of the extension tables (in the order that they are displayed in the summary of the archive file) or a character vector giving the names of the tables in the archive file (these names are found in the first summary line of each data table when displaying the archive summary). The function returns a list of the requested data tables. Therefore, to extract just the one we want, we need to index the returned list with the ‘[[1]]’ notation to obtain its first element.

# Retrieve the extension table from the archive object: two ways to do this
# 1. Retrieve the extension table by using its index
TOVEOccTable <- TOVEArchive$getExtensionTables(1)[[1]]
# 2. Retrieve the extension table by using its name
TOVEOccTable <- TOVEArchive$getExtensionTables("occurrence")[[1]]
# The getExtensionTables function returns a list of the data tables that are requested. Therefore, to extract just the first element of this list we need
# to use the extra '[[1]]' list extraction notation.
# TOVEOccTable is an object of type GBIFOccurrence, which is derived from DwCGeneric
class(TOVEOccTable)
## [1] "GBIFOccurrence" "DwCGeneric"     "R6"
# Export the contents of the occurrence table to a data frame
TOVEOccTableDF <- TOVEOccTable$exportAsDataFrame()
# Let's look at the top few rows of the data frame extracted from the extension data table
head(TOVEOccTableDF)
##                                     id       type                modified
## 1 B5538DE6-437D-4B61-9CB0-8AE7CAF4D5A4 Occurrence 2012-06-19T23:33:30.053
## 2 BC7AB7CC-B9D9-433B-95E3-1609CD305E35 Occurrence 2013-07-08T22:06:26.513
## 3 29E73B38-950D-4EE1-A0B6-8443E7C219D2 Occurrence 2018-06-05T17:52:53.957
## 4 87416FAD-8400-48BF-B523-0BD0DF22ACDC Occurrence 2019-08-02T10:49:23.403
## 5 8640104C-3DA0-45BB-B6FE-C9522D58CEAA Occurrence 2021-10-20T22:45:15.517
## 6 860EEAB7-71E5-455B-9A7D-C332AD49E093 Occurrence 2021-08-16T15:50:41.977
##                                 collectionCode    basisOfRecord
## 1 TOV - Extensive monitoring of bird in Norway HumanObservation
## 2 TOV - Extensive monitoring of bird in Norway HumanObservation
## 3 TOV - Extensive monitoring of bird in Norway HumanObservation
## 4 TOV - Extensive monitoring of bird in Norway HumanObservation
## 5 TOV - Extensive monitoring of bird in Norway HumanObservation
## 6 TOV - Extensive monitoring of bird in Norway HumanObservation
##                        occurrenceID catalogNumber
## 1 urn:catalog:NINA:TOV-E:2012180410    2012180410
## 2 urn:catalog:NINA:TOV-E:2013180410    2013180410
## 3 urn:catalog:NINA:TOV-E:2018180410    2018180410
## 4 urn:catalog:NINA:TOV-E:2019180410    2019180410
## 5 urn:catalog:NINA:TOV-E:2020180410    2020180410
## 6 urn:catalog:NINA:TOV-E:2021180410    2021180410
##                                                                occurrenceRemarks
## 1 aggregated count for 15 sampling points situated less than 5000m from location
## 2 aggregated count for 15 sampling points situated less than 5000m from location
## 3 aggregated count for 15 sampling points situated less than 5000m from location
## 4 aggregated count for 15 sampling points situated less than 5000m from location
## 5 aggregated count for 15 sampling points situated less than 5000m from location
## 6 aggregated count for 15 sampling points situated less than 5000m from location
##   organismQuantity organismQuantityType occurrenceStatus
## 1                1                pairs          PRESENT
## 2                1                pairs          PRESENT
## 3                2                pairs          PRESENT
## 4                2                pairs          PRESENT
## 5                3                pairs          PRESENT
## 6                1                pairs          PRESENT
##                                eventID scientificName  kingdom   phylum class
## 1 B5538DE6-437D-4B61-9CB0-8AE7CAF4D5A4 Gavia stellata Animalia Chordata  Aves
## 2 BC7AB7CC-B9D9-433B-95E3-1609CD305E35 Gavia stellata Animalia Chordata  Aves
## 3 29E73B38-950D-4EE1-A0B6-8443E7C219D2 Gavia stellata Animalia Chordata  Aves
## 4 87416FAD-8400-48BF-B523-0BD0DF22ACDC Gavia stellata Animalia Chordata  Aves
## 5 8640104C-3DA0-45BB-B6FE-C9522D58CEAA Gavia stellata Animalia Chordata  Aves
## 6 860EEAB7-71E5-455B-9A7D-C332AD49E093 Gavia stellata Animalia Chordata  Aves
##         order   family genus specificEpithet vernacularName
## 1 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 2 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 3 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 4 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 5 Gaviiformes Gaviidae Gavia        stellata         Smålom
## 6 Gaviiformes Gaviidae Gavia        stellata         Smålom
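
The difference between single and double brackets when indexing a list is easy to overlook, so here is a short base R sketch. The list ‘toyTables’ is a hypothetical stand-in for a list of extension tables (the ‘measurement’ table is invented for illustration); no Living Norway functions are used.

```r
# Toy stand-in for a list of extension tables (hypothetical contents)
toyTables <- list(
  occurrence = data.frame(id = c("a", "b"), organismQuantity = c(1, 2)),
  measurement = data.frame(id = c("a", "b"), measurementValue = c(3.5, 4.1))
)

# Single brackets keep the list wrapper: the result is a list of length one
wrapped <- toyTables[1]
class(wrapped)

# Double brackets extract the element itself, here a data frame
extracted <- toyTables[[1]]
class(extracted)

# Elements can also be extracted by name
sameTable <- identical(toyTables[["occurrence"]], extracted)
```

This is why the ‘[[1]]’ suffix is needed even when only one table is requested: without it the result would still be a list containing that table.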

It is possible to extract elements from the archive metadata. The ‘getMetadata’ method of the DwCArchive class returns a DwCMetadata object. From this object it is possible to access elements of the metadata.

# Retrieve the metadata from the archive object
TOVEMetadata <- TOVEArchive$getMetadata()
# Retrieve the title of the data set
TOVEMetadata$getTitle()
## [1] "TOV-E Bird monitoring sampling data"
# Retrieve the abstract/summary of the data set
TOVEMetadata$getAbstract()
## [1] "Data from the project \"Extensive monitoring of breeding birds (TOV-E)\" from 2006 up until today. The project is carried out in cooperation between BirdLife Norway, Norwegian Institute for Nature Research (NINA) and the Norwegian Environment Agency, and is the most important project for monitoring population trends for Norwegian bird species on land."
# Retrieve the information on the data set creators
TOVEMetadata$getCreatorInfo()
## [[1]]
## [[1]]$individualName
## [[1]]$individualName$givenName
## [1] "John Atle"
## 
## [[1]]$individualName$surName
## [1] "Kålås"
## 
## 
## [[1]]$organizationName
## [1] "Norwegian Institute for Nature Research"
## 
## [[1]]$positionName
## [1] "Senior Researcher"
## 
## [[1]]$address
## [[1]]$address$deliveryPoint
## [1] "Postboks 5685 Sluppen"
## 
## [[1]]$address$city
## [1] "Trondheim"
## 
## [[1]]$address$postalCode
## [1] "7485"
## 
## [[1]]$address$country
## [1] "NO"
## 
## 
## [[1]]$electronicMailAddress
## [1] "john.kalas@nina.no"
## 
## [[1]]$userId
## [1] "0000-0002-2126-0261"
## 
## 
## [[2]]
## [[2]]$individualName
## [[2]]$individualName$givenName
## [1] "Ingar Jostein"
## 
## [[2]]$individualName$surName
## [1] "Øien"
## 
## 
## [[2]]$organizationName
## [1] "Birdlife Norway"
## 
## [[2]]$positionName
## [1] "Fagsjef"
## 
## [[2]]$address
## [[2]]$address$deliveryPoint
## [1] "Sandgata 30 B"
## 
## [[2]]$address$city
## [1] "Trondheim"
## 
## [[2]]$address$postalCode
## [1] "7012"
## 
## [[2]]$address$country
## [1] "NO"
## 
## 
## [[2]]$electronicMailAddress
## [1] "ingar@birdlife.no"
## 
## 
## [[3]]
## [[3]]$individualName
## [[3]]$individualName$givenName
## [1] "Bård"
## 
## [[3]]$individualName$surName
## [1] "Stokke"
## 
## 
## [[3]]$organizationName
## [1] "Norsk institutt for naturforskning"
## 
## [[3]]$positionName
## [1] "Senior Researcher"
## 
## [[3]]$address
## [[3]]$address$deliveryPoint
## [1] "Postboks 5685 Sluppen"
## 
## [[3]]$address$city
## [1] "Trondheim"
## 
## [[3]]$address$postalCode
## [1] "7485"
## 
## [[3]]$address$country
## [1] "NO"
## 
## 
## [[3]]$phone
## [1] "91138256"
## 
## [[3]]$electronicMailAddress
## [1] "bard.stokke@nina.no"
## 
## [[3]]$userId
## [1] "0000-0001-5589-6738"
## 
## 
## [[4]]
## [[4]]$individualName
## [[4]]$individualName$givenName
## [1] "Roald"
## 
## [[4]]$individualName$surName
## [1] "Vang"
## 
## 
## [[4]]$organizationName
## [1] "Norwegian Institute for Nature Research"
## 
## [[4]]$positionName
## [1] "Data manager"
## 
## [[4]]$address
## [[4]]$address$deliveryPoint
## [1] "Postboks 5685 Sluppen"
## 
## [[4]]$address$city
## [1] "Trondheim"
## 
## [[4]]$address$postalCode
## [1] "7485"
## 
## [[4]]$address$country
## [1] "NO"
## 
## 
## [[4]]$electronicMailAddress
## [1] "roald.vang@nina.no"
## 
## [[4]]$userId
## [1] "0000-0002-4006-8689"
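
The creator information is returned as a nested list, which can be awkward to read when there are several creators. As a rough sketch, assuming a list with the same nesting as the output above, the fields of interest can be flattened into a data frame using base R. The values in ‘toyCreators’ and the helper ‘getField’ are hypothetical and are not part of the Living Norway package.

```r
# Toy creator list with the same nesting as the output above (hypothetical values)
toyCreators <- list(
  list(
    individualName = list(givenName = "Ada", surName = "Example"),
    organizationName = "Example Institute",
    electronicMailAddress = "ada@example.org"
  ),
  list(
    individualName = list(givenName = "Ben", surName = "Sample"),
    organizationName = "Example Society"
  )
)

# Hypothetical helper: follow a path of element names into one creator entry
# and return NA if any element along the path is absent
getField <- function(creator, ...) {
  value <- creator
  for (key in c(...)) {
    value <- value[[key]]
    if (is.null(value)) {
      return(NA_character_)
    }
  }
  value
}

# Build one row per creator
creatorTable <- data.frame(
  givenName = vapply(toyCreators, getField, character(1), "individualName", "givenName"),
  surName = vapply(toyCreators, getField, character(1), "individualName", "surName"),
  organizationName = vapply(toyCreators, getField, character(1), "organizationName"),
  email = vapply(toyCreators, getField, character(1), "electronicMailAddress"),
  stringsAsFactors = FALSE
)
creatorTable
```

Returning NA for absent elements matters because, as the output above shows, not every creator entry carries the same set of fields.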

Creating a Darwin Core Archive File

Creating the Data Tables

So far we’ve talked about how to import data provided as a Darwin Core archive file, but a key feature of the Living Norway package is to make it easier for researchers to go from their data (stored as data frames) to a complete Darwin Core archive. To demonstrate this, we take the data frames that we extracted in the code blocks above as the starting point for creating a Darwin Core archive file. For most researchers this is a common starting point, as their data sets are often represented by a collection of data tables that are either already data frames or that can easily be read into data frames. So, from this starting point, we have two tables: ‘TOVEEventTableDF’ and ‘TOVEOccTableDF’. Our first decision is to designate one of the tables as the core data table. In this instance we already know that it is ‘TOVEEventTableDF’ but, when you are processing your own data sets, the core data table is always the one that the sampling unit is based around. For most ecological experimental designs, the table denoting the sampling events (handled in the Living Norway package by the GBIFEvent class) is a natural core table. Under this format, each sampling event and its details, such as the location and time of sampling, are described in the core table, and what was recorded at each sampling event is often described in the extension files. In situations where the only thing measured at each sampling event is the occurrence of a species, the table describing these occurrences would be designated as the core data table (corresponding to the GBIFOccurrence class in the Living Norway package). In some situations the data set may not be based around occurrences or sampling events but is simply a checklist of species (or other taxonomic groups). This latter case can be handled by using a core data table that lists these taxa, and is handled in the Living Norway package with the ‘GBIFTaxon’ class.
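
To make the relationship between an event core and an occurrence extension concrete, here is a small base R sketch using toy data frames. The table contents are invented for illustration and no Living Norway functions are used; the point is only that each extension row refers back to exactly one row of the core table through a shared ID.

```r
# Toy event core: one row per sampling event, each with a unique ID
toyEvents <- data.frame(
  eventID = c("E1", "E2", "E3"),
  eventDate = c("2007-06-03", "2008-05-30", "2010-05-29"),
  locality = c("Skogskroken", "Skogskroken", "Skogskroken"),
  stringsAsFactors = FALSE
)

# Toy occurrence extension: several rows may point at the same event
toyOccurrences <- data.frame(
  occurrenceID = c("O1", "O2", "O3", "O4"),
  eventID = c("E1", "E1", "E2", "E3"),
  scientificName = c("Gavia stellata", "Gavia arctica", "Gavia stellata", "Gavia stellata"),
  stringsAsFactors = FALSE
)

# A core table needs one unique ID per row
coreIDsUnique <- anyDuplicated(toyEvents$eventID) == 0

# Every extension row should refer to an ID that is present in the core table
allLinked <- all(toyOccurrences$eventID %in% toyEvents$eventID)

# Joining on the shared ID attaches the event details to each occurrence
toyJoined <- merge(toyOccurrences, toyEvents, by = "eventID")
toyJoined
```

If the occurrences were the only thing recorded, the event details could instead be stored directly on each occurrence row, which corresponds to the occurrence-core layout described above.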

So we’ve decided that the core data table should be the ‘TOVEEventTableDF’ data frame and that this data table is an event-based core. To initialise a GBIF-compliant event table we can use the ‘initializeGBIFEvent’ function. This function requires two arguments: the data frame making up the table and a column (given as either a column name or a column number) that holds the ID information. If the table is going to be used as a core table then this ID column must contain a unique value for each row, and it will serve as the key that links extension data tables to the core table. After these two mandatory arguments, the user must specify how the columns (if any) relate to definitions in the Darwin Core standard for that data table type. This can be done in one of two ways. If the data frame has column names that correspond to Darwin Core terms relevant to the data table type, the user can simply add the argument ‘nameAutoMap = TRUE’ to the initialisation function, which will then look for any column names that correspond to Darwin Core terms. Alternatively, the user can add arguments named after each Darwin Core term and set each argument to either a column name or a column number. The ‘getGBIFEventMembers’ function returns a list of all the Darwin Core terms associated with GBIF-compliant event data tables.
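
The idea behind the two mapping approaches can be illustrated with base R alone. The sketch below is not how the Living Norway package implements the mapping; it only shows, on a hypothetical data frame and a small illustrative subset of term names, what matching column names to terms by name looks like and how a manual assignment fills in a column whose name differs from the term.

```r
# A small illustrative subset of Darwin Core event term names
toyTerms <- c("eventID", "eventDate", "samplingProtocol", "habitat")

# Toy data frame: two columns share names with terms, one holds a term under
# a different name, and the ID column is not a term in this subset
toyDF <- data.frame(
  id = c("E1", "E2"),
  eventID = c("E1", "E2"),
  date_sampled = c("2007-06-03", "2008-05-30"),
  habitat = c("forest", "forest"),
  stringsAsFactors = FALSE
)

# Automatic mapping by name: look up each term among the column names
# (NA marks a term with no matching column)
autoMap <- setNames(match(toyTerms, names(toyDF)), toyTerms)
autoMap

# Manual mapping: state explicitly which column holds the 'eventDate' term
manualMap <- autoMap
manualMap["eventDate"] <- match("date_sampled", names(toyDF))
manualMap
```

Terms left as NA are simply unmapped, which is acceptable because not every term needs a corresponding column.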

# Look at the Darwin Core terms associated with event data tables (here we've shortened it to the first 6 entries so that the output is not too long)
getGBIFEventMembers()[1:6]
## $`http://purl.org/dc/terms/type`
## http://purl.org/dc/terms/type - Type (DEPRECATED)
## The nature or genre of the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/type
##  Type: Property
##  Date modified: 2008-01-14
##  Is replaced by: http://purl.org/dc/elements/1.1/type
##  Notes:
##      To provide a string literal value for type, use dc:type rather than this term. In accordance with the Darwin Core RDF guide, rdf:type should be used instead of this term to indicate an IRI value for type.
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2009-12-07_1
##      http://rs.tdwg.org/decisions/decision-2019-12-01_19
##      http://rs.tdwg.org/decisions/decision-2019-12-01_20
##  Miscellaneous information:
##      This term is deprecated and should no longer be used.
##      not in ABCD
## 
## $`http://purl.org/dc/terms/modified`
## http://purl.org/dc/terms/modified - Date Modified
## The most recent date-time on which the resource was changed.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/modified
##  Version IRI: http://dublincore.org/usage/terms/history/#modified-003
##  Type: Property
##  Date modified: 2020-08-12
##  Notes:
##      Recommended best practice is to use a date that conforms to ISO 8601-1:2019.
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2019-12-01_19
##  Examples:
##      1963-03-08T14:07-0600 (8 Mar 1963 at 2:07pm in the time zone six hours earlier than UTC). 2009-02-20T08:40Z (20 February 2009 8:40am UTC). 2018-08-29T15:19 (3:19pm local time on 29 August 2018). 1809-02-12 (some time during 12 February 1809). 1906-06 (some time in June 1906). 1971 (some time in the year 1971). 2007-03-01T13:00:00Z/2008-05-11T15:30:00Z (some time during the interval between 1 March 2007 1pm UTC and 11 May 2008 3:30pm UTC). 1900/1909 (some time during the interval between the beginning of the year 1900 and the end of the year 1909). 2007-11-13/15 (some time in the interval between 13 November 2007 and 15 November 2007).
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/language`
## http://purl.org/dc/terms/language - Language
## A language of the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/language
##  Version IRI: http://dublincore.org/usage/terms/history/#languageT-001
##  Type: Property
##  Date modified: 2008-01-14
##  Notes:
##      Recommended best practice is to use an IRI from the Library of Congress ISO 639-2 scheme http://id.loc.gov/vocabulary/iso639-2
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2019-12-01_19
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/license`
## http://purl.org/dc/terms/license - License
## A legal document giving official permission to do something with the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/license
##  Version IRI: http://dublincore.org/usage/terms/history/#license-002
##  Type: Property
##  Date modified: 2008-01-14
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-11-06_17
##  Examples:
##      http://creativecommons.org/publicdomain/zero/1.0/legalcode, http://creativecommons.org/licenses/by/4.0/legalcode
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/rightsHolder`
## http://purl.org/dc/terms/rightsHolder - Rights Holder
## A person or organization owning or managing rights over the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/rightsHolder
##  Version IRI: http://dublincore.org/usage/terms/history/#rightsHolder-002
##  Type: Property
##  Date modified: 2008-01-14
##  Examples:
##      The Regents of the University of California
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/accessRights`
## http://purl.org/dc/terms/accessRights - Access Rights
## Information about who can access the resource or an indication of its security status.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/accessRights
##  Version IRI: http://dublincore.org/usage/terms/history/#accessRights-002
##  Type: Property
##  Date modified: 2008-01-14
##  Notes:
##      Access Rights may include information regarding access or restrictions based on privacy, security, or other policies.
##  Examples:
##      not-for-profit use only, https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
##  Miscellaneous information:
##      not in ABCD
# Call the initialisation function using the two different methods:
# 1. Using the automatic mapping method
newTOVEEventTable <- initializeGBIFEvent(TOVEEventTableDF, "id", nameAutoMap = TRUE)
# 2. Using the manual mapping method
newTOVEEventTable <- initializeGBIFEvent(TOVEEventTableDF, "id",
  # What follows is a list of arguments giving Darwin Core terms followed by the column name (or number)
  # in the data frame that corresponds to those terms.  In this example it doesn't make much sense to
  # call the initialisation function in this manner because all the column names correspond to
  # Darwin Core terms anyway (so it is much easier to use the first method).  However, this alternative
  # method can be useful if the column names differ from the Darwin Core terms that they represent or
  # for data frames that don't have column names (in which case column numbers can be given as the
  # argument values instead).
  type = "type",
  modified = "modified",
  datasetName = "datasetName",
  ownerInstitutionCode = "ownerInstitutionCode",
  informationWithheld = "informationWithheld",
  dataGeneralizations = "dataGeneralizations",
  eventID = "eventID",
  samplingProtocol = "samplingProtocol",
  sampleSizeValue = "sampleSizeValue",
  sampleSizeUnit = "sampleSizeUnit",
  samplingEffort = "samplingEffort",
  eventDate = "eventDate",
  eventTime = "eventTime",
  year = "year",
  month = "month",
  day = "day",
  locationID = "locationID",
  country = "country",
  countryCode = "countryCode",
  stateProvince = "stateProvince",
  municipality = "municipality",
  locality = "locality",
  minimumElevationInMeters = "minimumElevationInMeters",
  maximumElevationInMeters = "maximumElevationInMeters",
  decimalLatitude = "decimalLatitude",
  decimalLongitude = "decimalLongitude",
  geodeticDatum = "geodeticDatum",
  coordinateUncertaintyInMeters = "coordinateUncertaintyInMeters")
# We can then check to see if the terms are mapped correctly
# Terms with NA in the column index represent Darwin Core terms associated with the event data table that are not mapped
# Not all terms need to be mapped to make a valid Darwin Core archive
newTOVEEventTable$getTermMapping()
##                                                                  columnIndex
## http://purl.org/dc/terms/type                                              2
## http://purl.org/dc/terms/modified                                          3
## http://purl.org/dc/terms/language                                         NA
## http://purl.org/dc/terms/license                                          NA
## http://purl.org/dc/terms/rightsHolder                                     NA
## http://purl.org/dc/terms/accessRights                                     NA
## http://purl.org/dc/terms/bibliographicCitation                            NA
## http://purl.org/dc/terms/references                                       NA
## http://rs.tdwg.org/dwc/terms/institutionID                                NA
## http://rs.tdwg.org/dwc/terms/datasetID                                    NA
## http://rs.tdwg.org/dwc/terms/institutionCode                              NA
## http://rs.tdwg.org/dwc/terms/datasetName                                   4
## http://rs.tdwg.org/dwc/terms/ownerInstitutionCode                          5
## http://rs.tdwg.org/dwc/terms/informationWithheld                           6
## http://rs.tdwg.org/dwc/terms/dataGeneralizations                           7
## http://rs.tdwg.org/dwc/terms/dynamicProperties                            NA
## http://rs.tdwg.org/dwc/terms/eventID                                       8
## http://rs.tdwg.org/dwc/terms/parentEventID                                NA
## http://rs.tdwg.org/dwc/terms/samplingProtocol                              9
## http://rs.tdwg.org/dwc/terms/sampleSizeValue                              10
## http://rs.tdwg.org/dwc/terms/sampleSizeUnit                               11
## http://rs.tdwg.org/dwc/terms/samplingEffort                               12
## http://rs.tdwg.org/dwc/terms/eventDate                                    13
## http://rs.tdwg.org/dwc/terms/eventTime                                    14
## http://rs.tdwg.org/dwc/terms/startDayOfYear                               NA
## http://rs.tdwg.org/dwc/terms/endDayOfYear                                 NA
## http://rs.tdwg.org/dwc/terms/year                                         15
## http://rs.tdwg.org/dwc/terms/month                                        16
## http://rs.tdwg.org/dwc/terms/day                                          17
## http://rs.tdwg.org/dwc/terms/verbatimEventDate                            NA
## http://rs.tdwg.org/dwc/terms/habitat                                      NA
## http://rs.tdwg.org/dwc/terms/fieldNumber                                  NA
## http://rs.tdwg.org/dwc/terms/fieldNotes                                   NA
## http://rs.tdwg.org/dwc/terms/eventRemarks                                 NA
## http://rs.tdwg.org/dwc/terms/locationID                                   18
## http://rs.tdwg.org/dwc/terms/higherGeographyID                            NA
## http://rs.tdwg.org/dwc/terms/higherGeography                              NA
## http://rs.tdwg.org/dwc/terms/continent                                    NA
## http://rs.tdwg.org/dwc/terms/waterBody                                    NA
## http://rs.tdwg.org/dwc/terms/islandGroup                                  NA
## http://rs.tdwg.org/dwc/terms/island                                       NA
## http://rs.tdwg.org/dwc/terms/country                                      19
## http://rs.tdwg.org/dwc/terms/countryCode                                  20
## http://rs.tdwg.org/dwc/terms/stateProvince                                21
## http://rs.tdwg.org/dwc/terms/county                                       NA
## http://rs.tdwg.org/dwc/terms/municipality                                 22
## http://rs.tdwg.org/dwc/terms/locality                                     23
## http://rs.tdwg.org/dwc/terms/verbatimLocality                             NA
## http://rs.tdwg.org/dwc/terms/verbatimElevation                            NA
## http://rs.tdwg.org/dwc/terms/minimumElevationInMeters                     24
## http://rs.tdwg.org/dwc/terms/maximumElevationInMeters                     25
## http://rs.tdwg.org/dwc/terms/verbatimDepth                                NA
## http://rs.tdwg.org/dwc/terms/minimumDepthInMeters                         NA
## http://rs.tdwg.org/dwc/terms/maximumDepthInMeters                         NA
## http://rs.tdwg.org/dwc/terms/minimumDistanceAboveSurfaceInMeters          NA
## http://rs.tdwg.org/dwc/terms/maximumDistanceAboveSurfaceInMeters          NA
## http://rs.tdwg.org/dwc/terms/locationAccordingTo                          NA
## http://rs.tdwg.org/dwc/terms/locationRemarks                              NA
## http://rs.tdwg.org/dwc/terms/verbatimCoordinates                          NA
## http://rs.tdwg.org/dwc/terms/verbatimLatitude                             NA
## http://rs.tdwg.org/dwc/terms/verbatimLongitude                            NA
## http://rs.tdwg.org/dwc/terms/verbatimCoordinateSystem                     NA
## http://rs.tdwg.org/dwc/terms/verbatimSRS                                  NA
## http://rs.tdwg.org/dwc/terms/decimalLatitude                              26
## http://rs.tdwg.org/dwc/terms/decimalLongitude                             27
## http://rs.tdwg.org/dwc/terms/geodeticDatum                                28
## http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters                29
## http://rs.tdwg.org/dwc/terms/coordinatePrecision                          NA
## http://rs.tdwg.org/dwc/terms/pointRadiusSpatialFit                        NA
## http://rs.tdwg.org/dwc/terms/footprintWKT                                 NA
## http://rs.tdwg.org/dwc/terms/footprintSRS                                 NA
## http://rs.tdwg.org/dwc/terms/footprintSpatialFit                          NA
## http://rs.tdwg.org/dwc/terms/georeferencedBy                              NA
## http://rs.tdwg.org/dwc/terms/georeferencedDate                            NA
## http://rs.tdwg.org/dwc/terms/georeferenceProtocol                         NA
## http://rs.tdwg.org/dwc/terms/georeferenceSources                          NA
## http://rs.tdwg.org/dwc/terms/georeferenceVerificationStatus               NA
## http://rs.tdwg.org/dwc/terms/georeferenceRemarks                          NA
## http://rs.tdwg.org/dwc/terms/geologicalContextID                          NA
## http://rs.tdwg.org/dwc/terms/earliestEonOrLowestEonothem                  NA
## http://rs.tdwg.org/dwc/terms/latestEonOrHighestEonothem                   NA
## http://rs.tdwg.org/dwc/terms/earliestEraOrLowestErathem                   NA
## http://rs.tdwg.org/dwc/terms/latestEraOrHighestErathem                    NA
## http://rs.tdwg.org/dwc/terms/earliestPeriodOrLowestSystem                 NA
## http://rs.tdwg.org/dwc/terms/latestPeriodOrHighestSystem                  NA
## http://rs.tdwg.org/dwc/terms/earliestEpochOrLowestSeries                  NA
## http://rs.tdwg.org/dwc/terms/latestEpochOrHighestSeries                   NA
## http://rs.tdwg.org/dwc/terms/earliestAgeOrLowestStage                     NA
## http://rs.tdwg.org/dwc/terms/latestAgeOrHighestStage                      NA
## http://rs.tdwg.org/dwc/terms/lowestBiostratigraphicZone                   NA
## http://rs.tdwg.org/dwc/terms/highestBiostratigraphicZone                  NA
## http://rs.tdwg.org/dwc/terms/lithostratigraphicTerms                      NA
## http://rs.tdwg.org/dwc/terms/group                                        NA
## http://rs.tdwg.org/dwc/terms/formation                                    NA
## http://rs.tdwg.org/dwc/terms/member                                       NA
## http://rs.tdwg.org/dwc/terms/bed                                          NA
##                                                                                     columnName
## http://purl.org/dc/terms/type                                                             type
## http://purl.org/dc/terms/modified                                                     modified
## http://purl.org/dc/terms/language                                                         <NA>
## http://purl.org/dc/terms/license                                                          <NA>
## http://purl.org/dc/terms/rightsHolder                                                     <NA>
## http://purl.org/dc/terms/accessRights                                                     <NA>
## http://purl.org/dc/terms/bibliographicCitation                                            <NA>
## http://purl.org/dc/terms/references                                                       <NA>
## http://rs.tdwg.org/dwc/terms/institutionID                                                <NA>
## http://rs.tdwg.org/dwc/terms/datasetID                                                    <NA>
## http://rs.tdwg.org/dwc/terms/institutionCode                                              <NA>
## http://rs.tdwg.org/dwc/terms/datasetName                                           datasetName
## http://rs.tdwg.org/dwc/terms/ownerInstitutionCode                         ownerInstitutionCode
## http://rs.tdwg.org/dwc/terms/informationWithheld                           informationWithheld
## http://rs.tdwg.org/dwc/terms/dataGeneralizations                           dataGeneralizations
## http://rs.tdwg.org/dwc/terms/dynamicProperties                                            <NA>
## http://rs.tdwg.org/dwc/terms/eventID                                                   eventID
## http://rs.tdwg.org/dwc/terms/parentEventID                                                <NA>
## http://rs.tdwg.org/dwc/terms/samplingProtocol                                 samplingProtocol
## http://rs.tdwg.org/dwc/terms/sampleSizeValue                                   sampleSizeValue
## http://rs.tdwg.org/dwc/terms/sampleSizeUnit                                     sampleSizeUnit
## http://rs.tdwg.org/dwc/terms/samplingEffort                                     samplingEffort
## http://rs.tdwg.org/dwc/terms/eventDate                                               eventDate
## http://rs.tdwg.org/dwc/terms/eventTime                                               eventTime
## http://rs.tdwg.org/dwc/terms/startDayOfYear                                               <NA>
## http://rs.tdwg.org/dwc/terms/endDayOfYear                                                 <NA>
## http://rs.tdwg.org/dwc/terms/year                                                         year
## http://rs.tdwg.org/dwc/terms/month                                                       month
## http://rs.tdwg.org/dwc/terms/day                                                           day
## http://rs.tdwg.org/dwc/terms/verbatimEventDate                                            <NA>
## http://rs.tdwg.org/dwc/terms/habitat                                                      <NA>
## http://rs.tdwg.org/dwc/terms/fieldNumber                                                  <NA>
## http://rs.tdwg.org/dwc/terms/fieldNotes                                                   <NA>
## http://rs.tdwg.org/dwc/terms/eventRemarks                                                 <NA>
## http://rs.tdwg.org/dwc/terms/locationID                                             locationID
## http://rs.tdwg.org/dwc/terms/higherGeographyID                                            <NA>
## http://rs.tdwg.org/dwc/terms/higherGeography                                              <NA>
## http://rs.tdwg.org/dwc/terms/continent                                                    <NA>
## http://rs.tdwg.org/dwc/terms/waterBody                                                    <NA>
## http://rs.tdwg.org/dwc/terms/islandGroup                                                  <NA>
## http://rs.tdwg.org/dwc/terms/island                                                       <NA>
## http://rs.tdwg.org/dwc/terms/country                                                   country
## http://rs.tdwg.org/dwc/terms/countryCode                                           countryCode
## http://rs.tdwg.org/dwc/terms/stateProvince                                       stateProvince
## http://rs.tdwg.org/dwc/terms/county                                                       <NA>
## http://rs.tdwg.org/dwc/terms/municipality                                         municipality
## http://rs.tdwg.org/dwc/terms/locality                                                 locality
## http://rs.tdwg.org/dwc/terms/verbatimLocality                                             <NA>
## http://rs.tdwg.org/dwc/terms/verbatimElevation                                            <NA>
## http://rs.tdwg.org/dwc/terms/minimumElevationInMeters                 minimumElevationInMeters
## http://rs.tdwg.org/dwc/terms/maximumElevationInMeters                 maximumElevationInMeters
## http://rs.tdwg.org/dwc/terms/verbatimDepth                                                <NA>
## http://rs.tdwg.org/dwc/terms/minimumDepthInMeters                                         <NA>
## http://rs.tdwg.org/dwc/terms/maximumDepthInMeters                                         <NA>
## http://rs.tdwg.org/dwc/terms/minimumDistanceAboveSurfaceInMeters                          <NA>
## http://rs.tdwg.org/dwc/terms/maximumDistanceAboveSurfaceInMeters                          <NA>
## http://rs.tdwg.org/dwc/terms/locationAccordingTo                                          <NA>
## http://rs.tdwg.org/dwc/terms/locationRemarks                                              <NA>
## http://rs.tdwg.org/dwc/terms/verbatimCoordinates                                          <NA>
## http://rs.tdwg.org/dwc/terms/verbatimLatitude                                             <NA>
## http://rs.tdwg.org/dwc/terms/verbatimLongitude                                            <NA>
## http://rs.tdwg.org/dwc/terms/verbatimCoordinateSystem                                     <NA>
## http://rs.tdwg.org/dwc/terms/verbatimSRS                                                  <NA>
## http://rs.tdwg.org/dwc/terms/decimalLatitude                                   decimalLatitude
## http://rs.tdwg.org/dwc/terms/decimalLongitude                                 decimalLongitude
## http://rs.tdwg.org/dwc/terms/geodeticDatum                                       geodeticDatum
## http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters       coordinateUncertaintyInMeters
## http://rs.tdwg.org/dwc/terms/coordinatePrecision                                          <NA>
## http://rs.tdwg.org/dwc/terms/pointRadiusSpatialFit                                        <NA>
## http://rs.tdwg.org/dwc/terms/footprintWKT                                                 <NA>
## http://rs.tdwg.org/dwc/terms/footprintSRS                                                 <NA>
## http://rs.tdwg.org/dwc/terms/footprintSpatialFit                                          <NA>
## http://rs.tdwg.org/dwc/terms/georeferencedBy                                              <NA>
## http://rs.tdwg.org/dwc/terms/georeferencedDate                                            <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceProtocol                                         <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceSources                                          <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceVerificationStatus                               <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceRemarks                                          <NA>
## http://rs.tdwg.org/dwc/terms/geologicalContextID                                          <NA>
## http://rs.tdwg.org/dwc/terms/earliestEonOrLowestEonothem                                  <NA>
## http://rs.tdwg.org/dwc/terms/latestEonOrHighestEonothem                                   <NA>
## http://rs.tdwg.org/dwc/terms/earliestEraOrLowestErathem                                   <NA>
## http://rs.tdwg.org/dwc/terms/latestEraOrHighestErathem                                    <NA>
## http://rs.tdwg.org/dwc/terms/earliestPeriodOrLowestSystem                                 <NA>
## http://rs.tdwg.org/dwc/terms/latestPeriodOrHighestSystem                                  <NA>
## http://rs.tdwg.org/dwc/terms/earliestEpochOrLowestSeries                                  <NA>
## http://rs.tdwg.org/dwc/terms/latestEpochOrHighestSeries                                   <NA>
## http://rs.tdwg.org/dwc/terms/earliestAgeOrLowestStage                                     <NA>
## http://rs.tdwg.org/dwc/terms/latestAgeOrHighestStage                                      <NA>
## http://rs.tdwg.org/dwc/terms/lowestBiostratigraphicZone                                   <NA>
## http://rs.tdwg.org/dwc/terms/highestBiostratigraphicZone                                  <NA>
## http://rs.tdwg.org/dwc/terms/lithostratigraphicTerms                                      <NA>
## http://rs.tdwg.org/dwc/terms/group                                                        <NA>
## http://rs.tdwg.org/dwc/terms/formation                                                    <NA>
## http://rs.tdwg.org/dwc/terms/member                                                       <NA>
## http://rs.tdwg.org/dwc/terms/bed                                                          <NA>
# We can also check that the correct ID column is being used
#  Returns the index of the ID column
newTOVEEventTable$getIDIndex()
## [1] 1
#  Returns the name of the ID column (if the data table has column names)
newTOVEEventTable$getIDName()
## [1] "id"
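
The full term mapping is long because it lists every term available to the table, mapped or not. A minimal sketch of how to cut it down to the mapped terms only is given below. It assumes that 'getTermMapping' returns a data frame with the 'columnIndex' and 'columnName' columns seen in the printed output. The 'termMapping' object is a small hypothetical stand-in built by hand from four of the rows above, so that the example runs without the package or the data set.

```r
# Hypothetical stand-in for the output of newTOVEEventTable$getTermMapping()
# (assumed to be a data frame with term IRIs as row names)
termMapping <- data.frame(
  columnIndex = c(8L, NA, 9L, NA),
  columnName = c("eventID", NA, "samplingProtocol", NA),
  row.names = c(
    "http://rs.tdwg.org/dwc/terms/eventID",
    "http://rs.tdwg.org/dwc/terms/parentEventID",
    "http://rs.tdwg.org/dwc/terms/samplingProtocol",
    "http://rs.tdwg.org/dwc/terms/habitat"
  ),
  stringsAsFactors = FALSE
)
# Keep only the terms that have been mapped to a column
mappedTerms <- termMapping[!is.na(termMapping$columnIndex), , drop = FALSE]
# Order the remaining terms by their position in the data table
mappedTerms <- mappedTerms[order(mappedTerms$columnIndex), , drop = FALSE]
print(mappedTerms)
```

If the assumption about the return type holds, replacing the stand-in with the real term mapping should give a compact list of the mapped columns to check by eye.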

The next step is to perform the same term mapping for each of the extension data tables. In this instance we know that our extension data table should be of type ‘GBIFOccurrence’. But how do you find the relevant extension class for your own extension data? GBIF supports a large number of extension data types, and covering them all is outside the scope of this document. However, the ‘getGBIFExtensionClasses’ function retrieves the definition of each extension class along with a short description. More detailed descriptions of the extension classes can be found on the Darwin Core Validator website, but the output from ‘getGBIFExtensionClasses’ will at least give some hints as to which extension class may be suitable for your data.

# Look at some of the supported GBIF extension classes (here we've shortened it to the first six entries so that the output is not too long)
getGBIFExtensionClasses()[1:6]
## $GBIFMultimedia
## http://rs.tdwg.org/ac/terms/Multimedia - Multimedia
## The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.
## 
##  IRI: http://rs.tdwg.org/ac/terms/Multimedia
##  Version IRI: http://rs.tdwg.org/ac/terms/Multimedia
##  Type: class
##  Date modified: 2015-03-19
##  Notes:
##      The Audubon Core is a set of vocabularies designed to represent metadata for biodiversity multimedia resources and collections. These vocabularies aim to represent information that will help to determine whether a particular resource or collection will be fit for some particular biodiversity science application before acquiring the media. Among others, the vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.
##  Miscellaneous information:
##      GBIF core/extension class
##  Vocabulary URI:
##      http://terms.tdwg.org/wiki/Audubon_Core_Term_List
## 
## $GBIFChronometricAge
## http://rs.tdwg.org/chrono/terms/ChronometricAge - ChronometricAge
## Extension to Occurrence Core to capture chronometric age information to be used only in cases where the collecting event is not contemporaneous with the time when the dwc:Organism was alive in its context. Collection event information can be reported in dwc:eventDate. See also the normative term list document at https://tdwg.github.io/chrono/list/ and the human-friendly Quick Reference Guide at https://tdwg.github.io/chrono/terms/.
## 
##  IRI: http://rs.tdwg.org/chrono/terms/ChronometricAge
##  Version IRI: http://rs.tdwg.org/chrono/terms/ChronometricAge
##  Type: class
##  Date modified: 2021-03-27
##  Notes:
##      Extension to Occurrence Core to capture chronometric age information to be used only in cases where the collecting event is not contemporaneous with the time when the dwc:Organism was alive in its context. Collection event information can be reported in dwc:eventDate. See also the normative term list document at https://tdwg.github.io/chrono/list/ and the human-friendly Quick Reference Guide at https://tdwg.github.io/chrono/terms/.
##  Miscellaneous information:
##      GBIF core/extension class
##  Vocabulary URI:
##      https://github.com/tdwg/chrono/blob/master/vocabulary/term_versions.csv
## 
## $GBIFIdentification
## http://rs.tdwg.org/dwc/terms/Identification - Identification
## A taxonomic determination (e.g., the assignment to a taxon).
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://rs.tdwg.org/dwc/terms/Identification
##  Version IRI: http://rs.tdwg.org/dwc/terms/version/Identification-2018-09-06
##  Type: Class
##  Date modified: 2018-09-06
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-10-26_15
##  Examples:
##      A subspecies determination of an organism.
##  Miscellaneous information:
##      DataSets/DataSet/Units/Unit/Identifications/Identification
## 
## $GBIFMeasurementOrFact
## http://rs.tdwg.org/dwc/terms/MeasurementOrFact - Measurement or Fact
## A measurement of or fact about an rdfs:Resource (http://www.w3.org/2000/01/rdf-schema#Resource).
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://rs.tdwg.org/dwc/terms/MeasurementOrFact
##  Version IRI: http://rs.tdwg.org/dwc/terms/version/MeasurementOrFact-2018-09-06
##  Type: Class
##  Date modified: 2018-09-06
##  Notes:
##      Resources can be thought of as identifiable records or instances of classes and may include, but need not be limited to dwc:Occurrence, dwc:Organism, dwc:MaterialSample, dwc:Event, dwc:Location, dwc:GeologicalContext, dwc:Identification, or dwc:Taxon.
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-10-26_15
##  Examples:
##      The weight of an organism in grams. The number of placental scars. Surface water temperature in Celsius.
##  Miscellaneous information:
##      Datasets/Dataset/Units/Unit/MeasurementsOrFacts or DataSets/DataSet/Units/Unit/Gathering/SiteMeasurementsOrFacts
## 
## $GBIFResourceRelationship
## http://rs.tdwg.org/dwc/terms/ResourceRelationship - Resource Relationship
## A relationship of one rdfs:Resource (http://www.w3.org/2000/01/rdf-schema#Resource) to another.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://rs.tdwg.org/dwc/terms/ResourceRelationship
##  Version IRI: http://rs.tdwg.org/dwc/terms/version/ResourceRelationship-2018-09-06
##  Type: Class
##  Date modified: 2018-09-06
##  Notes:
##      Resources can be thought of as identifiable records or instances of classes and may include, but need not be limited to dwc:Occurrence, dwc:Organism, dwc:MaterialSample, dwc:Event, dwc:Location, dwc:GeologicalContext, dwc:Identification, or dwc:Taxon.
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-10-26_15
##  Examples:
##      An instance of an Organism is the mother of another instance of an Organism. A uniquely identified Occurrence represents the same Occurrence as another uniquely identified Occurrence. A MaterialSample is a subsample of another MaterialSample.
##  Miscellaneous information:
##      DataSets/DataSet/Units/Unit/Associations
## 
## $GBIFEOLMediaExtension
## http://www.eol.org/schema/transfer#/EOLMediaExtension - EOLMediaExtension
## This extension draws from Audubon Core, Dublin Core and others to gather information about text and multimedia. It was designed to contain all the metadata that is required to be indexed by the Encyclopedia of Life (EOL), but this extension is hopefully general enough to be useful to all text and media providers and consumers. The original extension was offline; this is a copy recovered from the Internet Archive.  The issue date is estimated.
## 
##  IRI: http://eol.org/schema/media/Document
##  Version IRI: http://eol.org/schema/media/Document
##  Type: class
##  Date modified: 2010-01-01
##  Notes:
##      This extension draws from Audubon Core, Dublin Core and others to gather information about text and multimedia. It was designed to contain all the metadata that is required to be indexed by the Encyclopedia of Life (EOL), but this extension is hopefully general enough to be useful to all text and media providers and consumers. The original extension was offline; this is a copy recovered from the Internet Archive.  The issue date is estimated.
##  Miscellaneous information:
##      GBIF core/extension class
##  Vocabulary URI:
##      http://eol.org/info/cp_archives
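
Reading through every extension class description by eye gets tedious, so a keyword search over the descriptions can help narrow the choice. The sketch below is only an illustration: 'extensionDescriptions' is a hand-built stand-in holding abridged descriptions copied from the output above, and 'findExtensionClasses' is a hypothetical helper, not a function provided by the package.

```r
# Hypothetical stand-in: abridged descriptions keyed by extension class name,
# copied from the getGBIFExtensionClasses() output shown above
extensionDescriptions <- c(
  GBIFIdentification = "A taxonomic determination (e.g., the assignment to a taxon).",
  GBIFMeasurementOrFact = "A measurement of or fact about an rdfs:Resource.",
  GBIFResourceRelationship = "A relationship of one rdfs:Resource to another."
)
# Hypothetical helper: names of the classes whose description contains the
# keyword (case-insensitive match)
findExtensionClasses <- function(descriptions, keyword) {
  names(descriptions)[grepl(keyword, descriptions, ignore.case = TRUE)]
}
# Search for classes that deal with measurements
matchingClasses <- findExtensionClasses(extensionDescriptions, "measurement")
print(matchingClasses)
```

The same helper could be pointed at descriptions extracted from the real extension class objects, although how best to extract them depends on the structure of those objects, which is not covered here.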

We can initialise the occurrence extension table for our data set in much the same way as we initialised the event core table.

# Look at the Darwin Core terms associated with occurrence data tables (here we've shortened it to the first 6 entries so that the output is not too long)
getGBIFOccurrenceMembers()[1:6]
## $`http://purl.org/dc/terms/type`
## http://purl.org/dc/terms/type - Type (DEPRECATED)
## The nature or genre of the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/type
##  Type: Property
##  Date modified: 2008-01-14
##  Is replaced by: http://purl.org/dc/elements/1.1/type
##  Notes:
##      To provide a string literal value for type, use dc:type rather than this term. In accordance with the Darwin Core RDF guide, rdf:type should be used instead of this term to indicate an IRI value for type.
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2009-12-07_1
##      http://rs.tdwg.org/decisions/decision-2019-12-01_19
##      http://rs.tdwg.org/decisions/decision-2019-12-01_20
##  Miscellaneous information:
##      This term is deprecated and should no longer be used.
##      not in ABCD
## 
## $`http://purl.org/dc/terms/modified`
## http://purl.org/dc/terms/modified - Date Modified
## The most recent date-time on which the resource was changed.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/modified
##  Version IRI: http://dublincore.org/usage/terms/history/#modified-003
##  Type: Property
##  Date modified: 2020-08-12
##  Notes:
##      Recommended best practice is to use a date that conforms to ISO 8601-1:2019.
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2019-12-01_19
##  Examples:
##      1963-03-08T14:07-0600 (8 Mar 1963 at 2:07pm in the time zone six hours earlier than UTC). 2009-02-20T08:40Z (20 February 2009 8:40am UTC). 2018-08-29T15:19 (3:19pm local time on 29 August 2018). 1809-02-12 (some time during 12 February 1809). 1906-06 (some time in June 1906). 1971 (some time in the year 1971). 2007-03-01T13:00:00Z/2008-05-11T15:30:00Z (some time during the interval between 1 March 2007 1pm UTC and 11 May 2008 3:30pm UTC). 1900/1909 (some time during the interval between the beginning of the year 1900 and the end of the year 1909). 2007-11-13/15 (some time in the interval between 13 November 2007 and 15 November 2007).
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/language`
## http://purl.org/dc/terms/language - Language
## A language of the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/language
##  Version IRI: http://dublincore.org/usage/terms/history/#languageT-001
##  Type: Property
##  Date modified: 2008-01-14
##  Notes:
##      Recommended best practice is to use an IRI from the Library of Congress ISO 639-2 scheme http://id.loc.gov/vocabulary/iso639-2
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2019-12-01_19
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/license`
## http://purl.org/dc/terms/license - License
## A legal document giving official permission to do something with the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/license
##  Version IRI: http://dublincore.org/usage/terms/history/#license-002
##  Type: Property
##  Date modified: 2008-01-14
##  Executive committee decisions:
##      http://rs.tdwg.org/decisions/decision-2014-11-06_17
##  Examples:
##      http://creativecommons.org/publicdomain/zero/1.0/legalcode, http://creativecommons.org/licenses/by/4.0/legalcode
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/rightsHolder`
## http://purl.org/dc/terms/rightsHolder - Rights Holder
## A person or organization owning or managing rights over the resource.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/rightsHolder
##  Version IRI: http://dublincore.org/usage/terms/history/#rightsHolder-002
##  Type: Property
##  Date modified: 2008-01-14
##  Examples:
##      The Regents of the University of California
##  Miscellaneous information:
##      not in ABCD
## 
## $`http://purl.org/dc/terms/accessRights`
## http://purl.org/dc/terms/accessRights - Access Rights
## Information about who can access the resource or an indication of its security status.
## 
##  Defined in: https://dwc.tdwg.org/
##  IRI: http://purl.org/dc/terms/accessRights
##  Version IRI: http://dublincore.org/usage/terms/history/#accessRights-002
##  Type: Property
##  Date modified: 2008-01-14
##  Notes:
##      Access Rights may include information regarding access or restrictions based on privacy, security, or other policies.
##  Examples:
##      not-for-profit use only, https://www.fieldmuseum.org/field-museum-natural-history-conditions-and-suggested-norms-use-collections-data-and-images
##  Miscellaneous information:
##      not in ABCD
# Call the initialisation function
newTOVEOccTable <- initializeGBIFOccurrence(TOVEOccTableDF, "id", nameAutoMap = TRUE)
# We can then check to see if the terms are mapped correctly
# Terms with NA in the column index represent Darwin Core terms associated with the occurrence data table that are not mapped
# Not all terms need to be mapped to make a valid Darwin Core archive
newTOVEOccTable$getTermMapping()
##                                                                  columnIndex
## http://purl.org/dc/terms/type                                              2
## http://purl.org/dc/terms/modified                                          3
## http://purl.org/dc/terms/language                                         NA
## http://purl.org/dc/terms/license                                          NA
## http://purl.org/dc/terms/rightsHolder                                     NA
## http://purl.org/dc/terms/accessRights                                     NA
## http://purl.org/dc/terms/bibliographicCitation                            NA
## http://purl.org/dc/terms/references                                       NA
## http://rs.tdwg.org/dwc/terms/institutionID                                NA
## http://rs.tdwg.org/dwc/terms/collectionID                                 NA
## http://rs.tdwg.org/dwc/terms/datasetID                                    NA
## http://rs.tdwg.org/dwc/terms/institutionCode                              NA
## http://rs.tdwg.org/dwc/terms/collectionCode                                4
## http://rs.tdwg.org/dwc/terms/datasetName                                  NA
## http://rs.tdwg.org/dwc/terms/ownerInstitutionCode                         NA
## http://rs.tdwg.org/dwc/terms/basisOfRecord                                 5
## http://rs.tdwg.org/dwc/terms/informationWithheld                          NA
## http://rs.tdwg.org/dwc/terms/dataGeneralizations                          NA
## http://rs.tdwg.org/dwc/terms/dynamicProperties                            NA
## http://rs.tdwg.org/dwc/terms/materialSampleID                             NA
## http://rs.tdwg.org/dwc/terms/occurrenceID                                  6
## http://rs.tdwg.org/dwc/terms/catalogNumber                                 7
## http://rs.tdwg.org/dwc/terms/occurrenceRemarks                             8
## http://rs.tdwg.org/dwc/terms/recordNumber                                 NA
## http://rs.tdwg.org/dwc/terms/recordedBy                                   NA
## http://rs.gbif.org/terms/1.0/recordedByID                                 NA
## http://rs.tdwg.org/dwc/terms/individualCount                              NA
## http://rs.tdwg.org/dwc/terms/organismQuantity                              9
## http://rs.tdwg.org/dwc/terms/organismQuantityType                         10
## http://rs.tdwg.org/dwc/terms/sex                                          NA
## http://rs.tdwg.org/dwc/terms/lifeStage                                    NA
## http://rs.tdwg.org/dwc/terms/reproductiveCondition                        NA
## http://rs.tdwg.org/dwc/terms/behavior                                     NA
## http://rs.tdwg.org/dwc/terms/establishmentMeans                           NA
## http://rs.tdwg.org/dwc/terms/occurrenceStatus                             11
## http://rs.tdwg.org/dwc/terms/preparations                                 NA
## http://rs.tdwg.org/dwc/terms/disposition                                  NA
## http://rs.tdwg.org/dwc/terms/otherCatalogNumbers                          NA
## http://rs.tdwg.org/dwc/terms/associatedMedia                              NA
## http://rs.tdwg.org/dwc/terms/associatedReferences                         NA
## http://rs.tdwg.org/dwc/terms/associatedSequences                          NA
## http://rs.tdwg.org/dwc/terms/associatedTaxa                               NA
## http://rs.tdwg.org/dwc/terms/organismID                                   NA
## http://rs.tdwg.org/dwc/terms/organismName                                 NA
## http://rs.tdwg.org/dwc/terms/organismScope                                NA
## http://rs.tdwg.org/dwc/terms/associatedOccurrences                        NA
## http://rs.tdwg.org/dwc/terms/associatedOrganisms                          NA
## http://rs.tdwg.org/dwc/terms/previousIdentifications                      NA
## http://rs.tdwg.org/dwc/terms/organismRemarks                              NA
## http://rs.tdwg.org/dwc/terms/eventID                                      12
## http://rs.tdwg.org/dwc/terms/parentEventID                                NA
## http://rs.tdwg.org/dwc/terms/samplingProtocol                             NA
## http://rs.tdwg.org/dwc/terms/sampleSizeValue                              NA
## http://rs.tdwg.org/dwc/terms/sampleSizeUnit                               NA
## http://rs.tdwg.org/dwc/terms/samplingEffort                               NA
## http://rs.tdwg.org/dwc/terms/eventDate                                    NA
## http://rs.tdwg.org/dwc/terms/eventTime                                    NA
## http://rs.tdwg.org/dwc/terms/startDayOfYear                               NA
## http://rs.tdwg.org/dwc/terms/endDayOfYear                                 NA
## http://rs.tdwg.org/dwc/terms/year                                         NA
## http://rs.tdwg.org/dwc/terms/month                                        NA
## http://rs.tdwg.org/dwc/terms/day                                          NA
## http://rs.tdwg.org/dwc/terms/verbatimEventDate                            NA
## http://rs.tdwg.org/dwc/terms/habitat                                      NA
## http://rs.tdwg.org/dwc/terms/fieldNumber                                  NA
## http://rs.tdwg.org/dwc/terms/fieldNotes                                   NA
## http://rs.tdwg.org/dwc/terms/eventRemarks                                 NA
## http://rs.tdwg.org/dwc/terms/locationID                                   NA
## http://rs.tdwg.org/dwc/terms/higherGeographyID                            NA
## http://rs.tdwg.org/dwc/terms/higherGeography                              NA
## http://rs.tdwg.org/dwc/terms/continent                                    NA
## http://rs.tdwg.org/dwc/terms/waterBody                                    NA
## http://rs.tdwg.org/dwc/terms/islandGroup                                  NA
## http://rs.tdwg.org/dwc/terms/island                                       NA
## http://rs.tdwg.org/dwc/terms/country                                      NA
## http://rs.tdwg.org/dwc/terms/countryCode                                  NA
## http://rs.tdwg.org/dwc/terms/stateProvince                                NA
## http://rs.tdwg.org/dwc/terms/county                                       NA
## http://rs.tdwg.org/dwc/terms/municipality                                 NA
## http://rs.tdwg.org/dwc/terms/locality                                     NA
## http://rs.tdwg.org/dwc/terms/verbatimLocality                             NA
## http://rs.tdwg.org/dwc/terms/verbatimElevation                            NA
## http://rs.tdwg.org/dwc/terms/minimumElevationInMeters                     NA
## http://rs.tdwg.org/dwc/terms/maximumElevationInMeters                     NA
## http://rs.tdwg.org/dwc/terms/verbatimDepth                                NA
## http://rs.tdwg.org/dwc/terms/minimumDepthInMeters                         NA
## http://rs.tdwg.org/dwc/terms/maximumDepthInMeters                         NA
## http://rs.tdwg.org/dwc/terms/minimumDistanceAboveSurfaceInMeters          NA
## http://rs.tdwg.org/dwc/terms/maximumDistanceAboveSurfaceInMeters          NA
## http://rs.tdwg.org/dwc/terms/locationAccordingTo                          NA
## http://rs.tdwg.org/dwc/terms/locationRemarks                              NA
## http://rs.tdwg.org/dwc/terms/verbatimCoordinates                          NA
## http://rs.tdwg.org/dwc/terms/verbatimLatitude                             NA
## http://rs.tdwg.org/dwc/terms/verbatimLongitude                            NA
## http://rs.tdwg.org/dwc/terms/verbatimCoordinateSystem                     NA
## http://rs.tdwg.org/dwc/terms/verbatimSRS                                  NA
## http://rs.tdwg.org/dwc/terms/decimalLatitude                              NA
## http://rs.tdwg.org/dwc/terms/decimalLongitude                             NA
## http://rs.tdwg.org/dwc/terms/geodeticDatum                                NA
## http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters                NA
## http://rs.tdwg.org/dwc/terms/coordinatePrecision                          NA
## http://rs.tdwg.org/dwc/terms/pointRadiusSpatialFit                        NA
## http://rs.tdwg.org/dwc/terms/footprintWKT                                 NA
## http://rs.tdwg.org/dwc/terms/footprintSRS                                 NA
## http://rs.tdwg.org/dwc/terms/footprintSpatialFit                          NA
## http://rs.tdwg.org/dwc/terms/georeferencedBy                              NA
## http://rs.tdwg.org/dwc/terms/georeferencedDate                            NA
## http://rs.tdwg.org/dwc/terms/georeferenceProtocol                         NA
## http://rs.tdwg.org/dwc/terms/georeferenceSources                          NA
## http://rs.tdwg.org/dwc/terms/georeferenceVerificationStatus               NA
## http://rs.tdwg.org/dwc/terms/georeferenceRemarks                          NA
## http://rs.tdwg.org/dwc/terms/geologicalContextID                          NA
## http://rs.tdwg.org/dwc/terms/earliestEonOrLowestEonothem                  NA
## http://rs.tdwg.org/dwc/terms/latestEonOrHighestEonothem                   NA
## http://rs.tdwg.org/dwc/terms/earliestEraOrLowestErathem                   NA
## http://rs.tdwg.org/dwc/terms/latestEraOrHighestErathem                    NA
## http://rs.tdwg.org/dwc/terms/earliestPeriodOrLowestSystem                 NA
## http://rs.tdwg.org/dwc/terms/latestPeriodOrHighestSystem                  NA
## http://rs.tdwg.org/dwc/terms/earliestEpochOrLowestSeries                  NA
## http://rs.tdwg.org/dwc/terms/latestEpochOrHighestSeries                   NA
## http://rs.tdwg.org/dwc/terms/earliestAgeOrLowestStage                     NA
## http://rs.tdwg.org/dwc/terms/latestAgeOrHighestStage                      NA
## http://rs.tdwg.org/dwc/terms/lowestBiostratigraphicZone                   NA
## http://rs.tdwg.org/dwc/terms/highestBiostratigraphicZone                  NA
## http://rs.tdwg.org/dwc/terms/lithostratigraphicTerms                      NA
## http://rs.tdwg.org/dwc/terms/group                                        NA
## http://rs.tdwg.org/dwc/terms/formation                                    NA
## http://rs.tdwg.org/dwc/terms/member                                       NA
## http://rs.tdwg.org/dwc/terms/bed                                          NA
## http://rs.tdwg.org/dwc/terms/identificationID                             NA
## http://rs.tdwg.org/dwc/terms/identifiedBy                                 NA
## http://rs.gbif.org/terms/1.0/identifiedByID                               NA
## http://rs.tdwg.org/dwc/terms/dateIdentified                               NA
## http://rs.tdwg.org/dwc/terms/identificationReferences                     NA
## http://rs.tdwg.org/dwc/terms/identificationRemarks                        NA
## http://rs.tdwg.org/dwc/terms/identificationQualifier                      NA
## http://rs.tdwg.org/dwc/terms/identificationVerificationStatus             NA
## http://rs.tdwg.org/dwc/terms/typeStatus                                   NA
## http://rs.tdwg.org/dwc/terms/taxonID                                      NA
## http://rs.tdwg.org/dwc/terms/scientificNameID                             NA
## http://rs.tdwg.org/dwc/terms/acceptedNameUsageID                          NA
## http://rs.tdwg.org/dwc/terms/parentNameUsageID                            NA
## http://rs.tdwg.org/dwc/terms/originalNameUsageID                          NA
## http://rs.tdwg.org/dwc/terms/nameAccordingToID                            NA
## http://rs.tdwg.org/dwc/terms/namePublishedInID                            NA
## http://rs.tdwg.org/dwc/terms/taxonConceptID                               NA
## http://rs.tdwg.org/dwc/terms/scientificName                               13
## http://rs.tdwg.org/dwc/terms/acceptedNameUsage                            NA
## http://rs.tdwg.org/dwc/terms/parentNameUsage                              NA
## http://rs.tdwg.org/dwc/terms/originalNameUsage                            NA
## http://rs.tdwg.org/dwc/terms/nameAccordingTo                              NA
## http://rs.tdwg.org/dwc/terms/namePublishedIn                              NA
## http://rs.tdwg.org/dwc/terms/namePublishedInYear                          NA
## http://rs.tdwg.org/dwc/terms/higherClassification                         NA
## http://rs.tdwg.org/dwc/terms/kingdom                                      14
## http://rs.tdwg.org/dwc/terms/phylum                                       15
## http://rs.tdwg.org/dwc/terms/class                                        16
## http://rs.tdwg.org/dwc/terms/order                                        17
## http://rs.tdwg.org/dwc/terms/family                                       18
## http://rs.tdwg.org/dwc/terms/genus                                        19
## http://rs.tdwg.org/dwc/terms/subgenus                                     NA
## http://rs.tdwg.org/dwc/terms/specificEpithet                              20
## http://rs.tdwg.org/dwc/terms/infraspecificEpithet                         NA
## http://rs.tdwg.org/dwc/terms/taxonRank                                    NA
## http://rs.tdwg.org/dwc/terms/verbatimTaxonRank                            NA
## http://rs.tdwg.org/dwc/terms/scientificNameAuthorship                     NA
## http://rs.tdwg.org/dwc/terms/vernacularName                               21
## http://rs.tdwg.org/dwc/terms/nomenclaturalCode                            NA
## http://rs.tdwg.org/dwc/terms/taxonomicStatus                              NA
## http://rs.tdwg.org/dwc/terms/nomenclaturalStatus                          NA
## http://rs.tdwg.org/dwc/terms/taxonRemarks                                 NA
##                                                                            columnName
## http://purl.org/dc/terms/type                                                    type
## http://purl.org/dc/terms/modified                                            modified
## http://purl.org/dc/terms/language                                                <NA>
## http://purl.org/dc/terms/license                                                 <NA>
## http://purl.org/dc/terms/rightsHolder                                            <NA>
## http://purl.org/dc/terms/accessRights                                            <NA>
## http://purl.org/dc/terms/bibliographicCitation                                   <NA>
## http://purl.org/dc/terms/references                                              <NA>
## http://rs.tdwg.org/dwc/terms/institutionID                                       <NA>
## http://rs.tdwg.org/dwc/terms/collectionID                                        <NA>
## http://rs.tdwg.org/dwc/terms/datasetID                                           <NA>
## http://rs.tdwg.org/dwc/terms/institutionCode                                     <NA>
## http://rs.tdwg.org/dwc/terms/collectionCode                            collectionCode
## http://rs.tdwg.org/dwc/terms/datasetName                                         <NA>
## http://rs.tdwg.org/dwc/terms/ownerInstitutionCode                                <NA>
## http://rs.tdwg.org/dwc/terms/basisOfRecord                              basisOfRecord
## http://rs.tdwg.org/dwc/terms/informationWithheld                                 <NA>
## http://rs.tdwg.org/dwc/terms/dataGeneralizations                                 <NA>
## http://rs.tdwg.org/dwc/terms/dynamicProperties                                   <NA>
## http://rs.tdwg.org/dwc/terms/materialSampleID                                    <NA>
## http://rs.tdwg.org/dwc/terms/occurrenceID                                occurrenceID
## http://rs.tdwg.org/dwc/terms/catalogNumber                              catalogNumber
## http://rs.tdwg.org/dwc/terms/occurrenceRemarks                      occurrenceRemarks
## http://rs.tdwg.org/dwc/terms/recordNumber                                        <NA>
## http://rs.tdwg.org/dwc/terms/recordedBy                                          <NA>
## http://rs.gbif.org/terms/1.0/recordedByID                                        <NA>
## http://rs.tdwg.org/dwc/terms/individualCount                                     <NA>
## http://rs.tdwg.org/dwc/terms/organismQuantity                        organismQuantity
## http://rs.tdwg.org/dwc/terms/organismQuantityType                organismQuantityType
## http://rs.tdwg.org/dwc/terms/sex                                                 <NA>
## http://rs.tdwg.org/dwc/terms/lifeStage                                           <NA>
## http://rs.tdwg.org/dwc/terms/reproductiveCondition                               <NA>
## http://rs.tdwg.org/dwc/terms/behavior                                            <NA>
## http://rs.tdwg.org/dwc/terms/establishmentMeans                                  <NA>
## http://rs.tdwg.org/dwc/terms/occurrenceStatus                        occurrenceStatus
## http://rs.tdwg.org/dwc/terms/preparations                                        <NA>
## http://rs.tdwg.org/dwc/terms/disposition                                         <NA>
## http://rs.tdwg.org/dwc/terms/otherCatalogNumbers                                 <NA>
## http://rs.tdwg.org/dwc/terms/associatedMedia                                     <NA>
## http://rs.tdwg.org/dwc/terms/associatedReferences                                <NA>
## http://rs.tdwg.org/dwc/terms/associatedSequences                                 <NA>
## http://rs.tdwg.org/dwc/terms/associatedTaxa                                      <NA>
## http://rs.tdwg.org/dwc/terms/organismID                                          <NA>
## http://rs.tdwg.org/dwc/terms/organismName                                        <NA>
## http://rs.tdwg.org/dwc/terms/organismScope                                       <NA>
## http://rs.tdwg.org/dwc/terms/associatedOccurrences                               <NA>
## http://rs.tdwg.org/dwc/terms/associatedOrganisms                                 <NA>
## http://rs.tdwg.org/dwc/terms/previousIdentifications                             <NA>
## http://rs.tdwg.org/dwc/terms/organismRemarks                                     <NA>
## http://rs.tdwg.org/dwc/terms/eventID                                          eventID
## http://rs.tdwg.org/dwc/terms/parentEventID                                       <NA>
## http://rs.tdwg.org/dwc/terms/samplingProtocol                                    <NA>
## http://rs.tdwg.org/dwc/terms/sampleSizeValue                                     <NA>
## http://rs.tdwg.org/dwc/terms/sampleSizeUnit                                      <NA>
## http://rs.tdwg.org/dwc/terms/samplingEffort                                      <NA>
## http://rs.tdwg.org/dwc/terms/eventDate                                           <NA>
## http://rs.tdwg.org/dwc/terms/eventTime                                           <NA>
## http://rs.tdwg.org/dwc/terms/startDayOfYear                                      <NA>
## http://rs.tdwg.org/dwc/terms/endDayOfYear                                        <NA>
## http://rs.tdwg.org/dwc/terms/year                                                <NA>
## http://rs.tdwg.org/dwc/terms/month                                               <NA>
## http://rs.tdwg.org/dwc/terms/day                                                 <NA>
## http://rs.tdwg.org/dwc/terms/verbatimEventDate                                   <NA>
## http://rs.tdwg.org/dwc/terms/habitat                                             <NA>
## http://rs.tdwg.org/dwc/terms/fieldNumber                                         <NA>
## http://rs.tdwg.org/dwc/terms/fieldNotes                                          <NA>
## http://rs.tdwg.org/dwc/terms/eventRemarks                                        <NA>
## http://rs.tdwg.org/dwc/terms/locationID                                          <NA>
## http://rs.tdwg.org/dwc/terms/higherGeographyID                                   <NA>
## http://rs.tdwg.org/dwc/terms/higherGeography                                     <NA>
## http://rs.tdwg.org/dwc/terms/continent                                           <NA>
## http://rs.tdwg.org/dwc/terms/waterBody                                           <NA>
## http://rs.tdwg.org/dwc/terms/islandGroup                                         <NA>
## http://rs.tdwg.org/dwc/terms/island                                              <NA>
## http://rs.tdwg.org/dwc/terms/country                                             <NA>
## http://rs.tdwg.org/dwc/terms/countryCode                                         <NA>
## http://rs.tdwg.org/dwc/terms/stateProvince                                       <NA>
## http://rs.tdwg.org/dwc/terms/county                                              <NA>
## http://rs.tdwg.org/dwc/terms/municipality                                        <NA>
## http://rs.tdwg.org/dwc/terms/locality                                            <NA>
## http://rs.tdwg.org/dwc/terms/verbatimLocality                                    <NA>
## http://rs.tdwg.org/dwc/terms/verbatimElevation                                   <NA>
## http://rs.tdwg.org/dwc/terms/minimumElevationInMeters                            <NA>
## http://rs.tdwg.org/dwc/terms/maximumElevationInMeters                            <NA>
## http://rs.tdwg.org/dwc/terms/verbatimDepth                                       <NA>
## http://rs.tdwg.org/dwc/terms/minimumDepthInMeters                                <NA>
## http://rs.tdwg.org/dwc/terms/maximumDepthInMeters                                <NA>
## http://rs.tdwg.org/dwc/terms/minimumDistanceAboveSurfaceInMeters                 <NA>
## http://rs.tdwg.org/dwc/terms/maximumDistanceAboveSurfaceInMeters                 <NA>
## http://rs.tdwg.org/dwc/terms/locationAccordingTo                                 <NA>
## http://rs.tdwg.org/dwc/terms/locationRemarks                                     <NA>
## http://rs.tdwg.org/dwc/terms/verbatimCoordinates                                 <NA>
## http://rs.tdwg.org/dwc/terms/verbatimLatitude                                    <NA>
## http://rs.tdwg.org/dwc/terms/verbatimLongitude                                   <NA>
## http://rs.tdwg.org/dwc/terms/verbatimCoordinateSystem                            <NA>
## http://rs.tdwg.org/dwc/terms/verbatimSRS                                         <NA>
## http://rs.tdwg.org/dwc/terms/decimalLatitude                                     <NA>
## http://rs.tdwg.org/dwc/terms/decimalLongitude                                    <NA>
## http://rs.tdwg.org/dwc/terms/geodeticDatum                                       <NA>
## http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters                       <NA>
## http://rs.tdwg.org/dwc/terms/coordinatePrecision                                 <NA>
## http://rs.tdwg.org/dwc/terms/pointRadiusSpatialFit                               <NA>
## http://rs.tdwg.org/dwc/terms/footprintWKT                                        <NA>
## http://rs.tdwg.org/dwc/terms/footprintSRS                                        <NA>
## http://rs.tdwg.org/dwc/terms/footprintSpatialFit                                 <NA>
## http://rs.tdwg.org/dwc/terms/georeferencedBy                                     <NA>
## http://rs.tdwg.org/dwc/terms/georeferencedDate                                   <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceProtocol                                <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceSources                                 <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceVerificationStatus                      <NA>
## http://rs.tdwg.org/dwc/terms/georeferenceRemarks                                 <NA>
## http://rs.tdwg.org/dwc/terms/geologicalContextID                                 <NA>
## http://rs.tdwg.org/dwc/terms/earliestEonOrLowestEonothem                         <NA>
## http://rs.tdwg.org/dwc/terms/latestEonOrHighestEonothem                          <NA>
## http://rs.tdwg.org/dwc/terms/earliestEraOrLowestErathem                          <NA>
## http://rs.tdwg.org/dwc/terms/latestEraOrHighestErathem                           <NA>
## http://rs.tdwg.org/dwc/terms/earliestPeriodOrLowestSystem                        <NA>
## http://rs.tdwg.org/dwc/terms/latestPeriodOrHighestSystem                         <NA>
## http://rs.tdwg.org/dwc/terms/earliestEpochOrLowestSeries                         <NA>
## http://rs.tdwg.org/dwc/terms/latestEpochOrHighestSeries                          <NA>
## http://rs.tdwg.org/dwc/terms/earliestAgeOrLowestStage                            <NA>
## http://rs.tdwg.org/dwc/terms/latestAgeOrHighestStage                             <NA>
## http://rs.tdwg.org/dwc/terms/lowestBiostratigraphicZone                          <NA>
## http://rs.tdwg.org/dwc/terms/highestBiostratigraphicZone                         <NA>
## http://rs.tdwg.org/dwc/terms/lithostratigraphicTerms                             <NA>
## http://rs.tdwg.org/dwc/terms/group                                               <NA>
## http://rs.tdwg.org/dwc/terms/formation                                           <NA>
## http://rs.tdwg.org/dwc/terms/member                                              <NA>
## http://rs.tdwg.org/dwc/terms/bed                                                 <NA>
## http://rs.tdwg.org/dwc/terms/identificationID                                    <NA>
## http://rs.tdwg.org/dwc/terms/identifiedBy                                        <NA>
## http://rs.gbif.org/terms/1.0/identifiedByID                                      <NA>
## http://rs.tdwg.org/dwc/terms/dateIdentified                                      <NA>
## http://rs.tdwg.org/dwc/terms/identificationReferences                            <NA>
## http://rs.tdwg.org/dwc/terms/identificationRemarks                               <NA>
## http://rs.tdwg.org/dwc/terms/identificationQualifier                             <NA>
## http://rs.tdwg.org/dwc/terms/identificationVerificationStatus                    <NA>
## http://rs.tdwg.org/dwc/terms/typeStatus                                          <NA>
## http://rs.tdwg.org/dwc/terms/taxonID                                             <NA>
## http://rs.tdwg.org/dwc/terms/scientificNameID                                    <NA>
## http://rs.tdwg.org/dwc/terms/acceptedNameUsageID                                 <NA>
## http://rs.tdwg.org/dwc/terms/parentNameUsageID                                   <NA>
## http://rs.tdwg.org/dwc/terms/originalNameUsageID                                 <NA>
## http://rs.tdwg.org/dwc/terms/nameAccordingToID                                   <NA>
## http://rs.tdwg.org/dwc/terms/namePublishedInID                                   <NA>
## http://rs.tdwg.org/dwc/terms/taxonConceptID                                      <NA>
## http://rs.tdwg.org/dwc/terms/scientificName                            scientificName
## http://rs.tdwg.org/dwc/terms/acceptedNameUsage                                   <NA>
## http://rs.tdwg.org/dwc/terms/parentNameUsage                                     <NA>
## http://rs.tdwg.org/dwc/terms/originalNameUsage                                   <NA>
## http://rs.tdwg.org/dwc/terms/nameAccordingTo                                     <NA>
## http://rs.tdwg.org/dwc/terms/namePublishedIn                                     <NA>
## http://rs.tdwg.org/dwc/terms/namePublishedInYear                                 <NA>
## http://rs.tdwg.org/dwc/terms/higherClassification                                <NA>
## http://rs.tdwg.org/dwc/terms/kingdom                                          kingdom
## http://rs.tdwg.org/dwc/terms/phylum                                            phylum
## http://rs.tdwg.org/dwc/terms/class                                              class
## http://rs.tdwg.org/dwc/terms/order                                              order
## http://rs.tdwg.org/dwc/terms/family                                            family
## http://rs.tdwg.org/dwc/terms/genus                                              genus
## http://rs.tdwg.org/dwc/terms/subgenus                                            <NA>
## http://rs.tdwg.org/dwc/terms/specificEpithet                          specificEpithet
## http://rs.tdwg.org/dwc/terms/infraspecificEpithet                                <NA>
## http://rs.tdwg.org/dwc/terms/taxonRank                                           <NA>
## http://rs.tdwg.org/dwc/terms/verbatimTaxonRank                                   <NA>
## http://rs.tdwg.org/dwc/terms/scientificNameAuthorship                            <NA>
## http://rs.tdwg.org/dwc/terms/vernacularName                            vernacularName
## http://rs.tdwg.org/dwc/terms/nomenclaturalCode                                   <NA>
## http://rs.tdwg.org/dwc/terms/taxonomicStatus                                     <NA>
## http://rs.tdwg.org/dwc/terms/nomenclaturalStatus                                 <NA>
## http://rs.tdwg.org/dwc/terms/taxonRemarks                                        <NA>

Creating Metadata

In Darwin Core archive files, the standard format for handling metadata is the Ecological Metadata Language (EML), a flavour of XML specifically designed for ecological metadata. We can extract the EML file for the TOV-E data from the Darwin Core archive file using the following code:

# Extract the EML file from the Darwin Core archive file
unzip(localDataLoc, "eml.xml", exdir = tempDirLoc)
# Print the first few lines of the EML file to get an idea of the structure of the file
cat(readLines(con = file.path(tempDirLoc, "eml.xml"), n = 20, encoding = "UTF-8"), sep = "\n")
## <eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
##          xmlns:dc="http://purl.org/dc/terms/"
##          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
##          xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 http://rs.gbif.org/schema/eml-gbif-profile/1.1/eml.xsd"
##          packageId="4a00502d-6342-4294-aad1-9727e5c24041/v1.6" system="http://gbif.org" scope="system"
##          xml:lang="eng">
## 
## <dataset>
##   <alternateIdentifier>4a00502d-6342-4294-aad1-9727e5c24041</alternateIdentifier>
##   <alternateIdentifier>https://ipt.nina.no/resource?r=tove_birdsampling</alternateIdentifier>
##   <title xml:lang="eng">TOV-E Bird monitoring sampling data</title>
##       <creator>
##     <individualName>
##         <givenName>John Atle</givenName>
##       <surName>Kålås</surName>
##     </individualName>
##     <organizationName>Norwegian Institute for Nature Research</organizationName>
##     <positionName>Senior Researcher</positionName>
##     <address>
##         <deliveryPoint>Postboks 5685 Sluppen</deliveryPoint>

It is also possible to view the entire EML file in a separate browser window by running the following code:

browseURL(file.path(tempDirLoc, "eml.xml"))
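
Individual fields can also be pulled out of an EML file programmatically rather than read by eye. The following is a minimal sketch, assuming the xml2 package is installed (it is not used elsewhere in this document). It parses a cut-down EML fragment modelled on the output above, and the variable names are illustrative only.

```r
# Assumption: the xml2 package is available; it is not part of LivingNorwayR
library(xml2)

# A cut-down EML fragment modelled on the TOV-E output above (illustration only)
emlText <- '<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xml:lang="eng">
<dataset>
  <title xml:lang="eng">TOV-E Bird monitoring sampling data</title>
  <creator>
    <individualName>
      <givenName>John Atle</givenName>
    </individualName>
  </creator>
</dataset>
</eml:eml>'

# Parse the literal XML string into a document object
emlDoc <- read_xml(emlText)

# The elements below the root carry no namespace, so plain XPath expressions work
datasetTitle <- xml_text(xml_find_first(emlDoc, "//dataset/title"))
creatorGivenName <- xml_text(xml_find_first(emlDoc, "//dataset/creator/individualName/givenName"))
```

To apply the same queries to the extracted file, the string could be replaced with the file path, for example read_xml(file.path(tempDirLoc, "eml.xml")).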

From the EML file we can generate a DwCMetadata object by calling the ‘initializeDwCMetadata’ function. Its first argument is the location of the file to import the metadata from, which can be an EML file or a Darwin Core archive file. The optional ‘fileType’ argument states the file format explicitly and can be omitted when the format is clear from the file extension.

# Initialise the metadata from the EML file extracted from the Darwin Core archive
newTOVEMetadata <- initializeDwCMetadata(file.path(tempDirLoc, "eml.xml"),
  fileType = "eml" # This line is not required if the file has the ".xml" file extension
)
# Alternatively the metadata object can be imported directly from the Darwin Core archive file
newTOVEMetadata <- initializeDwCMetadata(localDataLoc,
  fileType = "darwincore" # This line is not required if the file has the ".zip" file extension
)

However, these methods assume that the researcher already has an EML file that can be used to initialise the metadata object. In many instances the researcher will not have an EML file readily available, and for researchers unfamiliar with the standard, creating one can be rather laborious. To help alleviate this, the Living Norway R package adds a number of functions to aid in the creation of EML files. Instead of formatting the metadata according to the EML standard, the researcher can use R markdown to write a text document describing the dataset and simply add tagging functions to the sections of text that should be exported to the EML file. (Note that there are a few different R packages that can be used to make valid EML (see: EML with R), all of which have different challenges associated with them.)

R markdown is a simple text-based documentation language and can be thought of as an extension of plain text files with some extra support for formatting and display of text elements. In addition, R markdown allows R code to be embedded within the document, which can be used to draw figures or make tables from data. This can be very useful for describing or displaying aspects of data sets.
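
To make the structure of such a document concrete, the sketch below writes a very small R markdown file from R using base functions only. The file contents and variable names are invented for illustration, and the example deliberately leaves out the Living Norway tagging functions, which are documented with the package itself.

```r
# Three backticks delimit a code chunk in R markdown; build the fence programmatically
chunkFence <- strrep("`", 3)

# Lines of a minimal R markdown document: a YAML header, prose, and one R code chunk
rmdLines <- c(
  "---",
  "title: \"Example dataset description\"",
  "---",
  "",
  "This data set contains bird counts from a hypothetical monitoring scheme.",
  "",
  paste0(chunkFence, "{r}"),
  "# R code embedded in the document is run when the document is rendered",
  "summary(c(3, 5, 8))",
  chunkFence
)

# Write the document to a temporary file
exampleRmdPath <- tempfile(fileext = ".Rmd")
writeLines(rmdLines, exampleRmdPath)

# Count the R code chunks found in the file that was written
nChunks <- sum(startsWith(readLines(exampleRmdPath), paste0(chunkFence, "{r")))
```

If the rmarkdown package is installed, a file like this can be turned into a formatted document with rmarkdown::render(exampleRmdPath).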

A minimal example of using R markdown to create metadata documentation can be found at the Living Norway Git Repository.

# Initialise the metadata from the R markdown file hosted in the LivingNorwayR GitHub repository
download.file("https://raw.githubusercontent.com/LivingNorway/LivingNorwayR/master/vignettes/LNWorkshopExample_TOV-E_Metadata.rmd",
  file.path(tempDirLoc, "LNWorkshopExample_Metadata.rmd"))
createdTOVEMetadata <- initializeDwCMetadata(file.path(tempDirLoc, "LNWorkshopExample_Metadata.rmd"),
  fileType = "rmarkdown" # This line is not required if the file has the ".rmd" or ".md" file extension
)
# Export the newly created metadata as an EML file
createdTOVEMetadata$exportToEML(file.path(tempDirLoc, "newMetadata.xml"))

Again, the created metadata can be viewed in a browser using the following command:

browseURL(file.path(tempDirLoc, "newMetadata.xml"))

Putting it All Together

We now have all the components that we need to package the data tables and the metadata together into one Darwin Core archive object. This is done by calling the initializeDwCArchive function, giving the core table as the first argument, a list of all the extension tables as the second argument, and a DwCMetadata object as the third argument.

newTOVEArchive <- initializeDwCArchive(newTOVEEventTable, list(newTOVEOccTable), newTOVEMetadata)

Finally, we can export the archive object as a Darwin Core archive file to whichever location we wish to store it.

newTOVEArchive$exportAsDwCArchive(file.path(tempDirLoc, "newDwCArchive.zip"))
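
After exporting, it can be worth checking that the archive contains the pieces a Darwin Core archive is expected to have. The sketch below uses a hypothetical helper (not part of LivingNorwayR) that compares a list of file names against the conventional "meta.xml" descriptor and "eml.xml" metadata file. The data file names in the listings are made up for illustration and may differ in a real archive.

```r
# Hypothetical helper (not part of LivingNorwayR): report which expected
# components are absent from a vector of file names found inside an archive
missingArchiveComponents <- function(fileNames, expected = c("meta.xml", "eml.xml")) {
  setdiff(expected, basename(fileNames))
}

# Illustration with made-up file listings
completeListing <- c("event.txt", "occurrence.txt", "meta.xml", "eml.xml")
incompleteListing <- c("event.txt", "occurrence.txt", "eml.xml")

missingFromComplete <- missingArchiveComponents(completeListing)     # character(0)
missingFromIncomplete <- missingArchiveComponents(incompleteListing) # "meta.xml"
```

For the archive written above, the file names could be obtained with unzip(file.path(tempDirLoc, "newDwCArchive.zip"), list = TRUE)$Name and passed to the helper.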

This Darwin Core archive file can now serve as a useful interchange format that ensures that all the data and metadata are packaged together and that both are described using known biodiversity standards. This satisfies the basic tenets of FAIR data sharing and allows your data to be indexed by biodiversity databases such as GBIF.