The resources below provide key links for getting started and for finding additional information on mobilising biodiversity data using Darwin Core and the Integrated Publishing Toolkit.
Data standards and formats
Name | Description | Resources |
---|---|---|
Darwin Core (DwC) | DwC is the primary standard for mobilising biodiversity data in the GBIF network. It is used to describe the occurrence of organisms in nature as recorded by observations, specimens and samples. DwC is maintained by Biodiversity Information Standards (TDWG). | Darwin Core Standard; Darwin Core Quick Reference Guide |
Darwin Core Archive (DwC-A) | DwC-A is the main format used to publish biodiversity data in the GBIF network. It is a star-schema archive (ZIP) containing a metadata file, a descriptor file that defines the structure and relationships of the data files, and one or more data files in TSV and/or CSV format. | GBIF page on Darwin Core and archives |
Ecological Metadata Language (EML) | EML is used in a Darwin Core Archive to record the metadata for the published resource. EML provides a vocabulary for documenting research data. | EML Standard |
Comma Separated Values (CSV) | CSV files are used for some data files within a Darwin Core Archive. CSV is a commonly used format that uses a comma to separate values within a record, and line breaks (CRLF) to separate records. Care must be taken when using CSV with biodiversity data, as many fields may contain commas, line breaks and other characters that can result in poorly formed CSV. | RFC 4180; W3C Model for Tabular Data |
Tab Separated Values (TSV) | TSV files are used for some data files within a Darwin Core Archive. TSV is a commonly used format that uses the tab character to separate values within a record. TSV is often preferred over CSV because its field separator (tab) is less likely to appear in the data than the comma used by CSV. | W3C Model for Tabular Data; IANA Media Type |
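
The caution about commas and line breaks in CSV fields can be illustrated with a short Python sketch. It assumes the standard-library `csv` module is used to write and read the data files. The record is invented for illustration, and the helper names `serialise` and `parse` are hypothetical. Naive string joining produces too many columns, whereas a CSV-aware writer quotes the problem fields so that both the CSV and TSV forms read back unchanged.

```python
import csv
import io

# Invented example record using Darwin Core term names as column headers.
# The locality contains a comma and the remarks contain a line break.
record = {
    "occurrenceID": "example:occ:1",
    "scientificName": "Apteryx mantelli",
    "locality": "Example Reserve, North Island",
    "occurrenceRemarks": "Heard calling\nat dusk",
}
fields = list(record)

# Naive joining: the first physical line splits into more columns
# than there are headers, i.e. poorly formed CSV.
naive_line = ",".join(record.values())
naive_columns = naive_line.split("\n")[0].split(",")


def serialise(records, fieldnames, delimiter):
    """Write records as delimited text, quoting fields where needed."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\r\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def parse(text, delimiter):
    """Read delimited text back into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter))


csv_text = serialise([record], fields, ",")
tsv_text = serialise([record], fields, "\t")

print(len(naive_columns), "columns from naive joining,", len(fields), "expected")
print(parse(csv_text, ",") == [record])   # round trip through CSV
print(parse(tsv_text, "\t") == [record])  # round trip through TSV
```

In the TSV output only the remarks field needs quoting, because the comma in the locality is no longer a separator. This is the practical reason the tab character tends to be the safer choice.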
Tools and services that assist data mobilisation
Name | Purpose | Scope |
---|---|---|
Integrated Publishing Toolkit (IPT) | Mobilisation | IPT is a free, open source software tool that is used to publish biodiversity datasets to the GBIF network. |
GBIF Data Validator | Format validation | Validates a Darwin Core Archive file. |
Darwin Core Archives Examples | Templates for familiarisation | Example spreadsheet templates for occurrence, checklist and sampling-event datasets. |
New Zealand Organisms Register (NZOR) | Data validation | A list of the names of organisms relevant to New Zealand. In addition to the website, NZOR provides a Matching service that can be used to validate names. |
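
Before an archive is submitted to a validation service, a quick local check that the ZIP contains the expected parts (a descriptor, a metadata file and at least one data file) can catch packaging mistakes early. The following is a minimal sketch in Python and does not reproduce what the GBIF Data Validator checks. The file names `meta.xml` and `eml.xml` and the list of data-file suffixes are assumptions based on the conventional archive layout, and `summarise_archive` is a hypothetical helper.

```python
import io
import zipfile

# Assumed conventional names; a real archive's descriptor may point to
# differently named metadata and data files.
DESCRIPTOR = "meta.xml"
METADATA = "eml.xml"
DATA_SUFFIXES = (".csv", ".tsv", ".txt")


def summarise_archive(source):
    """Report which expected parts are present in a Darwin Core Archive ZIP."""
    with zipfile.ZipFile(source) as archive:
        # Ignore directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    return {
        "has_descriptor": DESCRIPTOR in names,
        "has_metadata": METADATA in names,
        "data_files": sorted(n for n in names if n.lower().endswith(DATA_SUFFIXES)),
    }


# Build a tiny in-memory archive so the sketch runs without external files.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("meta.xml", "<archive/>")
    archive.writestr("eml.xml", "<eml/>")
    archive.writestr("occurrence.txt", "occurrenceID\tscientificName\r\n")
demo.seek(0)

summary = summarise_archive(demo)
print(summary)
```

Passing a file path instead of the in-memory buffer works the same way, since `zipfile.ZipFile` accepts either.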