There are several steps in preparing and mobilising your datasets. The sections below provide only a brief overview of these steps and aim to help you locate the comprehensive guides prepared by GBIF, which are available via their website. If you require advice or assistance with any part of mobilising your data, please contact us.
Identify datasets
The first key step in mobilising your data is to identify which biodiversity datasets you can mobilise through GBIF. Working through each dataset will give you the information you need for the subsequent steps, and will also make it easier to reach out to us for help should you need any.
When identifying datasets, there are several key considerations:
- Is the dataset relevant to GBIF’s scope and objectives?
- Does the dataset fit one of the classes of dataset accepted by GBIF? For digitised resources, these are based around a central core of Occurrence, Checklist or Event. For undigitised resources it is possible to create a ‘metadata-only’ dataset.
- What are the intellectual property rights on the dataset?
- Do you have the authority to publish the data, or will publication of the dataset require prior approval from other people and/or organisations?
- Is the dataset able to be published under a CC0, CC BY or CC BY-NC license?
- Should the dataset be published with open access?
- Does the dataset contain sensitive or personal information?
- Could some data be generalised or removed to protect sensitive components whilst still maintaining value to data users?
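On the last point, a common way to generalise sensitive location data is to reduce coordinate precision before publication. The sketch below is only a minimal illustration, assuming decimal latitude/longitude values; choose a precision appropriate to the sensitivity of the records, and record any generalisation applied (for example, in the Darwin Core dataGeneralizations and coordinateUncertaintyInMeters fields).

```python
def generalise_coordinates(latitude, longitude, decimal_places=2):
    """Round decimal coordinates to reduce precision.

    Two decimal places corresponds to roughly 1 km; adjust to suit the
    sensitivity of the taxon or site being protected.
    """
    return round(latitude, decimal_places), round(longitude, decimal_places)

# Example: a precise record generalised before publication.
print(generalise_coordinates(-41.28664, 174.77557))  # (-41.29, 174.78)
```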
Secure institutional agreements
Once you decide to share data through the GBIF network, and have identified suitable datasets, you will need to obtain the necessary agreements from your institution to publish the data on its behalf.
GBIF only accepts data published by organisations, so individuals wishing to publish data need to work with an affiliated organisation or consider submitting a data paper to an appropriate journal.
Register as a data publisher
To publish data to GBIF, your organisation needs to register as a data publisher. This involves reviewing the data publisher agreement, completing an online registration form and receiving endorsement from a Participant Node; for organisations within Aotearoa New Zealand the request for endorsement will normally be sent to GBIF-NZ. Endorsement guidelines explaining the rationale and criteria used to endorse data publishers are available on the GBIF website.
The online registration form asks for information such as your contact details, the type of datasets you plan to mobilise and how you plan to do so, and whether you require assistance to publish your data.
Publish your data
To publish data to the GBIF network, data holders will need to prepare their dataset(s) and accompanying metadata, and set up a publishing process.
The most common method of publishing data is to use GBIF’s Integrated Publishing Toolkit (IPT). Once set up, the IPT provides easy-to-follow web pages to help you create metadata for each resource, map your data to Darwin Core (and other relevant standards where required), and publish your data as a Darwin Core Archive that can be harvested and processed by GBIF.
Other publishing arrangements are also available, such as having your data hosted or using the GBIF API.
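To illustrate the latter, the read-only sketch below (Python, standard library only) retrieves a registered dataset's entry from the public GBIF registry API. The dataset key shown is a placeholder, and publishing or updating records through the API requires credentials and coordination with GBIF, so treat this only as a starting point.

```python
import json
from urllib.request import urlopen

# Placeholder dataset key (UUID); GBIF assigns the real key when a dataset is registered.
DATASET_KEY = "00000000-0000-0000-0000-000000000000"

# Retrieve the dataset's registry entry, which includes its title, licence and endpoints.
with urlopen(f"https://api.gbif.org/v1/dataset/{DATASET_KEY}") as response:
    dataset = json.load(response)

print(dataset["title"])
print(dataset.get("license"))
```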
Preparation of a dataset requires mapping the dataset to the appropriate standard (usually Darwin Core). Depending on the structure of your data, this may require multiple steps, and even content editing, before the data can be mapped to the standard.
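As a simple illustration of this mapping step, the sketch below renames hypothetical local column names to their Darwin Core equivalents and writes out a file ready for the IPT; the column and file names are assumptions, so substitute the fields used in your own dataset. The same mapping can also be done interactively in the IPT's mapping pages or in a spreadsheet.

```python
import csv

# Hypothetical mapping from local column names to Darwin Core terms.
COLUMN_MAP = {
    "record_id": "occurrenceID",
    "species_name": "scientificName",
    "date_observed": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "recorder": "recordedBy",
}

with open("my_dataset.csv", newline="", encoding="utf-8") as src, \
     open("occurrence.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=list(COLUMN_MAP.values()))
    writer.writeheader()
    for row in reader:
        # Copy each mapped field across, leaving missing values blank.
        writer.writerow({dwc: row.get(local, "") for local, dwc in COLUMN_MAP.items()})
```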
GBIF has many resources available online to guide you in mapping and publishing your data. We have highlighted a selection of these in Standards and tools.
Maintaining your dataset
When preparing your dataset for publication you should also consider how it will be maintained over time. For example:
- Will there be a need to update the accompanying metadata (e.g., contact details)?
- Will the dataset be static (i.e., no new records added) once it has been published? Or will new or updated records be added over time?
- How will you respond to data quality issues that are raised by data users or identified through GBIF’s data processing (see the sketch after this list)?
- How will you respond if there are changes to the data standards, particularly if they enhance the quality and utility of your data?
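On data quality: during processing GBIF flags records with standard issue codes (for example TAXON_MATCH_NONE or COUNTRY_COORDINATE_MISMATCH). The sketch below uses the public occurrence search API to count flagged records in a published dataset; the dataset key is a placeholder and the issue codes listed are only a small sample.

```python
import json
from urllib.request import urlopen

# Placeholder dataset key (UUID) for a published dataset.
DATASET_KEY = "00000000-0000-0000-0000-000000000000"

# A small sample of the quality flags GBIF applies during processing.
ISSUES = ["TAXON_MATCH_NONE", "COUNTRY_COORDINATE_MISMATCH", "ZERO_COORDINATE"]

for issue in ISSUES:
    url = (
        "https://api.gbif.org/v1/occurrence/search"
        f"?datasetKey={DATASET_KEY}&issue={issue}&limit=0"
    )
    # limit=0 returns just the record count for the filter.
    with urlopen(url) as response:
        count = json.load(response)["count"]
    print(f"{issue}: {count} records flagged")
```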