The last post, on April 2, 2020, explained how a human user can find CMIP6 data citations. When the data citation information was not stored during the ESGF data download and many datasets have been analyzed, script-based access to the data citations is required.
There are several APIs available at DKRZ, which are documented at https://www.wdc-climate.de/ui/cmip-api-docs/, and one provided by DataCite:
1. Citation Search API
In addition to the Citation Search GUI for human users, the Citation Search API provides flexible machine access to selected CMIP6 data citations in JSON format. The response contains all components of a data reference as well as data use information:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search
Available filter options:
- filter by DRS:
  - mipEra
  - activityId
  - institutionId
  - sourceId
  - experimentId
- filter by granularity:
  - granularity=[exp|model]
- filter by date (in ISO 8601 format):
  - gePublicationDate=YYYY-MM-DD: DOI published at or after a given date
  - lePublicationDate=YYYY-MM-DD: DOI published before or at a given date
Sample Calls:
- Data references on experiment (fine) granularity for a given source_id and activity_id:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search?mipEra=CMIP6&activityId=CMIP&sourceId=HadGEM3-GC31-MM&granularity=exp
- Update data references of request 1. with data references published at or after a given date:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search?mipEra=CMIP6&activityId=CMIP&sourceId=HadGEM3-GC31-MM&granularity=exp&gePublicationDate=2020-01-01
- Data references on model/MIP (coarse) granularity contributing to an activity_id available at a given snap-shot date:
http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search?mipEra=CMIP6&activityId=ScenarioMIP&granularity=model&lePublicationDate=2020-03-31
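The sample calls above can also be assembled in a script. The following is a minimal Python sketch, not an official client: the helper names `build_search_url` and `fetch_json` are made up for illustration, and only the endpoint and the filter names come from the text above. The structure of the JSON response is not assumed here; inspect it before relying on particular fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as given in the text above.
CMIP6_SEARCH_URL = "http://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6search"

# Filter names listed above; anything else is rejected before a request is made.
ALLOWED_FILTERS = {
    "mipEra", "activityId", "institutionId", "sourceId", "experimentId",
    "granularity", "gePublicationDate", "lePublicationDate",
}


def build_search_url(**filters):
    """Return a Citation Search API URL for the given filters (hypothetical helper)."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    return f"{CMIP6_SEARCH_URL}?{urlencode(filters)}"


def fetch_json(url, timeout=30):
    """Download a URL and decode its JSON body (hypothetical helper, needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Rebuild the first sample call from the list above.
url = build_search_url(
    mipEra="CMIP6", activityId="CMIP", sourceId="HadGEM3-GC31-MM", granularity="exp",
)
print(url)
```

Calling `fetch_json(url)` would then return the decoded response. Adding `gePublicationDate="2020-01-01"` to the same call gives the incremental update described in the second sample.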
2. Direct access using DRS_id
The content of the CMIP6 DOI landing pages is provided in two additional machine-readable formats: JSON and XML. The underlying metadata standard is that of DataCite 4 (see documentation: https://doi.org/10.14454/7xq3-zf69; schema definition: http://schema.datacite.org/meta/kernel-4/metadata.xsd):
http://cera-www.dkrz.de/WDCC/meta/CMIP6/<mip_era>.<activity_drs>.<institution_id>.<source_id>[.<experiment_id>].[json|xml]
For possible values of the DRS (Data Reference Syntax) components, please check the CMIP6 Controlled Vocabulary at:
https://github.com/WCRP-CMIP/CMIP6_CVs.
Example calls for json format:
a. Model/MIP granularity:
http://cera-www.dkrz.de/WDCC/meta/CMIP6/CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.json
b. Experiment granularity:
http://cera-www.dkrz.de/WDCC/meta/CMIP6/CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.json
It is possible to use the ESGF Search API to collect these JSON URLs for the 'experiment granularity' from the ESGF index: the 'citation_url' is part of the information stored for every dataset. More information on the ESGF Search API is available at: https://esgf.github.io/esg-search/ESGF_Search_RESTful_API.html
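As an illustration of the URL pattern above, here is a small Python sketch that assembles the landing-page metadata URL from DRS components. The function name `drs_metadata_url` and its parameter names are hypothetical; only the base URL and the pattern come from the text above.

```python
# Base URL as given in the text above.
META_BASE_URL = "http://cera-www.dkrz.de/WDCC/meta/CMIP6/"


def drs_metadata_url(mip_era, activity_drs, institution_id, source_id,
                     experiment_id=None, fmt="json"):
    """Build the metadata URL for a DRS_id (hypothetical helper).

    Omitting experiment_id addresses the model/MIP granularity;
    passing it addresses the experiment granularity.
    """
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    parts = [mip_era, activity_drs, institution_id, source_id]
    if experiment_id is not None:
        parts.append(experiment_id)
    if not all(parts):
        raise ValueError("DRS components must be non-empty strings")
    return f"{META_BASE_URL}{'.'.join(parts)}.{fmt}"


# Model/MIP granularity and experiment granularity for the same model.
model_url = drs_metadata_url("CMIP6", "CMIP", "CNRM-CERFACS", "CNRM-ESM2-1")
experiment_url = drs_metadata_url("CMIP6", "CMIP", "CNRM-CERFACS", "CNRM-ESM2-1", "1pctCO2")
print(model_url)
print(experiment_url)
```

The sketch does not validate the components against the CMIP6 Controlled Vocabulary; a script that needs this check would have to load the CV files separately.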
3. API to list data citations based on DRS components
A list of available CMIP6 data citations in a simple JSON response can be requested via an API:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations
Available attributes are combined as logical AND: institutionId, sourceId, complete (true|false), drsId.
- Search for all data references for an institution_id:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations?institutionId=CNRM-CERFACS
- Search for all data references for a source_id from an institution_id:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations?institutionId=CNRM-CERFACS&sourceId=CNRM-ESM2-1
- Search for all completed data references for an institution_id:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations?institutionId=CNRM-CERFACS&complete=true
- Search for a specific data reference by drs_id:
https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations?drsId=CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1
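A script that builds these calls has to write the `complete` flag in lowercase, since Python's `str(True)` yields `True` rather than `true`. The following Python sketch shows one way to do this; the helper name `build_citations_url` and its keyword arguments are hypothetical, and only the endpoint and attribute names come from the text above.

```python
from urllib.parse import urlencode

# Endpoint as given in the text above.
CMIP6_CITATIONS_URL = "https://cera-www.dkrz.de/WDCC/ui/cerasearch/cerarest/cmip6Citations"


def build_citations_url(institution_id=None, source_id=None, complete=None, drs_id=None):
    """Build a cmip6Citations URL; the attributes given are combined as logical AND."""
    params = {}
    if institution_id is not None:
        params["institutionId"] = institution_id
    if source_id is not None:
        params["sourceId"] = source_id
    if complete is not None:
        # Serialize the boolean as lowercase true/false, as in the sample call.
        params["complete"] = "true" if complete else "false"
    if drs_id is not None:
        params["drsId"] = drs_id
    if not params:
        return CMIP6_CITATIONS_URL
    return f"{CMIP6_CITATIONS_URL}?{urlencode(params)}"


# All completed data references for one institution_id.
completed_url = build_citations_url(institution_id="CNRM-CERFACS", complete=True)
print(completed_url)
```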
4. DataCite REST API to list data citations based on DRS components
- Look-up of data citation information for a given DOI:
https://api.datacite.org/dois/<DOI>
https://api.datacite.org/dois/10.22033/ESGF/CMIP6.12104
- Search through all CMIP6 DOIs to extract information for specific DRS_ids:
This requires a two-step approach:
  - Access all CMIP6 DOIs:
  https://api.datacite.org/dois?query=publisher:Earth%20System%20Grid%20Federation
  - Search through the entries in the JSON response to identify them by their DRS under 'attributes/subjects/subject' with subjectScheme='DRS', e.g.
    "id": "10.22033/esgf/cmip6.14633",
    "type": "dois",
    "attributes": {
      "doi": "10.22033/esgf/cmip6.14633",
      "identifiers": [],
      "creators": […],
      "titles": […],
      "publisher": "Earth System Grid Federation",
      "container": {…},
      "publicationYear": 2021,
      "subjects": [
        {…},
        {…},
        {
          "subject": "CMIP6.PAMIP.MOHC.HadGEM3-GC31-MM.pa-futAntSIC",
          "subjectScheme": "DRS"
        }
      ],
      ...
References and Links:
CMIP6 Citation Service: https://cmip6cite.wdc-climate.de
CMIP6: https://pcmdi.llnl.gov/CMIP6/
CMIP6 Registration/CV: https://github.com/WCRP-CMIP/CMIP6_CVs
DKRZ API documentation: https://www.wdc-climate.de/ui/cmip-api-docs/
DataCite: https://datacite.org
DataCite API documentation: https://support.datacite.org/docs/api