DataShades/ckanext-doi

ckanext-doi

A CKAN extension for assigning a digital object identifier (DOI) to datasets, using the DataCite DOI service.

Overview

This extension assigns a digital object identifier (DOI) to datasets, using the DataCite/Crossref DOI service.

When a new dataset is created it is assigned a new DOI. This DOI will be in the format:

https://doi.org/[prefix]/[8 random alphanumeric characters]

If the new dataset is active and public, the DOI and metadata will be registered with DataCite/Crossref.

If the dataset is draft or private, the DOI will not be registered with DataCite/Crossref. Once the dataset is made active and public, the DOI will be registered. This allows datasets to be embargoed while still providing a DOI that can be referenced in publications.
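As a rough illustration of the rules above, the following minimal sketch builds a DOI in the described format and decides whether a dataset qualifies for registration. The helper names `generate_doi` and `should_register` are hypothetical and not part of this extension, and the suffix alphabet (upper-case letters and digits) is an assumption. The `state` and `private` keys are the standard CKAN package fields.

```python
import secrets
import string

# Assumption: the suffix alphabet is upper-case letters plus digits; the
# extension's real alphabet may differ.
SUFFIX_ALPHABET = string.ascii_uppercase + string.digits


def generate_doi(prefix, length=8):
    """Return '<prefix>/<random suffix>' (hypothetical helper)."""
    suffix = "".join(secrets.choice(SUFFIX_ALPHABET) for _ in range(length))
    return f"{prefix}/{suffix}"


def should_register(pkg_dict):
    """Only datasets that are both active and public get registered."""
    return pkg_dict.get("state") == "active" and not pkg_dict.get("private", False)


doi = generate_doi("10.1234")
print(f"https://doi.org/{doi}")
print(should_register({"state": "active", "private": False}))  # True
print(should_register({"state": "draft", "private": False}))   # False
```

A private or draft dataset still receives a DOI from `generate_doi`; it is only the registration step that is deferred.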

You will need a DataCite/Crossref account to use this extension.

DOI Metadata

For DataCite, this extension currently uses the DataCite Metadata Schema v4.5. For Crossref, it currently uses the Crossref Metadata Schema v5.3.1.

Dataset package fields and CKAN config settings are mapped to the DataCite Schema with default values, but these can be overridden by implementing the IDoi interface methods.

Required fields

| CKAN Field | DataCite Schema |
|---|---|
| dataset:title | title |
| dataset:author | creator |
| config:ckanext.doi.publisher | publisher |
| dataset:metadata_created.year | publicationYear |
| dataset:type | resourceType |

See metadata.py for full mapping details.

| CKAN Field | Crossref Schema |
|---|---|
| dataset:title | title |
| config:ckanext.doi.publisher | publisher |
| config:ckanext.doi.account_name | email_address |
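The DataCite mapping listed above can be pictured with a short sketch. It is only an illustration of the table, not the code in metadata.py: the function name `required_datacite_fields` and the flat output dict are assumptions, and the sample package values are made up.

```python
from datetime import datetime


def required_datacite_fields(pkg_dict, publisher):
    """Map CKAN package fields to the required DataCite fields (illustrative)."""
    # CKAN stores metadata_created as an ISO 8601 timestamp string.
    created = datetime.fromisoformat(pkg_dict["metadata_created"])
    return {
        "title": pkg_dict["title"],
        "creator": pkg_dict["author"],
        "publisher": publisher,  # taken from ckanext.doi.publisher
        "publicationYear": created.year,
        "resourceType": pkg_dict["type"],
    }


# Made-up example package, for illustration only.
pkg = {
    "title": "Example dataset",
    "author": "A. Curator",
    "metadata_created": "2024-01-15T10:20:30.123456",
    "type": "dataset",
}
fields = required_datacite_fields(pkg, "Natural History Museum")
print(fields["publicationYear"])  # 2024
```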

Installation

Path variables used below:

  • $INSTALL_FOLDER (i.e. where CKAN is installed), e.g. /usr/lib/ckan/default
  • $CONFIG_FILE, e.g. /etc/ckan/default/development.ini

Installing from PyPI

pip install ckanext-doi

Installing from source

  1. Clone the repository into the src folder:

    cd $INSTALL_FOLDER/src
    git clone https://github.com/NaturalHistoryMuseum/ckanext-doi.git
  2. Activate the virtual env:

    . $INSTALL_FOLDER/bin/activate
  3. Install via pip:

    pip install $INSTALL_FOLDER/src/ckanext-doi

Installing in editable mode

Installing from a pyproject.toml in editable mode (i.e. pip install -e) requires setuptools>=64; however, CKAN 2.9 requires setuptools==44.1.0. See our CKAN fork for a version of v2.9 that uses an updated setuptools if this functionality is something you need.

Post-install setup

  1. Add 'doi' to the list of plugins in your $CONFIG_FILE:

    ckan.plugins = ... doi
  2. Upgrade the database to create the tables:

    ckan -c $CONFIG_FILE db upgrade -p doi
  3. This extension will only work if you have signed up for an account with DataCite/Crossref. You will need a development/test account to use this plugin in test mode, and a live account to mint active DOIs.

Configuration

These are the options that can be specified in your .ini config file.

DataCite Credentials [REQUIRED]

DataCite Repository account credentials are used to register DOIs. A Repository account is administered by a DataCite Member.

| Name | Description | Example |
|---|---|---|
| ckanext.doi.account_name | Your DataCite/Crossref Repository account name | ABC.DEFG |
| ckanext.doi.account_password | Your DataCite/Crossref Repository account password | |
| ckanext.doi.prefix | The prefix taken from your DataCite/Crossref Repository account (from your test account if running in test mode) | 10.1234 |

Institution Name [REQUIRED]

You also need to provide the name of the institution publishing the DOIs (e.g. Natural History Museum).

| Name | Description |
|---|---|
| ckanext.doi.publisher | The name of the institution publishing the DOI |

Platform Name [OPTIONAL]

You can optionally specify the platform used to publish the DOIs (crossref or datacite). The default is datacite.

| Name | Description |
|---|---|
| ckanext.doi.platform | The name of the platform for publishing the DOI |

Disable DOI minting on dataset update [OPTIONAL]

Disables the minting trigger that runs when a dataset is updated. This is useful if you want to trigger the DOI minting process in a custom way.

| Name | Description |
|---|---|
| ckanext.doi.disable_on_update | True/False |

Test/Debug Mode [REQUIRED]

If test mode is set to true, the DOIs will use the DataCite/Crossref test site. The test site uses a separate account, so you must also change your credentials and prefix.

| Name | Description | Options |
|---|---|---|
| ckanext.doi.test_mode | Enable dev/test mode | True/False |

Note that the DataCite DOIs will still display on your web interface as https://doi.org/YOUR-DOI, but they will not resolve. Log in to your test account to view all your minted test DOIs, or replace https://doi.org/ with https://doi.test.datacite.org/dois/ in a single URL to view a specific DOI.
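The URL substitution described above is a plain prefix swap. A minimal sketch follows; the helper name `to_test_site_url` and the example DOI are hypothetical.

```python
LIVE_RESOLVER = "https://doi.org/"
TEST_SITE = "https://doi.test.datacite.org/dois/"


def to_test_site_url(doi_url):
    """Rewrite a displayed DOI link so it points at the DataCite test site."""
    if not doi_url.startswith(LIVE_RESOLVER):
        raise ValueError(f"not a doi.org URL: {doi_url}")
    return TEST_SITE + doi_url[len(LIVE_RESOLVER):]


print(to_test_site_url("https://doi.org/10.1234/ABCD1234"))
# https://doi.test.datacite.org/dois/10.1234/ABCD1234
```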

Crossref TBA

Other options

| Name | Description | Default |
|---|---|---|
| ckanext.doi.site_url | Used to build the link back to the dataset | ckan.site_url |
| ckanext.doi.site_title | Site title to use in the citation | None |
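To show how the options above fit together, here is a small sketch that parses an example config fragment and reports any missing required options. It is not part of the extension: the checker `check_doi_config` is hypothetical, the password is a placeholder, and placing the options under `[app:main]` follows the usual layout of a CKAN .ini file, which is assumed here.

```python
import configparser

# Example fragment; the account name and prefix are the sample values from
# the tables above, the password is a placeholder.
EXAMPLE_INI = """
[app:main]
ckan.plugins = doi
ckanext.doi.account_name = ABC.DEFG
ckanext.doi.account_password = change-me
ckanext.doi.prefix = 10.1234
ckanext.doi.publisher = Natural History Museum
ckanext.doi.platform = datacite
ckanext.doi.test_mode = True
"""

REQUIRED = (
    "ckanext.doi.account_name",
    "ckanext.doi.account_password",
    "ckanext.doi.prefix",
    "ckanext.doi.publisher",
    "ckanext.doi.test_mode",
)


def check_doi_config(ini_text):
    """Return a list of problems found in the DOI options (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["app:main"]
    problems = [f"missing {key}" for key in REQUIRED if not section.get(key)]
    # The platform is optional and defaults to datacite.
    platform = section.get("ckanext.doi.platform", "datacite")
    if platform not in ("datacite", "crossref"):
        problems.append(f"unknown platform: {platform}")
    return problems


print(check_doi_config(EXAMPLE_INI))  # []
```

Remember that in test mode the account name, password and prefix must be those of your test account.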

Usage

Commands

doi

  1. delete-dois: delete all DOIs from the database (not from DataCite).

    ckan -c $CONFIG_FILE doi delete-dois
  2. update-doi: update the DataCite metadata for one or all packages.

    ckan -c $CONFIG_FILE doi update-doi [PACKAGE_ID]

Interfaces

The IDoi interface allows plugins to extend the build_metadata_dict and build_xml_dict methods.

build_metadata_dict(pkg_dict, metadata_dict, errors)

Breaking changes from v1:

  1. previously called build_metadata
  2. new parameter: errors
  3. new return value: tuple of metadata_dict, errors

Extracts metadata from a pkg_dict for use in generating DataCite DOIs. The base method from this extension is run first, then the metadata dict is passed through all the implementations of this method. If any of the required values (see above) are still in the errors dict after that (i.e. no other extension could handle them), a DOIMetadataException is raised.

| Parameter | Description |
|---|---|
| pkg_dict | The original package dictionary from which the metadata were extracted. |
| metadata_dict | The current metadata dict, created by the ckanext-doi extension and any previous plugins implementing IDoi. |
| errors | A dictionary of metadata keys and errors generated by previous plugins; this method should remove any keys that it successfully processes and overwrites. |
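The contract described above (fill in a value, remove its key from errors, return both dicts) might look like the sketch below. It is written as a plain function so that it runs without CKAN; in a real plugin it would be a method on a class implementing IDoi. The metadata key name `creator` and the fallback to the CKAN `maintainer` field are assumptions made for illustration.

```python
def build_metadata_dict(pkg_dict, metadata_dict, errors):
    """Fill in a missing creator from the maintainer field (illustrative)."""
    # Assumption: the creator value is stored under the key 'creator';
    # check metadata.py for the key names the extension really uses.
    if "creator" in errors and pkg_dict.get("maintainer"):
        metadata_dict["creator"] = pkg_dict["maintainer"]
        # Remove the key so it is no longer reported as unhandled.
        del errors["creator"]
    return metadata_dict, errors


# Made-up inputs: the base method could not find an author.
pkg = {"title": "Example dataset", "maintainer": "Data Team"}
metadata, errors = build_metadata_dict(
    pkg,
    {"title": "Example dataset"},
    {"creator": "no author set"},
)
print(metadata["creator"])  # Data Team
print(errors)               # {}
```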

build_xml_dict(metadata_dict, xml_dict)

Breaking changes from v1:

  1. previously called metadata_to_xml
  2. parameters rearranged (previously xml_dict, metadata)

Converts the metadata_dict into an xml_dict that can be passed to the datacite library's schema45.tostring() and schema45.validate() methods. The base method from this extension is run first, then the xml dict is passed through all the implementations of this method.

| Parameter | Description |
|---|---|
| metadata_dict | The original metadata dictionary from which the xml attributes are extracted. |
| xml_dict | The current xml dict, created by the ckanext-doi extension and any previous plugins implementing IDoi. |
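A hedged sketch of an implementation of this method follows, again as a plain function rather than a method on an IDoi plugin class. Several details are assumptions: the `keywords` key in metadata_dict is hypothetical, the `subjects` structure follows the DataCite property naming, and returning the modified xml_dict is assumed rather than stated above.

```python
def build_xml_dict(metadata_dict, xml_dict):
    """Copy keywords from the metadata into the xml dict (illustrative)."""
    # Assumption: keywords, if any, are a list of strings under 'keywords'.
    keywords = metadata_dict.get("keywords") or []
    if keywords:
        xml_dict["subjects"] = [{"subject": keyword} for keyword in keywords]
    # Assumption: the updated dict is handed back to the caller.
    return xml_dict


xml = build_xml_dict({"keywords": ["botany", "specimens"]}, {"publisher": "Natural History Museum"})
print(xml["subjects"])
# [{'subject': 'botany'}, {'subject': 'specimens'}]
```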

Templates

Package citation snippet

{% snippet "doi/snippets/package_citation.html", pkg_dict=g.pkg_dict %}

Resource citation snippet

{% snippet "doi/snippets/resource_citation.html", pkg_dict=g.pkg_dict, res=res %}

Testing

There is a Docker compose configuration available in this repository to make it easier to run tests. The ckan image uses the Dockerfile in the docker/ folder.

The tests can be run against CKAN 2.9.x and 2.10.x on Python 3:

  1. Build the required images:

    docker compose build
  2. Then run the tests. The root of the repository is mounted into the ckan container as a volume by the Docker compose configuration, so you should only need to rebuild the ckan image if you change the extension's dependencies.

    # run tests against ckan 2.9.x
    docker compose run latest
    
    # run tests against ckan 2.10.x
    docker compose run next

Note that the tests mock the DataCite API, so they require neither an internet connection nor your DataCite credentials.
