Zenodo API – Learning

This week, I worked with the Zenodo API. My goal was to add Zenodo metadata to GitHub for the partypositions-wikitags project.

The code of the project is archived at Zenodo with the DOI 10.5281/zenodo.7043510.

Code

import requests

# Ask the DOI resolver for a formatted citation instead of the landing page
headers = {"accept": "text/x-bibliography"}
r = requests.get("https://doi.org/10.5281/zenodo.7043510", headers=headers)

r.text

'Döring, H., &amp; Herrmann, M. (2025). <i>Party positions from Wikipedia tags</i> (Version 25.07) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.7043510'
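The citation comes back with HTML markup (`<i>` tags and `&amp;`). If plain text is needed, a minimal sketch could look like the following. The helper name `citation_to_plain_text` is my own and not part of any API, and the regular expression only assumes simple inline tags like the ones in the response above.

```python
import html
import re


def citation_to_plain_text(citation: str) -> str:
    """Remove simple HTML tags, then decode entities such as &amp;."""
    # Strip tags first so that escaped text like &lt;b&gt; is not removed.
    without_tags = re.sub(r"</?[a-zA-Z][^>]*>", "", citation)
    return html.unescape(without_tags)


# Citation string copied from the response shown above
raw = (
    "Döring, H., &amp; Herrmann, M. (2025). "
    "<i>Party positions from Wikipedia tags</i> (Version 25.07) "
    "[Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.7043510"
)

plain = citation_to_plain_text(raw)
print(plain)
```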

DOI API

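The same DOI also resolves to structured metadata. Let's start with the DOI API, which returns the metadata as CSL JSON when that format is requested in the accept header.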

import requests

doi = "10.5281/zenodo.7043510"

# Request CSL JSON (citation metadata) for the DOI
headers = {"accept": "application/vnd.citationstyles.csl+json"}
r = requests.get(f"https://doi.org/{doi}", headers=headers)

r.json()

{'type': 'book',
 'id': 'https://doi.org/10.5281/zenodo.7043510',
 'language': 'en',
 'author': [{'family': 'Döring', 'given': 'Holger'},
  {'family': 'Herrmann', 'given': 'Michael'}],
 'issued': {'date-parts': [[2025, 7, 31]]},
 'abstract': 'Estimation of party positions from Wikipedia tags with Stan',
 'DOI': '10.5281/ZENODO.7043510',
 'publisher': 'Zenodo',
 'title': 'Party positions from Wikipedia tags',
 'URL': 'https://zenodo.org/doi/10.5281/zenodo.7043510',
 'copyright': 'MIT License',
 'version': '25.07'}
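To work with this response, the nested `author` and `issued` entries usually need a bit of unpacking. Here is a small sketch that uses a trimmed copy of the response above as sample data. The helper names `author_names` and `issued_date` are hypothetical, and the code assumes personal authors with `family` and `given` keys, as in this record.

```python
from datetime import date

# Trimmed copy of the CSL JSON shown above
csl = {
    "author": [
        {"family": "Döring", "given": "Holger"},
        {"family": "Herrmann", "given": "Michael"},
    ],
    "issued": {"date-parts": [[2025, 7, 31]]},
    "title": "Party positions from Wikipedia tags",
    "version": "25.07",
}


def author_names(item: dict) -> list:
    """Return authors as 'Family, Given' (assumes both keys are present)."""
    return [f"{a['family']}, {a['given']}" for a in item.get("author", [])]


def issued_date(item: dict) -> date:
    """Convert CSL 'date-parts' to a date; missing month or day becomes 1."""
    parts = list(item["issued"]["date-parts"][0])
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


print(author_names(csl))
print(issued_date(csl).isoformat())
```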

Zenodo record

We can also access a record through the Zenodo API. This does not require a Zenodo access token.

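# Public records can be read without authentication
r = requests.get("https://zenodo.org/api/records/8275697")
r.raise_for_status()  # stop early if the record could not be fetched
record = r.json()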

record.keys()

dict_keys(['created', 'modified', 'id', 'conceptrecid', 'doi', 'conceptdoi', 'doi_url', 'metadata', 'title', 'links', 'updated', 'recid', 'revision', 'files', 'swh', 'owners', 'status', 'stats', 'state', 'submitted'])

I was interested in the Zenodo metadata.

record['metadata']

{'title': 'Party positions from Wikipedia tags (July 2023)',
 'doi': '10.5281/zenodo.8275697',
 'publication_date': '2023-08-23',
 'description': '<p>Estimation of party positions from Wikipedia tags with Stan (July 2023)</p>',
 'access_right': 'open',
 'creators': [{'name': 'Holger Döring',
   'affiliation': 'GESIS – Leibniz Institute for the Social Sciences'},
  {'name': 'Michael Herrmann', 'affiliation': 'University of Konstanz'}],
 'related_identifiers': [{'identifier': 'https://github.com/hdigital/partypositions-wikitags/tree/23.07',
   'relation': 'isSupplementTo',
   'scheme': 'url'}],
 'version': '23.07',
 'resource_type': {'title': 'Software', 'type': 'software'},
 'license': {'id': 'other-open'},
 'relations': {'version': [{'index': 1,
    'is_last': False,
    'parent': {'pid_type': 'recid', 'pid_value': '7043510'}}]}}
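The `relations` entry links this version to its parent record: the `pid_value` 7043510 matches the project DOI 10.5281/zenodo.7043510, and `is_last` is False because this is not the most recent version. A sketch for reading these two values, using a trimmed copy of the metadata above, might look like this. The function name `parent_and_latest` is my own, and the code assumes the `relations` layout shown in this response, which may not be present in every record.

```python
from typing import Optional, Tuple

# Trimmed copy of the record metadata shown above
metadata = {
    "version": "23.07",
    "relations": {
        "version": [
            {
                "index": 1,
                "is_last": False,
                "parent": {"pid_type": "recid", "pid_value": "7043510"},
            }
        ]
    },
}


def parent_and_latest(metadata: dict) -> Optional[Tuple[str, bool]]:
    """Return (parent record ID, is latest version), or None if absent."""
    versions = metadata.get("relations", {}).get("version", [])
    if not versions:
        return None
    entry = versions[0]
    return entry["parent"]["pid_value"], entry["is_last"]


print(parent_and_latest(metadata))
```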

GitHub metadata

Some of the metadata is imported by Zenodo from GitHub, and some metadata needs to be added or updated manually.

You can specify some of the additional metadata in a .zenodo.json file in your GitHub repository.

https://developers.zenodo.org/#github

I used the archived Zenodo record for a JSON dump of the metadata.

import json

print(json.dumps(record['metadata'], indent=2))

{
  "title": "Party positions from Wikipedia tags (July 2023)",
  "doi": "10.5281/zenodo.8275697",
  "publication_date": "2023-08-23",
  "description": "<p>Estimation of party positions from Wikipedia tags with Stan (July 2023)</p>",
  "access_right": "open",
  "creators": [
    {
      "name": "Holger D\u00f6ring",
      "affiliation": "GESIS \u2013 Leibniz Institute for the Social Sciences"
    },
    {
      "name": "Michael Herrmann",
      "affiliation": "University of Konstanz"
    }
  ],
  "related_identifiers": [
    {
      "identifier": "https://github.com/hdigital/partypositions-wikitags/tree/23.07",
      "relation": "isSupplementTo",
      "scheme": "url"
    }
  ],
  "version": "23.07",
  "resource_type": {
    "title": "Software",
    "type": "software"
  },
  "license": {
    "id": "other-open"
  },
  "relations": {
    "version": [
      {
        "index": 1,
        "is_last": false,
        "parent": {
          "pid_type": "recid",
          "pid_value": "7043510"
        }
      }
    ]
  }
}

Then, I manually cleaned up the metadata:

kept only entries that are not imported automatically
used Unicode characters for umlauts
specified the license ID
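The first two steps could also be scripted. The sketch below keeps a chosen set of keys and uses `ensure_ascii=False` so that `json.dumps` writes "ö" instead of the `\u00f6` escape seen in the dump above. The `KEEP` selection is my assumption about which entries to retain, and the sample data is a trimmed copy of the record metadata.

```python
import json

# Trimmed copy of the record metadata shown above
record_metadata = {
    "title": "Party positions from Wikipedia tags (July 2023)",
    "doi": "10.5281/zenodo.8275697",
    "creators": [
        {
            "name": "Holger Döring",
            "affiliation": "GESIS – Leibniz Institute for the Social Sciences",
        }
    ],
    "version": "23.07",
}

# Hypothetical selection of keys to keep
KEEP = ("title", "description", "creators")

trimmed = {key: record_metadata[key] for key in KEEP if key in record_metadata}

# ensure_ascii=False keeps umlauts and dashes as readable Unicode characters
print(json.dumps(trimmed, indent=2, ensure_ascii=False))
```

The titles, names and description still need manual editing afterwards, as in the cleaned version below.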

metadata_json = """
{
  "title": "Party positions from Wikipedia tags",
  "description": "Estimation of party positions from Wikipedia tags with Stan",
  "creators": [
    {
      "name": "Döring, Holger",
      "affiliation": "GESIS – Leibniz Institute for the Social Sciences"
    },
    {
      "name": "Herrmann, Michael",
      "affiliation": "University of Konstanz"
    }
  ]
}
"""

Here, I validate the JSON. Running the cell should not raise an error.

# Parse into a new name so the cell can be re-run without a TypeError
metadata = json.loads(metadata_json)
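`json.loads` raises a `json.JSONDecodeError` on malformed input. For a slightly friendlier check, a sketch like the one below reports where the JSON breaks and also checks that every creator has a name. The function `load_zenodo_metadata` and its checks are my own illustration, not an official Zenodo validation.

```python
import json


def load_zenodo_metadata(text: str) -> dict:
    """Parse metadata JSON and run a few basic sanity checks."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        ) from err
    if not isinstance(data, dict):
        raise ValueError("Top-level JSON value must be an object")
    for creator in data.get("creators", []):
        if "name" not in creator:
            raise ValueError(f"Creator entry without a name: {creator!r}")
    return data


valid = '{"title": "Party positions from Wikipedia tags", "creators": [{"name": "Döring, Holger"}]}'
broken = '{"title": "Party positions from Wikipedia tags",}'  # trailing comma

print(load_zenodo_metadata(valid)["title"])

try:
    load_zenodo_metadata(broken)
except ValueError as err:
    print(err)
```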

Finally, I added it to the GitHub repository – commit ac7c462 😊