Skip to content

Python script for calling the Digital Public Library of America (DPLA) API and downloading JSON-LD data. Currently configured to download data based on collection titles

License

Notifications You must be signed in to change notification settings

jlynch2121/dpla-api-json-downloader

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 

Repository files navigation

dpla-api-json-downloader

This calls the DPLA-API, using provider titles provided in a CSV and downloads JSON-LD metadata for those providers.

For best results, provider titles listed in the CSV should include plus signs ('+') instead of spaces between words, i.e. 'dataProvider+title' rather than 'dataProvider title'.

To run this script, provide appropriate values for variables near beginning of program, including:

  • folder name into which JSON data will be saved
  • API key for querying the DPLA API
  • Fields you want provided
  • ID of a Service Hub
  • Name of the csv file with provider titles. A csv including all your hub's provider names can be downloaded from the DPLA Analytics Dashboard.

Run script in same directory as the csv file with provider names.

Note: for the API request, it is not necessary to include 'provider.@id' but this helps to insure data is from the correct service hub in case there are multiple entities of the same name across DPLA's data. The default request will provide:

  • sourceResource.title: the title of the item
  • id: the items DPLA ID
  • isShownAt: the item's local ID dataProvider: the contributing institution
  • sourceResource.collection.title: the title of the collection of which the record is a part
  • originalRecord.type: Deprecated. original Type metadata value harvested by the DPLA. After Ingestion 3, entire original records are provided as a single string and code for extracting individual fields and values will depend on the format of the hub's original records
  • sourceResource.type: Type metadata value created by the DPLA which should appear in DPLA search interfaces

Script was built and run on a machine with Python 3.7. Has not been tested on earlier versions of Python

About

Python script for calling the Digital Public Library of America (DPLA) API and downloading JSON-LD data. Currently configured to download data based on collection titles

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages