Feature Request: Improve MDC Citation API scaling

**Overview of the Feature Request**
The current /api/admin/makeDataCount/{id}/updateCitationsForDataset endpoint is often called in batch mode for all datasets in quick succession, but it neither queues requests nor throttles them to stay within DataCite's rate limits. It also reads the whole event report into memory before processing it, which is problematic for datasets with many files, since most of the report consists of the hasPart relationships that this call ignores. The endpoint should be improved so that it works well with larger datasets and on larger instances.
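To make the two asks above concrete, here is a rough sketch of the general idea in Python. It is not the Dataverse implementation and not the DataCite API: `Throttle`, `citation_events`, `update_all`, the `fetch_pages` callback, and the `"relation"` field name are all hypothetical placeholders. It only illustrates spacing out outbound requests and filtering events page by page instead of holding the full report in memory.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List, Optional


class Throttle:
    """Enforce a minimum interval between successive calls (hypothetical helper)."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def citation_events(
    pages: Iterable[List[dict]], skip_relation: str = "hasPart"
) -> Iterator[dict]:
    """Yield events lazily, one page at a time, dropping skip_relation events.

    The "relation" key is a placeholder, not the real report schema.
    """
    for page in pages:
        for event in page:
            if event.get("relation") != skip_relation:
                yield event


def update_all(
    dataset_ids: Iterable[str],
    fetch_pages: Callable[[str], Iterable[List[dict]]],
    throttle: Throttle,
) -> Dict[str, int]:
    """Process datasets one after another, throttled; return kept-event counts."""
    counts: Dict[str, int] = {}
    for dataset_id in dataset_ids:
        throttle.wait()  # space out requests to the upstream service
        counts[dataset_id] = sum(1 for _ in citation_events(fetch_pages(dataset_id)))
    return counts
```

If `fetch_pages` is itself a generator that requests one page at a time, only a single page is held in memory at once, and the hasPart entries are discarded as soon as they are seen. A real fix would presumably also need a queue for concurrent callers and handling of rate-limit responses, which this sketch leaves out.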

**What kind of user is the feature intended for?**
(Example user roles: API User, Curator, Depositor, Guest, Superuser, Sysadmin)


**What inspired the request?**
QDR is seeing site performance issues during the weekly cron job that runs the counter_weekly.sh script, which calls this API for every dataset.

**What existing behavior do you want changed?**


**Is there any brand new behavior you want to add to Dataverse?**


**Any open or closed issues related to this feature request?**

**Are you thinking about creating a pull request for this feature?**  
Help is always welcome. Is this feature something you or your organization plans to implement?
PR in progress.


Feature Request: Improve MDC Citation API scaling #11777
