IQSS/11777 improve MDC citation api scaling #11781
Merged
ofahimIQSS
merged 12 commits into
IQSS:develop
from
QualitativeDataRepository:IQSS/11777-improve_MDC_Citation_api_scaling
Oct 17, 2025
+279
−32
Contributor
@qqmyers Could you resolve the conflict? I'm not able to push the fix back.
IQSS/11777-improve_MDC_Citation_api_scaling
stevenwinship
approved these changes
Sep 16, 2025
…DC_Citation_api_scaling
Contributor
Looks good to me - merging
Labels
FY26 Sprint 5
FY26 Sprint 5 (2025-08-27 - 2025-09-10)
FY26 Sprint 6
FY26 Sprint 6 (2025-09-10 - 2025-09-24)
FY26 Sprint 7
FY26 Sprint 7 (2025-09-24 - 2025-10-08)
FY26 Sprint 8
FY26 Sprint 8 (2025-10-08 - 2025-10-22)
Size: 3
A percentage of a sprint. 2.1 hours.
What this PR does / why we need it: This PR makes the /api/admin/makeDataCount/{id}/updateCitationsForDataset call asynchronous, adds a queue that processes requests to it serially, adds an optional minimum delay between the DataCite API calls triggered by these requests (to avoid hitting DataCite's rate limit), and reduces memory use for datasets with many files. Together, these changes can help avoid performance issues and failures when periodically updating citations for many datasets (see the counter_weekly.sh script and the MDC guides info).
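As a rough illustration of the queueing behavior described above, here is a minimal Python sketch, not the PR's actual Java code. The class name `SerialCitationQueue`, the `update_fn` callback, and the 200 return value (standing in for the "OK" response) are assumptions made for this example; the sketch also applies the delay between whole dataset updates rather than between individual DataCite calls.

```python
import queue
import threading
import time


class SerialCitationQueue:
    """Hypothetical bounded queue drained by a single worker thread."""

    def __init__(self, update_fn, max_queued=1000, min_delay_s=0.0):
        self._update_fn = update_fn          # assumed callback: updates one dataset
        self._min_delay_s = min_delay_s      # optional pause between updates
        self._queue = queue.Queue(maxsize=max_queued)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, dataset_id):
        """Queue a request without blocking; 200 if accepted, 503 if full."""
        try:
            self._queue.put_nowait(dataset_id)
            return 200
        except queue.Full:
            return 503

    def _run(self):
        # One worker means requests are processed serially, in arrival order.
        while True:
            dataset_id = self._queue.get()
            try:
                self._update_fn(dataset_id)
            except Exception:
                pass  # a real service would log the failure
            finally:
                self._queue.task_done()
            time.sleep(self._min_delay_s)

    def wait_until_idle(self):
        """Block until every accepted request has been processed."""
        self._queue.join()
```

Because `submit` returns immediately, the caller learns only whether the request was accepted, not whether the citation update succeeded, which mirrors the backward-compatibility note about the call now being async.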
Which issue(s) this PR closes:
Special notes for your reviewer: The PR adds an executor service configured with a queue size of 1000, which is probably sufficient for most installations. For Harvard, you'd definitely want this PR, but you'll also need to either increase the queue size (to the number of datasets you want to check) or modify the counter_weekly.sh script to make calls in smaller batches, or to watch for 503 responses and throttle sending new requests.
Suggestions on how to test this: Testing is somewhat tricky because the API call updates citations for datasets, and test datasets probably won't have any. You can run the counter_weekly.sh script on an instance with many datasets (up to 1k) and verify that you get OK responses in the counter_weekly.sh logging and no errors in the Dataverse log.
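The client-side options mentioned above (smaller batches, or backing off on 503) could look roughly like the following Python sketch. It is not taken from counter_weekly.sh; the function name `submit_with_throttle`, the caller-supplied `send` function, and the default batch size, pause, and retry values are all hypothetical.

```python
import time


def submit_with_throttle(dataset_ids, send, batch_size=500, pause_s=60.0,
                         max_retries=5, sleep=time.sleep):
    """Send update requests in batches, backing off when the server says 503.

    send(dataset_id) is an assumed caller-supplied function that issues the
    request and returns the HTTP status code. Returns the IDs that were still
    rejected with 503 after max_retries retries.
    """
    failed = []
    for start in range(0, len(dataset_ids), batch_size):
        for dataset_id in dataset_ids[start:start + batch_size]:
            for attempt in range(max_retries + 1):
                if send(dataset_id) != 503:
                    break  # accepted (or failed for a reason retrying won't fix)
                if attempt < max_retries:
                    sleep(pause_s * (attempt + 1))  # linear backoff
            else:
                failed.append(dataset_id)  # every attempt returned 503
        if start + batch_size < len(dataset_ids):
            sleep(pause_s)  # give the server time to drain its queue
    return failed
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; in practice the pause would be tuned to how quickly the server works through its queue.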
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?: added
Additional documentation: added; it notes the backward-compatibility change, since the call is now async.