feat(#10794): adds replication failure log#10823
feat(#10794): adds replication failure log#10823dianabarsan wants to merge 26 commits intomasterfrom
Conversation
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
# Conflicts: # api/tests/mocha/controllers/replication-limit-log.spec.js # api/tests/mocha/controllers/users.spec.js
Signed-off-by: Diana Barsan <[email protected]>
… log controller Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
# Conflicts: # api/src/middleware/authorization.js # api/src/services/replication/bulk-docs.js # api/tests/mocha/controllers/users.spec.js # api/tests/mocha/middleware/authorization.spec.js # tests/integration/api/controllers/replication.spec.js
Signed-off-by: Diana Barsan <[email protected]>
… for replication failure log service Signed-off-by: Diana Barsan <[email protected]>
# Conflicts: # tests/integration/api/controllers/replication.spec.js
Signed-off-by: Diana Barsan <[email protected]>
…ctionality Signed-off-by: Diana Barsan <[email protected]>
… comments for nouveau behavior in k3d Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
|
I know your review queue is long, but i have to ask for another one. I appreciate the time you spend on this review. |
jkuester
left a comment
There was a problem hiding this comment.
Very nice! I left a bunch of random thoughts inline.
Have you thought at all about if we should report a total value of these failures via the /monitoring api? Maybe even just a count of the distinct users that have replication failures.
a million times yes!! But also, I can wait 'til another iteration if it delays this PR/is too much feature creep. |
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
…d enhance pagination with cursor validation Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
Signed-off-by: Diana Barsan <[email protected]>
|
Hi @jkuester A bunch of changes occured, mostly as part of your review feedback:
I hope I didn't miss anything. Review at your discretion. Appreciate your time and thoughts! |
jkuester
left a comment
There was a problem hiding this comment.
Couple minor suggestions/thoughts, but this is good to go! 👍 LGTM
| app.get( | ||
| '/api/v1/initial-replication/get-ids', | ||
| authorization.handleAuthErrors, | ||
| authorization.onlineUserPassThrough, | ||
| replication.getDocIds, | ||
| ); |
| * summary: Get replication failure logs | ||
| * operationId: v1ReplicationFailureLogsGet | ||
| * description: > | ||
| * Returns a paginated page of full replication failure log documents. Optionally filter by `user` |
There was a problem hiding this comment.
| * Returns a paginated page of full replication failure log documents. Optionally filter by `user` | |
| * Returns a page of replication failure log documents. Optionally filter by `user` |
| // Counts are only set on userCtx as each phase of the request completes. When a count is missing | ||
| // we record 'unknown' instead of omitting the key, so a stable shape on the failure entry tells you | ||
| // (by which counters are 'unknown') how far the request progressed before failing. |
There was a problem hiding this comment.
Don't have to change anything here, but setting unknown in these cases does not seem to make it any easier for consumer code to parse the data from the doc when compared to setting:
"",nullundefined
I kind of understand wanting to keep the object shape "stable", but I am not really sure how much value there is in that when we don't have any meaningful default value to use... 🤷
| const oldestPeriod = await findOldestReportingPeriod(); | ||
| if (!oldestPeriod) { | ||
| return { data: [], cursor: null }; | ||
| } | ||
|
|
||
| // Username is the doc-id suffix, preventing prefix-scan; bulk-fetch the user's candidate keys (max 60). | ||
| const candidateKeys = enumerateReportingPeriods(oldestPeriod).map(period => getDocId(user, period)); | ||
| const result = await db.medicLogs.allDocs({ keys: candidateKeys, include_docs: true }); |
There was a problem hiding this comment.
Is there a strong reason we cannot just index these docs by user? This is kind of a crazy amount of logic to have/maintain (even if the performance is okay). It is fancy that all of this actually works, but at the same time it really smells like a premature optimization.
| const getUserFailureLog = async (username) => { | ||
| const response = await getFailureLogs({ user: username }); | ||
| const currentPeriodId = getFailureLogId(username); | ||
| return response.data.find(log => log._id === currentPeriodId) || response.data[0] || null; |
There was a problem hiding this comment.
Minor, but the currentPeriodId logic here feels like surprizing behavior that might be confusing to debug later. (When we call this function are we expecting a doc for the current period or are we expecting to just get back the first one? How can someone reading the code know?) Would rather just have two dedicated functions...

Description
replicationdedicated folder. The list became huge!/api/v1/replication/get-ids, matching the existing setup on the initial-replication routecloses #10794
AI Disclosure
This is developed with the help of Claude. I have made all design decisions, reviewed and corrected all code.
Code review checklist
can_view_old_navigationpermission to see the old design. Test it has appropriate design for RTL languages.Compose URLs
If Build CI hasn't passed, these may 404:
License
The software is provided under AGPL-3.0. Contributions to this project are accepted under the same license.