[BEAM-10990] Elasticsearch response filtering and [BEAM-5172] Tries to reduce ES UTest flakiness#15381
Conversation
|
Run Java PreCommit |
|
Friendly bump. Anyone able to review? |
|
I'm sorry I haven't got time to complete a review, but I have read through the changes once, and it all looks well structured and has good attention to detail on comments, tests etc. The area I can't comment on immediately is the adapter and windowing behavior - if someone could confirm that section looks reasonable I think it looks good to merge. |
|
Ya I agree that maintaining a record of timestamps and using the context adapter is a lot of complexity. The concept was borrowed heavily from this RedisIO PR: #5841 I'm interested in investigating how much work would be involved in completing BEAM-1287 to allow |
|
Thanks @egalpin @echauchot - if you have time, could you please take a look at the section of changes in ElasticsearchIO from lines 2346 onwards? I'm just not familiar enough with the windowing to verify. The rest of the changes seem very reasonable. |
|
@timrobertson100 sure ! thanks for taking a look ! |
|
Thanks @timrobertson100 for having a look! @echauchot I’m definitely keen to understand if the strategy in the PR will work and its windowing implications, and would be happy to learn of alternative approaches! |
|
@egalpin sorry for the late review, taking a look now |
|
Thanks for having a look @echauchot! And no problem, I’ve been swamped with work and unable to address the flaky tests so I completely understand! |
|
Oh I had also forgotten about an alternative solution that would have less complexity, but I’m not sure how it might or might not adhere to best practices as a sink in the system. We could instead combine inputs globally and forget about maintaining windows altogether. I believe this is what the BQ Write method does, and I think that not maintaining windows in a sink is generally sane 🤷♂️ Thoughts? |
echauchot
left a comment
There was a problem hiding this comment.
Hi Evan, thanks a lot for your work !
I have some minor change requests around wording but a major concern with the complexity of having to deal with windows. It should be transparent to an IO. I think you can merge input doc and response globally and do not deal with windows.
|
Thanks for the feedback @echauchot! I wasn't sure if re-windowing within an IO was acceptable but it sounds like that's the path to go. It's way less complex, and I definitely like that. I'll make those changes |
@egalpin my pleasure ! I mean: you should not change the windows of the elements. In fact, you should not deal with the windows at all. Your problem is to join the input elements with the status (json and error) of the write. You could do:
|
|
Thanks for clarifying, that makes sense 👍 My original intent was to leave windows alone entirely and just output without modification. It’s worth noting that the window each element belongs to should be left unmodified, and that’s why the complexity is present. The challenge arose from the use of FinishBundle, where the output method requires explicit use of a BoundedWindow. So all of the window collection/multimap/adapter complexity is all done so that we could output any buffered elements when FinishBundle is called. If FinishBundle could accept a MultiOutput or OutputReceiver, I believe this would be solved neatly. I just didn’t want to gate this change on FinishBundle changes, but maybe that’s the right path? |
|
@egalpin I don't get it. Neither BulkIOBaseFn#finishBundle() nor ElasticsearchIO#flushbatch() expect windowing information. |
|
@echauchot It's highly possible that I have a fundamental misunderstanding about how to handle window data. I'm going to try to outline my goals and challenges, and take the proposed implementation out of focus. Goals:
Challenges:
I got a bit stuck on that last point. My impression was that in order to ensure buffered docs' results were output, and in order to leave those elements' windows unaltered, I needed to keep track of the windows to which those elements belong so that they could be explicitly passed to FinishBundleContext#output. I'd definitely be keen to learn more about how to handle windows and challenge my assumptions here. Thanks for your time @echauchot in reviewing and teaching. |
Just to write here what we discussed privately yesterday (Apache way: what did not happen publicly did not happen at all):
|
echauchot
left a comment
There was a problem hiding this comment.
Overall windowing management and flush* methods look good to me, please address the minor comments and we will be good to merge
|
Nice ! |
echauchot
left a comment
There was a problem hiding this comment.
Thanks for your work Evan ! Almost LGTM. One final thing about adding a test that covers the case that you indicated about PDone. Please also squash as explained, that way we will avoid a round trip and I could directly merge when the above UTest is done.
bea212b to
f7691e6
Compare
|
Pre commit failure seems to indicate that missing cache entry was the cause 🤔
I'll try re-running and then will dig in further if persistent. |
|
Run Java PreCommit |
echauchot
left a comment
There was a problem hiding this comment.
LGTM ! Perfect work and fluid communication as always, thanks Evan !
Merging
|
Thanks for your review efforts and insights @echauchot, it was a pleasure as always 🎉 |
|
I believe the flushing logic has a bug where we are outputting to the wrong window: https://issues.apache.org/jira/browse/BEAM-14064 |
|
Should be fixed by this PR |
Adds the ability to prevent infinite retries with non-transient Elasticsearch write failures by providing a user-configurable setting called (for now, I'm not certain the name is perfect)
withThrowWriteFailures, which istrueby default to maintain backward compatibility.If
withThrowWriteFailuresis set tofalse, the response from Elasticsearch Bulk API will be used to capture the result of persisting a document (or deleting it). The order of Bulk API response is guaranteed to be in the same order as the Bulk API request[1], so we can stitch together what the write result of a given input document was.This PR introduces a new class
WriteSummary. This class is used to collect the context of a document as it passes from raw input document (i.e. the PCollection being sent to ElasticsearchIO.Write/ElasticsearchIO.DocToBulk), the Bulk Directive which resulted from transform settings and the input document (ex. delete directive, upsert, scripted upsert, etc), as well as whether the document resulted in an error from ES. By maintaining all the context/ancestry of the document, users can get at the input document and its resulting error, for example. Without maintaining the ancestry, we could only return the Bulk Directive which would be less helpful in many cases.[1] https://discuss.elastic.co/t/ordering-of-responses-in-the-bulk-api/13264
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username).[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
ValidatesRunnercompliance status (on master branch)Examples testing status on various runners
Post-Commit SDK/Transform Integration Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.