Buffer append output results + fix extra incorrect results#1283
Buffer append output results + fix extra incorrect results#1283beriaanirudh wants to merge 15 commits intoapache:masterfrom
Conversation
| * could be called in PENDING state as well | ||
| */ | ||
| if ($scope.paragraph.id === data.paragraphId && | ||
| ($scope.paragraph.status === 'RUNNING' || $scope.paragraph.status === 'PENDING')) { |
There was a problem hiding this comment.
Considering the Comment above, why would we accept PENDING state?
There was a problem hiding this comment.
What I observed was that in between the pending and the running state, some (I observed 1) append-output events were called. If we miss those, then while the paragraph is running, we would see that some intial-result line/s is/are missing.
Also, when the para execution is finished, the complete note is sent again (with results), so we would see the correct result.
There was a problem hiding this comment.
I see, the comment was a bit misleading, as it seems it was saying that it could be errorneously call es as well in PENDING state.
There was a problem hiding this comment.
Ya, when I re-read it sounds that way.. Re-phrased it.
| + " and send paragraph append data"); | ||
| thread = new Thread(new AppendOutputRunner()); | ||
| thread.start(); | ||
| } |
There was a problem hiding this comment.
That's not thread-safe. You should use synchronized or something else
|
I think you'd better use |
|
There're some |
|
It is easier to handle the twice output on the javascript side, so I have added the check in file |
1. Synchronize on AppendOutputRunner creation 2. Use ScheduledExecutorService instead of while loop 3. Remove Thread.sleep() from tests
|
@jongyoul: I have incorported all feedback:
|
| prepareInvocationCounts(listener); | ||
| AppendOutputRunner.setListener(listener); | ||
| CheckAppendOutputRunner.startRunnerForUnitTests(); | ||
| while(numInvocations != numTimes); |
There was a problem hiding this comment.
I suggest you add Circuit Breaker with few seconds to avoid running infinitely.
There was a problem hiding this comment.
Done.. Added a check for to fail the tests after 2 seconds.. Would have to add CircuitBreaker in dependencies, so just using System times.
|
My tests needed some fixes to work with other tests. I have fixed them. But the build still has some errors I think not related to my change. https://travis-ci.org/apache/zeppelin/builds/150876902 . |
|
@jongyoul @corneadoug I have incorporated all feedback, and added comments on queries. This PR is ready for review. |
|
I've a few comment on your comments. Please check it. |
|
Tested a few cases to check that the front-end still handle most of the cases, and its good |
| private static final Long SAFE_PROCESSING_STRING_SIZE = new Long(100000); | ||
|
|
||
| private static final BlockingQueue<AppendOutputBuffer> QUEUE = | ||
| new LinkedBlockingQueue<AppendOutputBuffer>(); |
There was a problem hiding this comment.
This dynamic data structure does not feel like a constant, so accouding to the project styleguide#s5.2.4-constant-names) better be rather named queue.
|
@corneadoug thanks for verifying. |
|
@corneadoug @jongyoul @bzz this is ready for review. |
| public void run() { | ||
|
|
||
| Map<String, Map<String, StringBuilder> > noteMap = | ||
| new HashMap<String, Map<String, StringBuilder> >(); |
There was a problem hiding this comment.
Probably a nitpick, but 'diamond operator' should come handy
Map<String, Map<String, StringBuilder> > noteMap = new HashMap<>();|
@corneadoug @jongyoul @bzz this is ready for review. |
|
@beriaanirudh You need to change more where you don't use 'diamond operator'. Can you fix them? |
|
@jongyoul done. Used diamond operator at all places except for 1 where compiler wouldn't allow (AppendOutputRunner.java:78) |
|
@beriaanirudh thank you for prompt fixes! |
|
+1 Looking forward to this |
|
LGTM |
|
@corneadoug @bzz this is ready for review... |
|
The behaviour is still the same with: |
|
@corneadoug yes, that streams output every one seconds if i try the master branch. However, I did some more experiments and found Output streaming : master branch (X), this pullrequest (X) Output streaming : master branch (O), this pullrequest (X) @corneadoug @beriaanirudh Could you verify these cases, too? |
|
I tested both cases: However the first time I tried this branch and commented, the Case 2 was fine in this Branch |
|
According to the test results, %sh interpreter output streaming might be other issue. @beriaanirudh Could you take care of it? |
|
Hey, I think SparkInterpreter could have been broken which got fixed today in |
|
Don't know why sometimes we get some failure in that branch, Case1: master (X), this branch (X) |
|
Thanks for the verification @corneadoug . Even I am not able to reproduce it. My best guess is that it had to do something with the fix I mentioned above. |
|
@corneadoug @Leemoonsoo please let me know if any action-item on my side. |
|
Tested again and got Case2 works fine on this branch. Thanks @beriaanirudh for the contribution! |
There are 2 issues and their proposed fixes: 1. On a paragraph run, for every line of output, there is a broadcast of the new line from zeppelin. In case of thousands of lines of output, the browser/s would hang because of the volume of these append-output events. 2. In the above case, besides the browser-hang, another bug observed is that result data is will repeated twice (coming from append-output calls + finish-event calls). The proposed solution for apache#1 is: - Buffer the append-output event into a queue instead of sending the event immediately. - In a separate thread, read from the queue periodically and send the append-output event. Solution for apache#2 is: - Donot append output to result if the paragraph is not runnig. Improvement + Bug Fix https://issues.apache.org/jira/browse/ZEPPELIN-1292 The test could be to run a simple paragraph with large result. Eg: ``` %sh for i in {1..10000} do echo $i done ``` PS: One will need to clear browser cache between running with and without this code patch since there are javascript changes as well. * Does the licenses files need update? No * Is there breaking changes for older versions? No * Does this needs documentation? It could need for the design. Otherwise I have added code comments explaining behaviour. Author: Beria <[email protected]> Closes apache#1283 from beriaanirudh/ZEPPELIN-1292 and squashes the following commits: 17f0524 [Beria] Use diamond operator 7852368 [Beria] nit 4b68c86 [Beria] fix checkstyle d168614 [Beria] Remove un-necessary class CheckAppendOutputRunner 2eae38e [Beria] Make AppendOutputRunner non-static 72c316d [Beria] Scheduler service to replace while loop in AppendOutputRunner 599281f [Beria] fix unit tests that run after dd24816 [Beria] Add license in test file 3984ef8 [Beria] fix tests when ran with other tests 1c893c0 [Beria] Add licensing 1bdd669 [Beria] fix javadoc comment 27790e4 [Beria] Avoid infinite loop in tests 5057bb3 [Beria] Incorporate feedback 1. Synchronize on AppendOutputRunner creation 2. Use ScheduledExecutorService instead of while loop 3. Remove Thread.sleep() from tests 82e9c4a [Beria] Fix comment 7020f0c [Beria] Buffer append output results + fix extra incorrect results (cherry picked from commit 11becde)


What is this PR for?
There are 2 issues and their proposed fixes:
The proposed solution for #1 is:
Solution for #2 is:
What type of PR is it?
Improvement + Bug Fix
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1292
How should this be tested?
The test could be to run a simple paragraph with large result. Eg:
PS: One will need to clear browser cache between running with and without this code patch since there are javascript changes as well.
Screenshots (if appropriate)
Questions:
No
No
It could need for the design. Otherwise I have added code comments explaining behaviour.