Skip to content

Conversation

@saurabhd336
Copy link
Contributor

@saurabhd336 saurabhd336 commented Jul 12, 2022

Save task generator info into an inmemory map, and provide a controller API to fetch the details for a given table
design doc: https://docs.google.com/document/d/1bCn13k57WLSaIAQGfATP75VS3p6V_ievlBi_HoOtmZ8/edit?usp=sharing

@saurabhd336 saurabhd336 changed the title Task generator meta api Task generator meta api + save task generator info to ZK Jul 12, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jul 12, 2022

Codecov Report

Merging #9043 (acb494a) into master (9720d55) will decrease coverage by 41.40%.
The diff coverage is 44.76%.

❗ Current head acb494a differs from pull request most recent head a66a59e. Consider uploading reports for the commit a66a59e to get more accurate results

@@              Coverage Diff              @@
##             master    #9043       +/-   ##
=============================================
- Coverage     70.08%   28.67%   -41.41%     
+ Complexity     4740       51     -4689     
=============================================
  Files          1831     1823        -8     
  Lines         96293    96040      -253     
  Branches      14392    14366       -26     
=============================================
- Hits          67484    27538    -39946     
- Misses        24146    65932    +41786     
+ Partials       4663     2570     -2093     
Flag Coverage Δ
integration1 26.58% <44.76%> (+0.05%) ⬆️
integration2 24.68% <41.90%> (-0.15%) ⬇️
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...he/pinot/common/utils/helix/FakePropertyStore.java 0.00% <0.00%> (-56.53%) ⬇️
...roller/api/resources/PinotTaskRestletResource.java 3.87% <0.00%> (-0.11%) ⬇️
...che/pinot/common/minion/BaseTaskGeneratorInfo.java 20.00% <20.00%> (ø)
...controller/helix/core/minion/PinotTaskManager.java 33.52% <37.93%> (-33.77%) ⬇️
.../common/minion/TaskGeneratorMostRecentRunInfo.java 45.71% <45.71%> (ø)
.../common/minion/InMemoryTaskManagerStatusCache.java 65.21% <65.21%> (ø)
...he/pinot/common/minion/TaskManagerStatusCache.java 100.00% <100.00%> (ø)
...apache/pinot/controller/BaseControllerStarter.java 77.95% <100.00%> (-4.81%) ⬇️
.../java/org/apache/pinot/spi/utils/BooleanUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/spi/trace/BaseRecording.java 0.00% <0.00%> (-100.00%) ⬇️
... and 1264 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9720d55...a66a59e. Read the comment docs.

Copy link
Contributor

@mcvsubbu mcvsubbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us clear PR 9011 first

@npawar
Copy link
Contributor

npawar commented Jul 12, 2022

Let us clear PR 9011 first

we might have to continue the discussion here instead, because the dev doing 9011 is on a long leave.

@mcvsubbu
Copy link
Contributor

Let us clear PR 9011 first

we might have to continue the discussion here instead, because the dev doing 9011 is on a long leave.

OK, then let us close the other PR, and continue the discussion here.

@mcvsubbu
Copy link
Contributor

I am not for using ZK as a log sink. There are metrics, there are logs for each individual service, there are also ways of getting exceptions like we do on the server (save the last N exceptions and return with the status).

Copy link
Contributor

@mcvsubbu mcvsubbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I mentioned, let us not add new Znodes just to indicate logs.

@npawar
Copy link
Contributor

npawar commented Jul 13, 2022

I am not for using ZK as a log sink. There are metrics, there are logs for each individual service, there are also ways of getting exceptions like we do on the server (save the last N exceptions and return with the status).

Yea keeping in memory (like consumer errors) sounds like a fine first approach. I'm also in favor of starting off with that, and moving to zk if need for persistence arises based on user feedback (or exploring the more elaborate solutions at that time)

@saurabhd336 saurabhd336 changed the title Task generator meta api + save task generator info to ZK Task generator debug api Jul 13, 2022
@saurabhd336
Copy link
Contributor Author

I've updated the PR to reflect the discussion. Mainly moved to an inmemory map for storing the debug data. If need be, we can easily move to ZK based storage by writing another implementation of the TaskManagerStatusCache interface

@mcvsubbu
Copy link
Contributor

In general, use alerts and logs to debug problems. If you really want to persist problem state over time, the approach I prefer is to sprinkle code all over pinot (similar to metrics) that emit some events to a stream underneath. This stream can be fielded by Pinot into a pinot table, and used to debug the pinot cluster via queries: "How many minion jobs failed last week vs previous week?", "Which table has the most errors loading a segment?" etc.

@npawar, @kishoreg and I have discussed before about introducing new znodes for new features (let alone debugging), and came to the conclusion that it is hard to clean up things afterwards.

Anyways, since we are concluding to go with keeping stuff in memory and using APIs to debug, I suggest you put out an issue and add some interface definitions there (or a doc if you choose).

Please close thsis PR and begin a new one

@saurabhd336
Copy link
Contributor Author

Updated PR description with design doc detailing the goals and design for this feature

@saurabhd336 saurabhd336 force-pushed the taskGenratorMetaApi branch from cb61f54 to a66a59e Compare July 14, 2022 11:55
@mcvsubbu
Copy link
Contributor

@saurabhd336 I cannot add comments to your design doc. Also, please close this PR and add your design doc to the issue (create one if there is none).

Thanks

If you are not modifying zk this may be simple enough that you don't need a design doc, but since you already have one, that is fine.

@saurabhd336
Copy link
Contributor Author

I've raised a new PR #9058.
Doc link has been fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants