Improve performance in JdbcStepExecutionDao#3635
Improve performance in JdbcStepExecutionDao#3635fmbenhassine merged 1 commit intospring-projects:masterfrom
Conversation
Optimize SQL in GET_LAST_STEP_EXECUTION to work fast on large amount of records in %PREFIX%JOB_EXECUTION table
simi
left a comment
There was a problem hiding this comment.
I have tested (just by manually tweaking the query) this change and it leads to massive performance improvement having the same result.
before (16363.873 ms)
production=# EXPLAIN ANALYZE SELECT SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT, SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION, JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION from BATCH_JOB_EXECUTION JE, BATCH_STEP_EXECUTION SE where SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from BATCH_JOB_EXECUTION where JE.JOB_INSTANCE_ID = 6919782) and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID and SE.STEP_NAME = 'webhooks.destinations.batch.job.step.disable' order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1889629.39..1892730.97 rows=1240632 width=232) (actual time=16354.241..16354.242 rows=1 loops=1)
Sort Key: se.start_time DESC, se.step_execution_id DESC
Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=396812.35..1667136.03 rows=1240632 width=232) (actual time=14724.619..16354.230 rows=1 loops=1)
Hash Cond: (se.job_execution_id = je.job_execution_id)
Join Filter: (SubPlan 1)
Rows Removed by Join Filter: 2506751
-> Seq Scan on batch_step_execution se (cost=0.00..1054638.36 rows=2481263 width=173) (actual time=364.293..4884.475 rows=2506752 loops=1)
Filter: ((step_name)::text = 'webhooks.destinations.batch.job.step.disable'::text)
Rows Removed by Filter: 11481938
-> Hash (cost=222134.49..222134.49 rows=6932949 width=75) (actual time=3940.651..3940.651 rows=6921527 loops=1)
Buckets: 524288 Batches: 32 Memory Usage: 27805kB
-> Seq Scan on batch_job_execution je (cost=0.00..222134.49 rows=6932949 width=75) (actual time=0.016..1516.128 rows=6921527 loops=1)
SubPlan 1
-> Result (cost=0.00..222134.49 rows=6932949 width=8) (actual time=0.000..0.001 rows=3 loops=2506752)
One-Time Filter: (je.job_instance_id = 6919782)
-> Seq Scan on batch_job_execution (cost=0.00..222134.49 rows=6932949 width=8) (actual time=0.014..1044.411 rows=6919456 loops=1)
Planning Time: 0.553 ms
JIT:
Functions: 20
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 7.343 ms, Inlining 12.015 ms, Optimization 213.212 ms, Emission 138.594 ms, Total 371.165 ms
Execution Time: 16363.873 ms
(23 rows)
after (0.193 ms)
production=# EXPLAIN ANALYZE SELECT SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT, SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION, JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION from BATCH_JOB_EXECUTION JE, BATCH_STEP_EXECUTION SE where SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from BATCH_JOB_EXECUTION where JOB_INSTANCE_ID = 6919782) and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID and SE.STEP_NAME = 'webhooks.destinations.batch.job.step.disable' order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=9.97..9.97 rows=1 width=232) (actual time=0.104..0.105 rows=1 loops=1)
Sort Key: se.start_time DESC, se.step_execution_id DESC
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=1.43..9.96 rows=1 width=232) (actual time=0.096..0.098 rows=1 loops=1)
-> Nested Loop (cost=0.86..8.90 rows=1 width=75) (actual time=0.028..0.029 rows=1 loops=1)
-> Index Scan using index_batch_job_execution_on_job_instance_id on batch_job_execution (cost=0.43..4.45 rows=1 width=8) (actual time=0.016..0.017 rows=1 loops=1)
Index Cond: (job_instance_id = 6919782)
-> Index Scan using batch_job_execution_pkey on batch_job_execution je (cost=0.43..4.45 rows=1 width=67) (actual time=0.009..0.009 rows=1 loops=1)
Index Cond: (job_execution_id = batch_job_execution.job_execution_id)
-> Index Scan using josef_test_4 on batch_step_execution se (cost=0.56..0.99 rows=7 width=173) (actual time=0.066..0.066 rows=1 loops=1)
Index Cond: (((step_name)::text = 'webhooks.destinations.batch.job.step.disable'::text) AND (job_execution_id = je.job_execution_id))
Planning Time: 0.946 ms
Execution Time: 0.193 ms
(13 rows)
PS: Is there any easy way to use this change with latest release unless this will make it out? I can also test it in that way and report back.
|
@simi Thank you for taking time to test the fix and for sharing these numbers!
You can override |
|
LGTM, rebased and merged. This is a small change with a big improvement 😄 @mcheban Thank you very much for raising the issue and contributing a fix! |
Optimize SQL in GET_LAST_STEP_EXECUTION to work fast on large amount of records in %PREFIX%JOB_EXECUTION table
Fix for #3634
@pivotal-issuemaster This is an Obvious Fix