Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The Spark scheduler is very deterministic, which causes problems for the following workload (in serial order on a cluster with a small number of nodes):
cache rdd 1 with 1 partition
cache rdd 2 with 1 partition
cache rdd 3 with 1 partition
....
After a while, only executor 1 holds any cached data, which eventually forces it to evict in-memory blocks to disk while all other executors remain empty.
We can solve this by adding some randomization to cluster scheduling, or by adding memory-aware scheduling (which is much harder to do).
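The effect of the randomization proposal can be illustrated with a small simulation (a sketch with hypothetical names, not Spark's actual scheduler code): a deterministic scheduler that always picks the first candidate executor places every 1-partition RDD on the same node, while shuffling the candidate list first spreads the cached blocks out.

```python
import random

def pick_executor(executors, deterministic=True):
    # Deterministic: always take the first executor in iteration order,
    # mirroring a fixed-order scheduler that places every 1-partition
    # RDD on the same node. Randomized: shuffle the candidates first.
    candidates = list(executors)
    if not deterministic:
        random.shuffle(candidates)
    return candidates[0]

def simulate(num_rdds, num_executors, deterministic):
    # Count how many cached blocks each executor ends up holding.
    blocks = {e: 0 for e in range(num_executors)}
    for _ in range(num_rdds):
        blocks[pick_executor(blocks, deterministic)] += 1
    return blocks

# Deterministic placement: every cached partition lands on executor 0,
# so it fills up and evicts while the others stay empty.
print(simulate(100, 4, deterministic=True))
# Randomized placement: blocks are spread roughly evenly.
print(simulate(100, 4, deterministic=False))
```

This only models the skew itself; a real fix inside the scheduler would also have to respect locality preferences before falling back to a randomized choice.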