Commit e355460
[SPARK-15892][ML] Incorrectly merged AFTAggregator with zero total count
## What changes were proposed in this pull request?
Currently, `AFTAggregator` is not being merged correctly. For example, if there is any single empty partition in the data, this creates an `AFTAggregator` with zero total count which causes the exception below:
```
IllegalArgumentException: u'requirement failed: The number of instances should be greater than 0.0, but got 0.'
```
Please see [AFTSurvivalRegression.scala#L573-L575](https://github.com/apache/spark/blob/6ecedf39b44c9acd58cdddf1a31cf11e8e24428c/mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala#L573-L575) as well.
Just to be clear, the python example `aft_survival_regression.py` seems using 5 rows. So, if there exist partitions more than 5, it throws the exception above since it contains empty partitions which results in an incorrectly merged `AFTAggregator`.
Executing `bin/spark-submit examples/src/main/python/ml/aft_survival_regression.py` on a machine with CPUs more than 5 is being failed because it creates tasks with some empty partitions with defualt configurations (AFAIK, it sets the parallelism level to the number of CPU cores).
## How was this patch tested?
An unit test in `AFTSurvivalRegressionSuite.scala` and manually tested by `bin/spark-submit examples/src/main/python/ml/aft_survival_regression.py`.
Author: hyukjinkwon <[email protected]>
Author: Hyukjin Kwon <[email protected]>
Closes #13619 from HyukjinKwon/SPARK-15892.1 parent 0ff8a68 commit e355460
File tree
2 files changed
+13
-1
lines changed- mllib/src
- main/scala/org/apache/spark/ml/regression
- test/scala/org/apache/spark/ml/regression
2 files changed
+13
-1
lines changedLines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
538 | 538 | | |
539 | 539 | | |
540 | 540 | | |
541 | | - | |
| 541 | + | |
542 | 542 | | |
543 | 543 | | |
544 | 544 | | |
| |||
Lines changed: 12 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
390 | 390 | | |
391 | 391 | | |
392 | 392 | | |
| 393 | + | |
| 394 | + | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
393 | 405 | | |
394 | 406 | | |
395 | 407 | | |
| |||
0 commit comments