[core] refactor AbstractTrainer #4804
```python
from autogluon.core.augmentation.distill_utils import augment_data, format_distillation_labels
from autogluon.core.calibrate import calibrate_decision_threshold
from autogluon.core.calibrate.conformity_score import compute_conformity_score
from autogluon.core.calibrate.temperature_scaling import apply_temperature_scaling, tune_temperature_scaling
from autogluon.core.callbacks import AbstractCallback
from autogluon.core.constants import BINARY, MULTICLASS, QUANTILE, REFIT_FULL_NAME, REGRESSION, SOFTCLASS
from autogluon.core.data.label_cleaner import LabelCleanerMulticlassToBinary
from autogluon.core.metrics import Scorer, compute_metric, get_metric
from autogluon.core.models import (
    AbstractModel,
    BaggedEnsembleModel,
    GreedyWeightedEnsembleModel,
    SimpleWeightedEnsembleModel,
    StackerEnsembleModel,
    WeightedEnsembleModel,
)
from autogluon.core.pseudolabeling.pseudolabeling import assert_pseudo_column_match
from autogluon.core.ray.distributed_jobs_managers import ParallelFitManager
from autogluon.core.utils import (
```
Why switch from relative to absolute imports? Tabular, core, common, and features all use relative imports.
Is there a general guideline you are following for favoring absolute over relative?
Nothing particular in this case, it's just that these will move into tabular anyway. I can switch to relative imports for the ones that remain after that.
PEP8 recommends absolute imports but I would vote in favor of consistency with other modules here.
Innixma left a comment:
Very nice! Added some minor comments but overall this is a great step in the right direction for unifying the API of trainer.
```python
if not isinstance(model, str):
    model = model.name
```
This logic and variants of it can probably be put into utility methods since they are so common. Maybe as a fast follow PR. Will change a lot of 2-4 line logic into 1 line, and improve type hinting in the IDE so we can be more explicit about the type of the variable in the code.
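For illustration, a minimal sketch of what such a utility could look like (the helper name `resolve_model_name` and the stand-in `AbstractModel` below are hypothetical, not code from this PR):

```python
from __future__ import annotations


class AbstractModel:
    """Minimal stand-in for AutoGluon's AbstractModel, which exposes a `name` attribute."""

    def __init__(self, name: str):
        self.name = name


def resolve_model_name(model: AbstractModel | str) -> str:
    """Collapse the common 2-4 line `isinstance` dance into one call.

    Accepts either a model object or its name and always returns the name,
    so callers can be explicit that they hold a `str` from this point on.
    """
    if not isinstance(model, str):
        model = model.name
    return model
```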
```python
if not self.low_memory:
    self.models[model.name] = model
```
The low_memory logic is something I implemented a very long time ago but at a certain point gave up on testing/using, since users didn't really care too much that files were being saved on disk, and it was a hassle trying to get everything working in-memory.
If time-series isn't using it, we may consider removing the low_memory logic entirely in a follow-up PR; it would make the code simpler and avoid us having to juggle multiple input/output variable types depending on the low_memory setting.
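For context, a sketch of the kind of branching the flag forces (a simplified stand-in, not the actual trainer code; the real low_memory path deserializes the model from disk instead of returning None):

```python
from types import SimpleNamespace


class Trainer:
    """Stand-in showing why low_memory makes callers juggle variable types."""

    def __init__(self, low_memory: bool = True):
        self.low_memory = low_memory
        self.models = {}  # name -> model object; only populated when low_memory is False

    def add_model(self, model) -> None:
        # With low_memory=True the model is only persisted to disk, so later
        # code must handle both "object available" and "name only" cases.
        if not self.low_memory:
            self.models[model.name] = model

    def load_model(self, name: str):
        if name in self.models:
            return self.models[name]  # in-memory fast path
        return None  # low_memory path: real code would load the model file here
```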
agreed. 👍
```python
@property
def path_root(self) -> str:
    """directory containing learner.pkl"""
    return os.path.dirname(self.path)
```
I remember wanting to refactor some of this logic, as currently we have self.path and self.path_root, which is a bit confusing. I might look into it closer as a follow-up.
The overall problem is that self.path doesn't contain all of the artifacts needed by the Trainer, such as those from path_utils. In an ideal world, having it be self-contained would be nice, kind of like how Predictor is self-contained. But I would need to think about it more, because I also thought about a world where we could have multiple trainers for a single predictor, in which case it would be good for trainers to re-use certain artifacts between them.
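To make the `path` / `path_root` relationship concrete, a minimal stand-in (the directory layout in the comments is illustrative, not the exact AutoGluon layout):

```python
import os


class Trainer:
    """Stand-in showing how path_root is currently derived from path."""

    def __init__(self, path: str):
        # In the real trainer, `path` points at the trainer's model directory,
        # while learner.pkl lives one level up in the output directory.
        self.path = path

    @property
    def path_root(self) -> str:
        """Directory containing learner.pkl."""
        return os.path.dirname(self.path)
```

For example, `Trainer("ag_output/models").path_root` is `"ag_output"`.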
shchur left a comment:
Thank you for this herculean effort! Only a few questions and minor comments.
```python
        model = model.name
        self.model_graph.nodes[model][attribute] = val

    def get_minimum_model_set(self, model, include_self=True) -> list:
```
Nit: Some methods are missing type hints.
Not a nit at all, and thanks for pointing it out.
A heads up: one thing that will come up in the next PR is how we handle this AbstractModel type being passed around. TimeSeriesTrainer will want to work only with AbstractTimeSeriesModel, but if we constrain it trivially in the function signature then pyright will complain that the LSP (Liskov substitution principle) is violated. We will then have to revisit this class to use generics:

```python
T = TypeVar("T", bound=AbstractModel)

class AbstractTrainer(Generic[T]):
    def work_with_model(self, model: T): ...

class AbstractTimeSeriesTrainer(AbstractTrainer[AbstractTimeSeriesModel]):
    ...
```

Note that the syntax for working with generics gets much lighter starting from Python 3.12 (PEP 695), e.g. `class AbstractTrainer[T]: ...`.
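A self-contained sketch of that pre-3.12 generics spelling (the model classes here are trivial stand-ins for the AutoGluon ones):

```python
from typing import Generic, TypeVar


class AbstractModel:
    def fit(self) -> str:
        return "fit base model"


class AbstractTimeSeriesModel(AbstractModel):
    def fit(self) -> str:
        return "fit time series model"


# Bound the type variable so a trainer can only be parameterized by model types.
ModelT = TypeVar("ModelT", bound=AbstractModel)


class AbstractTrainer(Generic[ModelT]):
    def work_with_model(self, model: ModelT) -> str:
        # Subclasses narrow ModelT via the base-class type parameter instead of
        # overriding the method signature, so pyright sees no LSP violation.
        return model.fit()


class AbstractTimeSeriesTrainer(AbstractTrainer[AbstractTimeSeriesModel]):
    pass
```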
I was really hoping that by specializing in Python instead of C++ I would be able to avoid learning what a generic is... welp
Issue #, if available:
Description of changes:

This is the first PR of a few for a major refactor of AbstractTrainer, unifying the interfaces of AbstractTimeSeriesTrainer and tabular's Trainer class into one abstract class in core.

This PR factors out AbstractTrainer in core, including model management behavior that is common to time series and tabular, as well as common interfaces.

Subsequent PRs will:
- move AbstractTrainer in core to tabular, since it is now almost exclusively tabular-specific concretizations;
- take SimpleAbstractTrainer from timeseries and move it to core.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.