Improve time series filtering based on cutoff, horizon and min_context_length #18
Issue #, if available:
This PR makes it easier to work with datasets where some time series are too short for the specific `Task` configuration.

Before this PR
Previously, we had the argument `min_ts_length` (defaults to `horizon + 1`) to deal with such datasets. The filtering logic was as follows:
- Remove time series with fewer than `min_ts_length` observations.
- Split each remaining time series at `cutoff`. If the part before `cutoff` has < 1 observations OR if the part after `cutoff` has < `horizon` observations, raise an exception.

For example, if some time series are really long but have no observations before the `cutoff`, we will run into errors. It's not trivial to filter them out by setting `min_ts_length`, especially if different time series have different lengths in the dataset.

This PR
We replace the `min_ts_length` argument with `min_context_length` (defaults to 1).

We change the filtering logic to remove time series if:
- they have fewer than `min_context_length` observations before `cutoff`, or
- they have fewer than `horizon` observations after `cutoff`.

These changes are 100% backwards compatible with the old behavior, but now make it much easier to work with datasets where time series have wildly different lengths or cover different time periods.
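
To make the difference concrete, here is a minimal sketch of the two filtering rules on a toy dataset. It is not the code from this PR: the helper names (`split_at_cutoff`, `filter_old`, `filter_new`), the dict-of-timestamps data layout, and the choice to count an observation at exactly `cutoff` as context are all assumptions made for illustration.

```python
from datetime import date


def split_at_cutoff(timestamps, cutoff):
    """Split sorted timestamps into (context, future) around `cutoff`.

    Assumption: observations at or before `cutoff` count as context.
    """
    context = [t for t in timestamps if t <= cutoff]
    future = [t for t in timestamps if t > cutoff]
    return context, future


def filter_old(dataset, cutoff, horizon, min_ts_length=None):
    """Sketch of the old rule: total-length filter, then raise on a bad split."""
    if min_ts_length is None:
        min_ts_length = horizon + 1
    kept = {}
    for item_id, timestamps in dataset.items():
        if len(timestamps) < min_ts_length:
            continue  # dropped by the total-length filter
        context, future = split_at_cutoff(timestamps, cutoff)
        if len(context) < 1 or len(future) < horizon:
            raise ValueError(f"{item_id}: not enough observations around cutoff")
        kept[item_id] = timestamps
    return kept


def filter_new(dataset, cutoff, horizon, min_context_length=1):
    """Sketch of the new rule: silently drop series that do not fit the split."""
    kept = {}
    for item_id, timestamps in dataset.items():
        context, future = split_at_cutoff(timestamps, cutoff)
        if len(context) < min_context_length or len(future) < horizon:
            continue
        kept[item_id] = timestamps
    return kept


# Toy dataset with daily timestamps (hypothetical item ids).
dataset = {
    # Jan 1..10: enough data on both sides of the cutoff.
    "ok": [date(2024, 1, d) for d in range(1, 11)],
    # Jan 6..30: long, but no observations at or before the cutoff.
    "late_start": [date(2024, 1, d) for d in range(6, 31)],
    # Jan 3..6: only one observation after the cutoff.
    "too_short": [date(2024, 1, d) for d in range(3, 7)],
}
cutoff = date(2024, 1, 5)
horizon = 3

kept = filter_new(dataset, cutoff, horizon)
```

In this sketch, `filter_new` keeps only the `"ok"` series, while `filter_old` with its default `min_ts_length` raises on `"late_start"`, because that series is long enough to pass the total-length filter but has nothing before the cutoff.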
Other changes
- `0.5.0rc1` for the pre-release

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.