What challenge are you facing?
Today, pipeline resource versions are collected into a versioned_resources table. This table predates the "life" epic (#629). It has the following schema:
atc=# \d versioned_resources
Table "public.versioned_resources"
Column | Type | Collation | Nullable | Default
-------------+---------+-----------+----------+-------------------------------------------------
id | integer | | not null | nextval('versioned_resources_id_seq'::regclass)
version | text | | not null |
metadata | text | | not null |
type | text | | not null |
enabled | boolean | | not null | true
resource_id | integer | | |
check_order | integer | | not null | 0
It points to resource_id, making this per-pipeline-resource. This means that multiple pipelines with the same resource configs will be redundantly collecting the same version/metadata information.
A Modest Proposal
To be honest, this isn't a huge deal right now, aside from wasted database space and redundant checking. However, if we make the relationship between a pipeline's resources and the abstract version history a bit tighter, there are actually a few benefits:
- We can reduce the amount of checking required across pipelines for equivalent resources.
- We can reduce the amount of data recorded for equivalent resources.
- When a user changes their pipeline resource's configuration, the history will be "re-set" to the new config (ref. #145, "Support for purging version history of a resource"), and should always be correct.
- There may be some as-yet-unknown improvements we can make to the database model by having a cleaner representation.
- As part of the Resources v2 RFC (rfcs#1), we're going to start collecting all versions, not just those starting from pipeline configuration time. There'll be a lot more data to record, so sharing it between duplicated resources will make things a lot more efficient.
Implementation Notes
Enabling/Disabling versions
Enabling/disabling versions should remain scoped to pipeline resources, obviously. This can be done via a join table (pipeline_resource_config_versions or some such).
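To make the join-table idea concrete, here is a minimal sketch using an in-memory SQLite database. Only the pipeline_resource_config_versions name comes from the text above; the resource_configs and resource_config_versions tables, their columns, and the helper functions are assumptions for illustration, not a proposed final schema.

```python
import sqlite3

# Hypothetical schema. Versions hang off a shared resource config, and the
# join table only records per-pipeline-resource enabled/disabled state.
SCHEMA = """
CREATE TABLE resource_configs (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    source TEXT NOT NULL,
    UNIQUE (type, source)
);
CREATE TABLE resource_config_versions (
    id INTEGER PRIMARY KEY,
    resource_config_id INTEGER NOT NULL REFERENCES resource_configs (id),
    version TEXT NOT NULL,
    metadata TEXT NOT NULL,
    check_order INTEGER NOT NULL DEFAULT 0,
    UNIQUE (resource_config_id, version)
);
CREATE TABLE pipeline_resource_config_versions (
    resource_id INTEGER NOT NULL,
    resource_config_version_id INTEGER NOT NULL
        REFERENCES resource_config_versions (id),
    enabled BOOLEAN NOT NULL DEFAULT 1,
    PRIMARY KEY (resource_id, resource_config_version_id)
);
"""


def make_db():
    """Create the sketch schema with one shared config and two versions."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute(
        "INSERT INTO resource_configs (id, type, source) VALUES (?, ?, ?)",
        (1, "git", "{}"),
    )
    db.executemany(
        "INSERT INTO resource_config_versions "
        "(id, resource_config_id, version, metadata, check_order) "
        "VALUES (?, ?, ?, ?, ?)",
        [(1, 1, "v1", "[]", 1), (2, 1, "v2", "[]", 2)],
    )
    return db


def set_enabled(db, resource_id, version_id, enabled):
    """Record the enabled flag for one pipeline resource only."""
    db.execute(
        "INSERT OR REPLACE INTO pipeline_resource_config_versions "
        "(resource_id, resource_config_version_id, enabled) VALUES (?, ?, ?)",
        (resource_id, version_id, int(enabled)),
    )


def enabled_versions(db, resource_id, resource_config_id):
    """Shared versions visible to one pipeline resource, oldest first.

    A version with no row in the join table is treated as enabled.
    """
    rows = db.execute(
        """
        SELECT v.version
        FROM resource_config_versions v
        LEFT JOIN pipeline_resource_config_versions p
          ON p.resource_config_version_id = v.id AND p.resource_id = ?
        WHERE v.resource_config_id = ?
          AND COALESCE(p.enabled, 1) = 1
        ORDER BY v.check_order
        """,
        (resource_id, resource_config_id),
    )
    return [row[0] for row in rows]
```

Treating a missing join row as "enabled" is one possible design choice: it avoids writing a row per pipeline resource for every new version, at the cost of a LEFT JOIN on reads.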
Distinct check intervals
Now that we only check once per resource config, there's a little gotcha. Different pipelines can have varying check_every settings.
Here's one idea: record last_checked on the resource config, and have each pipeline's radar component run a check only if the time elapsed since last_checked is >= its own check_every interval. The net effect is that the config gets checked at the fastest frequency defined by any pipeline. Pipelines with a longer check_every will have versions show up more quickly than expected, but that really shouldn't matter.
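A rough sketch of that last_checked comparison follows. The ResourceConfig class and the due_for_check and radar_tick functions are hypothetical names, and the sketch ignores concurrency: a real implementation would presumably need to claim the check atomically in the database so two radars don't both run it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ResourceConfig:
    # Shared across every pipeline resource that uses this config.
    last_checked: Optional[datetime] = None


def due_for_check(config: ResourceConfig, check_every: timedelta,
                  now: datetime) -> bool:
    """True if this pipeline's interval has elapsed since the shared check."""
    if config.last_checked is None:
        return True
    return now - config.last_checked >= check_every


def radar_tick(config: ResourceConfig, check_every: timedelta,
               now: datetime) -> bool:
    """One radar pass for one pipeline resource; returns True if it checked."""
    if not due_for_check(config, check_every, now):
        return False
    # The actual resource check would run here.
    config.last_checked = now
    return True
```

With one pipeline on a 1-minute interval and another on a 5-minute interval, the 1-minute radar keeps refreshing last_checked, so the 5-minute radar rarely finds the check due. The config ends up checked at the fastest interval, as described above.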
Pausing pipeline resources
Currently, users pause pipeline resources with the intended effect that no new versions are collected and used for later builds. This becomes really awkward once checking is shared, because other pipelines using the same config will cause it to be checked anyway.
We could still support today's behavior by "faking it" and having pausing a resource really just 'pin' it to whatever the version was at the time. But actually, that sounds a lot like #1288. Maybe we should just implement that instead, and remove the resource pausing functionality?
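The "faking it" option could look roughly like the sketch below. It assumes that pinning means "ignore any shared version ordered after the one that was latest when the resource was paused". That is only one reading of the idea and not necessarily what #1288 specifies. PipelineResource, pinned_check_order, and the three functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PipelineResource:
    # check_order of the newest version at pause time; None means not paused.
    pinned_check_order: Optional[int] = None


def pause(resource: PipelineResource, shared_check_orders: List[int]) -> None:
    """Pin the resource to the newest version known right now."""
    resource.pinned_check_order = max(shared_check_orders, default=0)


def unpause(resource: PipelineResource) -> None:
    resource.pinned_check_order = None


def visible_versions(resource: PipelineResource,
                     shared_check_orders: List[int]) -> List[int]:
    """Versions this pipeline resource may use, even if others keep checking."""
    if resource.pinned_check_order is None:
        return sorted(shared_check_orders)
    return sorted(order for order in shared_check_orders
                  if order <= resource.pinned_check_order)
```

Versions keep accumulating on the shared config while a resource is paused. The paused pipeline simply doesn't see them until it is unpaused.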