Discuss data retention strategy for CI data
Following @drew's point in gitlab#407821 (comment 1365517761) about our current data retention strategy for Artifacts and Jobs, I thought we could discuss more broadly whether that strategy needs to be revised.
Some of the expected benefits:
- Lower storage costs, since we would no longer need to store older builds, their artifacts, and other related data (e.g. build metadata)
- Improved reliability when performing database operations against large tables
What needs to be considered / evaluated:
- Pruning large tables could be challenging and could affect SaaS availability; however, this may be worth exploring now that CI tables are being, and will continue to be, partitioned into smaller tables
- Legal implications of pruning large tables
- Who would be impacted (e.g. freemium vs. Ultimate users)
- How a revised strategy would be communicated, and how much notice customers would be given
Edited by Cheryl Li