Implement gzip for dag bundle compression for cloud deploys #1778

Merged
pritt20 merged 3 commits into main from gzip_bundle
Jan 3, 2025

pritt20 commented Jan 3, 2025

Description

This PR implements gzip compression for the tar DAG bundles used in cloud-based DAG-only deploys.

This is one of the long-term solutions discussed in nova #nova-20241227-devoid-mass, where it was observed that an Astro DAG-only deploy with a large DAG bundle could take 20-30 minutes to initiate.
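
For readers who want the general shape of the technique without opening the diff, here is a minimal sketch of wrapping a tar stream in gzip using only the Go standard library (`archive/tar`, `compress/gzip`). The helper names `writeTarGz` and `demo`, the temporary directory, and the sample file are assumptions made for illustration; they are not taken from the astro-cli code in this PR.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// writeTarGz walks srcDir and writes a gzip-compressed tar stream to w.
// Hypothetical helper for illustration only; symlinks are not handled.
func writeTarGz(srcDir string, w io.Writer) error {
	gz := gzip.NewWriter(w)
	tw := tar.NewWriter(gz) // tar output is fed straight into gzip

	walkErr := filepath.Walk(srcDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(srcDir, path)
		if err != nil {
			return err
		}
		if rel == "." {
			return nil // skip the root directory itself
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel) // store paths relative to srcDir
		if info.IsDir() {
			hdr.Name += "/"
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if !info.Mode().IsRegular() {
			return nil // directories and special files have no body
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
	if walkErr != nil {
		return walkErr
	}
	// Close tar first, then gzip, so the gzip trailer is written last.
	if err := tw.Close(); err != nil {
		return err
	}
	return gz.Close()
}

// demo bundles a throwaway directory holding one repetitive file and returns
// the payload size and the size of the resulting .tar.gz stream.
func demo() (rawSize, gzSize int, err error) {
	dir, err := os.MkdirTemp("", "dags")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	payload := bytes.Repeat([]byte("from airflow import DAG\n"), 4096)
	if err := os.WriteFile(filepath.Join(dir, "example_dag.py"), payload, 0o644); err != nil {
		return 0, 0, err
	}
	var buf bytes.Buffer
	if err := writeTarGz(dir, &buf); err != nil {
		return 0, 0, err
	}
	return len(payload), buf.Len(), nil
}

func main() {
	raw, gz, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("payload: %d bytes, tar.gz: %d bytes\n", raw, gz)
}
```

Layering the tar writer on top of the gzip writer streams the archive, so the uncompressed tar never has to be written to disk first. How much the bundle shrinks depends entirely on the content; highly repetitive files compress far better than already-compressed assets.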

Links for reference:

https://astronomer.slack.com/archives/C08671SHPUP/p1735582286325009

🎟 Issue(s)

Related #XXX

🧪 Functional Testing

Tested these changes on a dev deployment. Details below for reference:

  • Created a deployment on the dev cluster. Deployment ID: cm5gr26z504ir01ocmbepukg7
  • Created an Astro project with a 630 MB dags directory.
  • Manually created a sample tar bundle to compare compression. The uncompressed tar bundle came to 440 MB:
ls -ltr                                    
drwxr-xr-x pritt staff  64 B  Fri Jan  3 18:35:31 2025  plugins
drwxr-xr-x pritt staff  64 B  Fri Jan  3 18:35:31 2025  include
drwxr-xr-x pritt staff  96 B  Fri Jan  3 18:35:31 2025  tests
.rw-r--r-- pritt staff  45 B  Fri Jan  3 18:35:31 2025  Dockerfile
.rw-r--r-- pritt staff 866 B  Fri Jan  3 18:35:31 2025  airflow_settings.yaml
.rw-r--r-- pritt staff 155 B  Fri Jan  3 18:35:31 2025  requirements.txt
.rw-r--r-- pritt staff 3.3 KB Fri Jan  3 18:35:31 2025  README.md
.rw-r--r-- pritt staff   0 B  Fri Jan  3 18:35:31 2025  packages.txt
drwxr-xr-x pritt staff 192 B  Fri Jan  3 18:43:51 2025  dags
.rw-r--r-- pritt staff 440 MB Fri Jan  3 18:53:32 2025  bundle.tar

  • Pushed the DAGs to the deployment created above using the Astro CLI DAG deploy command. The deploy took around 30-40 seconds.
/Users/pritt/go/src/github.com/astronomer/astro-cli/astro deploy -d
Authenticated to astronomer-dev.io 

Select a Deployment
 #     DEPLOYMENT NAME     RELEASE NAME                      DEPLOYMENT ID                 DAG DEPLOY ENABLED     
 1     neel-test-1_DND     ultraviolet-blueshift-6057        cm5cfqhag02dn01nq2zncpnn2     true                   
 2     pritt-test          mathematical-exploration-7116     cm5gr26z504ir01ocmbepukg7     true                   

> 2
Initiating DAG deploy for: cm5gr26z504ir01ocmbepukg7
Deployed DAG bundle:  2025-01-03T13:28:41.2286129Z
Deployed Image Tag:  12.6.0

Successfully uploaded DAGs with version 2025-01-03T13:28:41.2286129Z to Astro. Navigate to the Airflow UI to confirm that your deploy was successful. The Airflow UI takes about 1 minute to update.

 Access your Deployment:

 Deployment View: cloud.astronomer-dev.io/clm7hapoz009a01ok5jn619kt/deployments/cm5gr26z504ir01ocmbepukg7/overview
 Airflow UI: neel-test-hosted.astronomer-dev.run/dbepukg7?orgId=org_bQTPWyCrWrv7QA4M

  • Checked the bundle in the Azure storage account: it was compressed successfully, and its size was reduced to 2.3 MB.


  • Validated that the DAGs were deployed successfully.

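The check on the stored bundle described above was done by inspecting the storage account. A comparable local check on a bundle's bytes could look like the sketch below, which tests for the gzip magic bytes and lists the tar entries. The names `isGzip`, `listBundle`, and `sampleBundle` and the entry path are hypothetical, and how the bundle is fetched from storage is left out; this is an illustration, not part of the PR.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"os"
)

// isGzip reports whether data starts with the gzip magic bytes 0x1f 0x8b.
func isGzip(data []byte) bool {
	return len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b
}

// listBundle decompresses an in-memory .tar.gz and returns its entry names.
func listBundle(data []byte) ([]string, error) {
	gz, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err // not a gzip stream
	}
	defer gz.Close()

	var names []string
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil // end of archive
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// sampleBundle builds a one-file .tar.gz in memory so the sketch can run
// without a real bundle. The entry name is made up for the example.
func sampleBundle() ([]byte, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)

	body := []byte("print('hello')\n")
	hdr := &tar.Header{
		Name:     "dags/example_dag.py",
		Typeflag: tar.TypeReg,
		Mode:     0o644,
		Size:     int64(len(body)),
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := sampleBundle()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	names, err := listBundle(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("gzip:", isGzip(data), "entries:", names)
}
```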

📋 Checklist

  • Rebased from the main (or release if patching) branch (before testing)
  • Ran make test before taking out of draft
  • Ran make lint before taking out of draft
  • Added/updated applicable tests
  • Tested against Astro-API (if necessary).
  • Tested against Houston-API and Astronomer (if necessary).
  • Communicated to/tagged owners of respective clients potentially impacted by these changes.
  • Updated any related documentation

neel-astro left a comment

LGTM 👍. This would be a great improvement. I left a minor nit to squash.

pritt20 requested a review from schnie on January 3, 2025 15:04
astronomer deleted a comment from pritt20 on Jan 3, 2025
pritt20 merged commit 58fca47 into main on Jan 3, 2025
pritt20 deleted the gzip_bundle branch on January 3, 2025 18:47
neel-astro pushed a commit that referenced this pull request Jan 8, 2025
neel-astro pushed a commit that referenced this pull request Jan 10, 2025