-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Add --overwrite option to connections import CLI command
#28738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --overwrite option to connections import CLI command
#28738
Conversation
e5a8b2a to
aa00851
Compare
aa00851 to
daeae0f
Compare
connections import CLI
connections import CLIconnections import CLI command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the overwrite=True case I wonder if it’s a good idea to update the existing row instead of a delete-insert combination. Maybe even upsert on supported database backends?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. I tried to get updates working, but ran into IntegrityErrors a bunch due to how flushes work. I fell back on delete/flush/create, but I'll take another look today to see if I can get update working.
Thanks for the review!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case this helps any future readers (or me 😁), the IntegrityErrors were due to the fact that I thought conn_id was the PK for connections. When I merged a new Connection into the session to overwrite an existing one with the same conn_id, the session wouldn't reconcile the existing and new entries.
While there is a unique constraint on conn_id, the PK is id. The two entries having different PKs with the same conn_id is what caused the IntegrityError.
I fixed that by setting conn.id = existing_conn.id and have upserts working.
40b14fa to
a635962
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overwrite part is technically unnecessary; if there is an existing_conn, overwrite is always True.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could even refactor this entire block into
existing_id = session.query(Connection.id).filter(Connection.conn_id == conn_id).first()
if existing_id is not None:
if not overwrite:
continue
conn.id = existing_id
session.merge(conn)(I omitted some code not related to the flow control.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I'll refactor that.
a635962 to
965c239
Compare
965c239 to
ab0ae2a
Compare
| if session.query(Connection).filter(Connection.conn_id == conn_id).first(): | ||
| print(f"Could not import connection {conn_id}: connection already exists.") | ||
| continue | ||
| existing_conn_id = session.query(Connection.id).filter(Connection.conn_id == conn_id).scalar() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorporated the feedback above with one tweak:
Using .first() yields a tuple representing the select clause, like (1,).
Using .scalar() gets the first row and returns the first item in the tuple (or None, if no rows are returned).
potiuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and helpful for important use cases. Good idea @natanweinberger !
The CLI command
airflow connections import <filepath>cannot overwrite existing connections. Conflicts are skipped with the messageCould not import connection <connection_id>: connection already exists.This PR adds functionality to set
--overwrite=trueto overwrite existing connections when there is a conflict.The existing behavior (no overwrites) is still the default behavior, the user must set
--overwrite=trueto enable overwrites.Original idea for this: #15177 (comment)