Random "duplicate key value violates unique constraint" errors when initializing the postgres database #23512

@zachliu

Apache Airflow version

2.3.0 (latest released)

What happened

While testing Airflow 2.3.0 locally (with PostgreSQL 12.4), the webserver container logs intermittent errors:

webserver_1  | + airflow db init
...
webserver_1  | + exec airflow webserver
...
webserver_1  | [2022-05-04 18:58:46,011] {{manager.py:568}} INFO - Added Permission menu access on Permissions to role Admin
postgres_1   | 2022-05-04 18:58:46.013 UTC [41] ERROR:  duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
postgres_1   | 2022-05-04 18:58:46.013 UTC [41] DETAIL:  Key (permission_view_id, role_id)=(204, 1) already exists.
postgres_1   | 2022-05-04 18:58:46.013 UTC [41] STATEMENT:  INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), 204, 1) RETURNING ab_permission_view_role.id
webserver_1  | [2022-05-04 18:58:46,015] {{manager.py:570}} ERROR - Add Permission to Role Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
webserver_1  | DETAIL:  Key (permission_view_id, role_id)=(204, 1) already exists.
webserver_1  |
webserver_1  | [SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s, %(role_id)s) RETURNING ab_permission_view_role.id]
webserver_1  | [parameters: {'permission_view_id': 204, 'role_id': 1}]

Notes:

  1. When the DB is first initialized, I get ~40 errors like this (with ~40 different permission_view_id values but always the same 'role_id': 1).
  2. When it's not the first time initializing the DB, I always get exactly 1 error like this, but with a different permission_view_id each time.
  3. None of these errors seem to have any real negative effect: the webserver keeps running, and Airflow keeps running and scheduling tasks.
  4. "Occasionally" I do get real exceptions that kill all the webserver workers:
postgres_1   | 2022-05-05 20:03:30.580 UTC [44] ERROR:  duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
postgres_1   | 2022-05-05 20:03:30.580 UTC [44] DETAIL:  Key (permission_view_id, role_id)=(214, 1) already exists.
postgres_1   | 2022-05-05 20:03:30.580 UTC [44] STATEMENT:  INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), 214, 1) RETURNING ab_permission_view_role.id
webserver_1  | [2022-05-05 20:03:30 +0000] [121] [ERROR] Exception in worker process
webserver_1  | Traceback (most recent call last):
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
webserver_1  |     self.dialect.do_execute(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
webserver_1  |     cursor.execute(statement, parameters)
webserver_1  | psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
webserver_1  | DETAIL:  Key (permission_view_id, role_id)=(214, 1) already exists.
webserver_1  |
webserver_1  |
webserver_1  | The above exception was the direct cause of the following exception:
webserver_1  |
webserver_1  | Traceback (most recent call last):
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
webserver_1  |     worker.init_process()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/base.py", line 134, in init_process
webserver_1  |     self.load_wsgi()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
webserver_1  |     self.wsgi = self.app.wsgi()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/app/base.py", line 67, in wsgi
webserver_1  |     self.callable = self.load()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
webserver_1  |     return self.load_wsgiapp()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
webserver_1  |     return util.import_app(self.app_uri)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/gunicorn/util.py", line 412, in import_app
webserver_1  |     app = app(*args, **kwargs)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/airflow/www/app.py", line 158, in cached_app
webserver_1  |     app = create_app(config=config, testing=testing)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/airflow/www/app.py", line 146, in create_app
webserver_1  |     sync_appbuilder_roles(flask_app)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/airflow/www/app.py", line 68, in sync_appbuilder_roles
webserver_1  |     flask_app.appbuilder.sm.sync_roles()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/airflow/www/security.py", line 580, in sync_roles
webserver_1  |     self.update_admin_permission()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/airflow/www/security.py", line 562, in update_admin_permission
webserver_1  |     self.get_session.commit()
webserver_1  |   File "<string>", line 2, in commit
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1423, in commit
webserver_1  |     self._transaction.commit(_to_root=self.future)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 829, in commit
webserver_1  |     self._prepare_impl()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
webserver_1  |     self.session.flush()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 3255, in flush
webserver_1  |     self._flush(objects)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 3395, in _flush
webserver_1  |     transaction.rollback(_capture_exception=True)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
webserver_1  |     compat.raise_(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
webserver_1  |     raise exception
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 3355, in _flush
webserver_1  |     flush_context.execute()
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 453, in execute
webserver_1  |     rec.execute(self)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 576, in execute
webserver_1  |     self.dependency_processor.process_saves(uow, states)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/dependency.py", line 1182, in process_saves
webserver_1  |     self._run_crud(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/dependency.py", line 1245, in _run_crud
webserver_1  |     connection.execute(statement, secondary_insert)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1200, in execute
webserver_1  |     return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 313, in _execute_on_connection
webserver_1  |     return connection._execute_clauseelement(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1389, in _execute_clauseelement
webserver_1  |     ret = self._execute_context(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1748, in _execute_context
webserver_1  |     self._handle_dbapi_exception(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1929, in _handle_dbapi_exception
webserver_1  |     util.raise_(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
webserver_1  |     raise exception
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
webserver_1  |     self.dialect.do_execute(
webserver_1  |   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
webserver_1  |     cursor.execute(statement, parameters)
webserver_1  | sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
webserver_1  | DETAIL:  Key (permission_view_id, role_id)=(214, 1) already exists.
webserver_1  |
webserver_1  | [SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s, %(role_id)s) RETURNING ab_permission_view_role.id]
webserver_1  | [parameters: {'permission_view_id': 214, 'role_id': 1}]
webserver_1  | (Background on this error at: http://sqlalche.me/e/14/gkpj)
webserver_1  | [2022-05-05 20:03:30 +0000] [121] [INFO] Worker exiting (pid: 121)
flower_1     | + exec airflow celery flower
scheduler_1  | + exec airflow scheduler
webserver_1  | [2022-05-05 20:03:31 +0000] [118] [INFO] Worker exiting (pid: 118)
webserver_1  | [2022-05-05 20:03:31 +0000] [119] [INFO] Worker exiting (pid: 119)
webserver_1  | [2022-05-05 20:03:31 +0000] [120] [INFO] Worker exiting (pid: 120)
worker_1     | + exec airflow celery worker

However, such exceptions are rare and seemingly random; I can't find a way to reproduce them consistently.
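
To check the counts in notes 1 and 2 against a saved log, something like the following could be used. This is only a rough sketch: `SAMPLE_LOG` holds two DETAIL lines copied from the postgres output above, and the names `KEY_RE` and `tally_duplicate_keys` are made up for illustration. It only looks at `postgres_1` lines so the same error echoed by the webserver is not counted twice.

```python
import re
from collections import Counter

# Two DETAIL lines copied from the postgres log in this report.
SAMPLE_LOG = """\
postgres_1   | 2022-05-04 18:58:46.013 UTC [41] DETAIL:  Key (permission_view_id, role_id)=(204, 1) already exists.
postgres_1   | 2022-05-05 20:03:30.580 UTC [44] DETAIL:  Key (permission_view_id, role_id)=(214, 1) already exists.
"""

# Hypothetical pattern: matches only the postgres container's DETAIL lines.
KEY_RE = re.compile(
    r"^postgres_1\s*\|.*Key \(permission_view_id, role_id\)=\((\d+), (\d+)\) already exists"
)


def tally_duplicate_keys(log_text):
    """Count duplicate-key DETAIL lines per (permission_view_id, role_id) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = KEY_RE.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts


counts = tally_duplicate_keys(SAMPLE_LOG)
role_ids = {role_id for _, role_id in counts}
print(counts)
print("distinct role_ids:", sorted(role_ids))
```

Running it over the full docker-compose output (instead of the two sample lines) should show whether every failing pair really shares the same role_id.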

What you think should happen instead

Prior to 2.3.0 there were no such errors.
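
To illustrate the behavior I would expect, here is a minimal sketch. It assumes (not confirmed) that the errors come from more than one process trying to insert the same (permission_view_id, role_id) pair, in which case the second insert should be a harmless no-op instead of an exception that takes a worker down. This is not Airflow or Flask-AppBuilder code: it uses an in-memory SQLite database as a stand-in for PostgreSQL, only the table and column names are taken from the log above, and `make_db` and `add_permission_to_role` are hypothetical helpers.

```python
import sqlite3


def make_db():
    """Create an in-memory table with the same unique constraint as in the log."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE ab_permission_view_role (
            id INTEGER PRIMARY KEY,
            permission_view_id INTEGER NOT NULL,
            role_id INTEGER NOT NULL,
            UNIQUE (permission_view_id, role_id)
        )
        """
    )
    return conn


def add_permission_to_role(conn, permission_view_id, role_id):
    """Insert the pair; return True if inserted, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO ab_permission_view_role (permission_view_id, role_id) "
                "VALUES (?, ?)",
                (permission_view_id, role_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The pair is already present, so there is nothing left to do.
        return False


conn = make_db()
first = add_permission_to_role(conn, 214, 1)
second = add_permission_to_role(conn, 214, 1)
rows = conn.execute("SELECT COUNT(*) FROM ab_permission_view_role").fetchone()[0]
print(first, second, rows)
```

The point is only that a duplicate of an already-granted permission can be rolled back and ignored; whether that is the right fix for the actual sync_roles code path is a separate question.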

How to reproduce

No response

Operating System

Linux Mint 20.3

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

    Labels

    affected_version:2.3 (Issues Reported for 2.3); area:MetaDB (Meta Database related issues.); area:core; area:upgrade (Facilitating migration to a newer version of Airflow); kind:bug (This is a clearly a bug)
