Description
Apache Airflow version: 1.10.14 and 2.0.0
Kubernetes version (if you are using kubernetes) (use kubectl version): N/A
Environment:
- Cloud provider or hardware configuration: AWS, custom Docker image based on Debian buster
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a): Linux 9b57b2a952e3 4.14.193-149.317.amzn2.x86_64
- Install tools:
- Others:
What happened:
When the gunicorn master process dies along with all of its workers, the webserver fails to exit. Instead, it keeps logging the following message about every 10 seconds:
webserver_1 | [2021-01-04 22:32:47 +0000] [31] [INFO] Handling signal: ttou
webserver_1 | [2021-01-04 22:32:47 +0000] [82] [INFO] Worker exiting (pid: 82)
webserver_1 | [2021-01-04 22:32:57 +0000] [31] [INFO] Handling signal: term
webserver_1 | [2021-01-04 22:32:57 +0000] [95] [INFO] Worker exiting (pid: 95)
webserver_1 | [2021-01-04 22:32:57 +0000] [116] [INFO] Worker exiting (pid: 116)
webserver_1 | [2021-01-04 22:32:58 +0000] [31] [INFO] Shutting down: Master
webserver_1 | [2021-01-04 22:32:58,228] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:33:09,239] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:33:20,252] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:33:31,263] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:33:42,275] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:33:53,288] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:34:04,301] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:34:15,313] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:34:26,320] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:34:37,332] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:34:48,344] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:34:59,357] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:35:10,367] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:35:21,379] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:35:32,392] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:35:43,404] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
webserver_1 | [2021-01-04 22:35:54,414] {cli.py:1082} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
In the example above, I killed the gunicorn master process intentionally (kill 31). In the real-world scenarios I've observed, the master process and the workers would crash due to some transient issue, such as a temporary failure to fetch a secret.
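For context, the repeating ERROR line above is emitted by the webserver's gunicorn monitor loop (the {cli.py:1082} log prefix). The snippet below is only a minimal sketch of the failure mode, not the actual Airflow code; the helpers get_num_ready_workers_running, get_num_workers_running, and gunicorn_master_is_alive are hypothetical stand-ins. It shows how a loop that only compares worker counts keeps logging the same error forever once the master and every worker are gone, because nothing in it ever reacts to the master process itself being dead.

```python
import logging
import time

log = logging.getLogger(__name__)

# Hypothetical stand-ins for the checks the real monitor performs.
# Here they simulate the state after the master and all workers have died.
def get_num_ready_workers_running() -> int:
    return 0

def get_num_workers_running() -> int:
    return 0

def gunicorn_master_is_alive() -> bool:
    return False

def monitor_loop(num_workers_expected: int = 4) -> None:
    """Sketch of a monitor loop with the reported flaw: it only compares
    worker counts, so it logs the same error every ~10 seconds instead of
    exiting once the gunicorn master itself is gone."""
    while True:
        num_ready = get_num_ready_workers_running()
        num_running = get_num_workers_running()
        if num_running < num_workers_expected:
            # This is the message that repeats in the logs above.
            log.error(
                "[%d / %d] Some workers seem to have died and gunicorn "
                "did not restart them as expected",
                num_ready, num_running,
            )
            # Missing step: if the master is dead, stop looping and exit
            # with a non-zero code so systemd/Docker can restart the service.
            # if not gunicorn_master_is_alive():
            #     raise SystemExit(1)
        time.sleep(10)
```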
What you expected to happen:
The Airflow webserver should exit so that it can be restarted by systemd, the Docker daemon, or whatever else is managing the running services.
How to reproduce it:
Send a KILL signal (SIGKILL) to the gunicorn master process.
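For a scripted reproduction, a sketch along these lines should work; it assumes psutil is available (it is installed alongside Airflow) and that the master process is identifiable by the "gunicorn: master" title that gunicorn sets via setproctitle, so adjust the match if your deployment names it differently:

```python
import os
import signal

import psutil  # used only to locate the gunicorn master process

def kill_gunicorn_master() -> None:
    """Find the gunicorn master process and send it SIGKILL."""
    for proc in psutil.process_iter(["pid", "name", "cmdline"]):
        cmdline = " ".join(proc.info["cmdline"] or [])
        name = proc.info["name"] or ""
        if "gunicorn: master" in cmdline or "gunicorn: master" in name:
            os.kill(proc.info["pid"], signal.SIGKILL)
            print(f"Sent SIGKILL to gunicorn master (pid {proc.info['pid']})")
            return
    print("No gunicorn master process found")

if __name__ == "__main__":
    kill_gunicorn_master()
```

After the master dies, the webserver process keeps printing the error shown above instead of exiting.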
Anything else we need to know:
I'll be providing a PR shortly with a fix for this issue.