This repository was archived by the owner on Jan 30, 2020. It is now read-only.

Race condition in unloading units #1216

@patrickbcullen

I have noticed a race condition in production where a job is supposed to be unloaded but never stops.

I think I found the bug here:

a.registry.ClearUnitHeartbeat(unitName)

I have an example where the systemd file is gone but the unit is still running. This happened because the kill command in the systemd file did not succeed. Since this code does not check for an error, it happily removes all the systemd files. Now that the systemd files are gone, I cannot stop the unit through fleet: fleet cannot call the systemd stop because it has already deleted the systemd file.
