Create CLI command to inspect and troubleshoot WireGuard network #64

@psviderski

Adding a new machine to the cluster may fail because Corrosion cannot establish a connection to other peers. The usual cause is that WireGuard fails to establish a tunnel between the machines.

...
✓ Uncloud machine daemon started.
✓ Uncloud installed on the machine successfully! 🎉
Machine 'uc-prod-ap1' added to the cluster (context 'uc-prod').
Waiting for the machine to be ready...
Error: wait for cluster to be initialised on machine: rpc error: code = FailedPrecondition desc = cluster is not initialised

uncloud-corrosion service logs on uc-prod-ap1:

May 23 04:06:29 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:29.029051Z ERROR corro_agent::transport: error=deadline has elapsed
May 23 04:06:29 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:29.029077Z ERROR corro_agent::transport: error=deadline has elapsed
May 23 04:06:29 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:29.029081Z ERROR corro_agent::agent::handlers: could not write datagram [fdcc:259e:b5d0:d11b:b9d:27f:fdc3:3a17]:51001:>
May 23 04:06:34 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:34.030117Z ERROR corro_agent::transport: error=deadline has elapsed
May 23 04:06:34 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:34.030147Z ERROR corro_agent::transport: error=deadline has elapsed
May 23 04:06:34 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:34.030151Z ERROR corro_agent::agent::handlers: could not write datagram [fdcc:259e:b5d0:d11b:b9d:27f:fdc3:3a17]:51001:>
May 23 04:06:46 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:46.131217Z ERROR corro_agent::transport: error=deadline has elapsed
May 23 04:06:46 vps1 uncloud-corrosion[1016597]: 2025-05-23T04:06:46.131257Z ERROR corro_agent::transport: error=deadline has elapsed

We need to provide tools/commands to help troubleshoot such issues. For example, running wg show on one of the machines is useful for inspecting the WireGuard tunnels and their status, but the wg utility usually has to be installed manually.

I'm thinking of a command that outputs useful information about the WireGuard tunnels, e.g. uc machine inspect, which would print the machine metadata and a table with information about its tunnels.

Labels: enhancement (New feature or request)
