[Bug] [WorkerClusters] Worker group change handler should set all worker addresses. #17408

@reele

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

    @Override
    public void onWorkerGroupChange(List<WorkerGroup> workerGroups) {
        for (WorkerGroup workerGroup : workerGroups) {
            List<String> activeWorkers = WorkerGroupUtils.getWorkerAddressListFromWorkerGroup(workerGroup)
                    .stream()
                    .map(workerMapping::get)
                    .filter(Objects::nonNull)
                    .map(WorkerServerMetadata::getAddress)
                    .collect(Collectors.toList());
            synchronized (dbWorkerGroupMapping) {
                dbWorkerGroupMapping.put(workerGroup.getName(), activeWorkers);
            }
        }
    }

When the master starts before a worker, onWorkerGroupChange does not add that worker, because the `filter(Objects::nonNull)` step drops every address that is not yet in workerMapping. When the worker starts later, onServerAdded only updates workerMapping and leaves dbWorkerGroupMapping unchanged. Unless the worker group is edited again, the balancer cannot find the worker.
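
The sequence can be modelled with a small stand-alone sketch. This is not the DolphinScheduler code: the class `StaleGroupMappingDemo`, its simplified method signatures, and the example address are hypothetical, and worker metadata is reduced to plain address strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in for the cluster class quoted in this issue.
class StaleGroupMappingDemo {
    // Registered workers, keyed by address (stand-in for workerMapping).
    private final Map<String, String> workerMapping = new HashMap<>();
    // Group name -> active worker addresses (stand-in for dbWorkerGroupMapping).
    private final Map<String, List<String>> dbWorkerGroupMapping = new HashMap<>();

    // Mirrors the handler above: addresses that are not registered yet are filtered out.
    void onWorkerGroupChange(String groupName, List<String> configuredAddresses) {
        List<String> activeWorkers = configuredAddresses.stream()
                .map(workerMapping::get)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
        dbWorkerGroupMapping.put(groupName, activeWorkers);
    }

    // Mirrors the behaviour described above: only workerMapping is updated.
    void onServerAdded(String address) {
        workerMapping.put(address, address);
    }

    List<String> getWorkers(String groupName) {
        return dbWorkerGroupMapping.getOrDefault(groupName, Collections.emptyList());
    }

    public static void main(String[] args) {
        StaleGroupMappingDemo cluster = new StaleGroupMappingDemo();
        // Master starts first and loads the group while the worker is still down.
        cluster.onWorkerGroupChange("group-a", List.of("10.0.0.1:1234"));
        // The worker starts afterwards.
        cluster.onServerAdded("10.0.0.1:1234");
        // The group mapping was never recomputed, so this prints [].
        System.out.println(cluster.getWorkers("group-a"));
    }
}
```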

What you expected to happen

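Either onWorkerGroupChange should not filter out inactive workers, or onServerAdded should call onWorkerGroupChange so that dbWorkerGroupMapping is refreshed when a worker comes up.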
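
A minimal sketch of the second option, assuming the cluster can remember each group's configured addresses. The class `RefreshOnServerAddedSketch`, the extra `configuredGroups` map, and the `refresh` helper are hypothetical and only illustrate the idea; this is not the project's actual fix.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch: recompute the affected groups whenever a worker registers.
class RefreshOnServerAddedSketch {
    // Registered workers, keyed by address (stand-in for workerMapping).
    private final Map<String, String> workerMapping = new HashMap<>();
    // Assumption: the configured addresses of each group are kept, active or not.
    private final Map<String, List<String>> configuredGroups = new HashMap<>();
    // Group name -> active worker addresses (stand-in for dbWorkerGroupMapping).
    private final Map<String, List<String>> dbWorkerGroupMapping = new HashMap<>();

    synchronized void onWorkerGroupChange(String groupName, List<String> configuredAddresses) {
        configuredGroups.put(groupName, new ArrayList<>(configuredAddresses));
        refresh(groupName);
    }

    synchronized void onServerAdded(String address) {
        workerMapping.put(address, address);
        // Refresh every group that lists the newly registered address.
        for (Map.Entry<String, List<String>> entry : configuredGroups.entrySet()) {
            if (entry.getValue().contains(address)) {
                refresh(entry.getKey());
            }
        }
    }

    // Rebuilds the active-worker list of one group from its configured addresses.
    private void refresh(String groupName) {
        List<String> activeWorkers = configuredGroups
                .getOrDefault(groupName, Collections.emptyList())
                .stream()
                .map(workerMapping::get)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
        dbWorkerGroupMapping.put(groupName, activeWorkers);
    }

    synchronized List<String> getWorkers(String groupName) {
        return dbWorkerGroupMapping.getOrDefault(groupName, Collections.emptyList());
    }

    public static void main(String[] args) {
        RefreshOnServerAddedSketch cluster = new RefreshOnServerAddedSketch();
        cluster.onWorkerGroupChange("group-a", List.of("10.0.0.1:1234"));
        cluster.onServerAdded("10.0.0.1:1234");
        // The late worker is now visible: prints [10.0.0.1:1234].
        System.out.println(cluster.getWorkers("group-a"));
    }
}
```

Keeping the filter means the group mapping still lists only live workers; the difference is that it is recomputed when a worker registers instead of only when the group is edited.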

How to reproduce

  1. Set a worker group in the UI.
  2. Stop the masters and workers.
  3. Start the master and wait for startup to finish.
  4. Start the worker.
  5. Start a task; the task cannot find the worker.

Anything else

No response

Version

dev

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels

backend, bug
