Skip to content

additional metric names should not include objective metric name #1873

@henrysecond1

Description

@henrysecond1

/kind bug

What steps did you take and what happened:

Deploy an experiment which spec.objective.additionalMetricNames contains spec.objective.objectiveMetricName.

The experiment is created, but you can see nothing on the experiment UI, like below.

  • You can see the log GetLastConditionType failed: Experiment doesn't have any condition in the Katib UI pod.
  • Also in katib controller, you can see the log "admission webhook \"validator.experiment.katib.kubeflow.org\" denied the request: only spec.parallelTrialCount, spec.maxTrialCount and spec.maxFailedTrialCount are editable",.

After some investigation, we found that the experiment is created, but failed to update when the additionalMetricNames contains objectiveMetricName.

Detailed investigation below

This is because of the behavior of the mutating & validation webhook.

If additionalMetricNames contains objectiveMetricName, mutating webhook tries to add objectiveMetricName to metricStrategies like below.

So after the mutation, metric strategies look like below.

    metricStrategies:
    - name: Validation-accuracy
      value: max
    - name: Train-accuracy
      value: max
    - name: Validation-accuracy
      value: max

Another mutation happens when the controller update finalizer

However, it is not allowed to update metricsStrategies when oldInst is not nil

So the updating experiment is failed and status of experiment will be never updated

What did you expect to happen:

The experiment should not be created when additionalMetricNames contains objectiveMetricName.

Environment:

  • Katib version (check the Katib controller image version): v0.13

Impacted by this bug? Give it a 👍 We prioritize the issues with the most 👍

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions