Be more lenient with feature type when determining class labels #1311

@PGijsbers

For a supervised classification task, task.class_labels is determined automatically by this code:

    for feature in self.features.values():
        if (feature.name == target_name) and (feature.data_type == "nominal"):
            return feature.nominal_values

Dataset creators are not always meticulous, and the target's feature type may be listed as string instead of nominal. The condition above then never matches, so task.class_labels will be None.
A simple work-around would be to add a case for feature.data_type == 'string' that fetches the unique values from the column. It might be worth encouraging users to fix the feature type of the dataset, but unfortunately the only ways to do that are 1) being the dataset owner or 2) creating an entirely new version of the dataset (which then also requires a new task).
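
A minimal sketch of that work-around, not the library's actual implementation. The `Feature` class below is a hypothetical stand-in for the real feature object, the function name is made up for illustration, and `column` is assumed to hold the target column's values already loaded into memory:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Feature:
    # Hypothetical stand-in for the library's feature object.
    name: str
    data_type: str
    nominal_values: Optional[list] = None


def lenient_class_labels(
    features: Iterable[Feature],
    target_name: str,
    column: Optional[Iterable] = None,
) -> Optional[list]:
    """Return class labels for the target, tolerating a 'string' target."""
    for feature in features:
        if feature.name != target_name:
            continue
        if feature.data_type == "nominal":
            # Existing behaviour: trust the declared nominal values.
            return feature.nominal_values
        if feature.data_type == "string" and column is not None:
            # Fallback: use the distinct non-missing values found in the data.
            return sorted({value for value in column if value is not None})
    return None
```

One caveat of this fallback: the labels are derived from the data actually present, so a class that is declared but never observed would be missing, and the ordering (sorted here) is a choice rather than something the dataset specifies.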

We could emit a warning in that case, but this should probably be fixed at task creation instead: the target should be rejected as invalid for a classification task if its feature type is string rather than nominal.
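
Both options could be sketched as a single check, shown here only as an illustration. The function name and the `strict` flag are hypothetical and do not exist in the library; `strict=True` models rejecting the target at task creation, while `strict=False` models the warning:

```python
import warnings


def check_classification_target(target_name: str, data_type: str, strict: bool = True) -> None:
    """Validate that a classification target is nominal (hypothetical helper)."""
    if data_type == "nominal":
        return
    message = (
        f"Target '{target_name}' has feature type '{data_type}'; "
        "a classification task expects a nominal target."
    )
    if strict or data_type != "string":
        # Task-creation behaviour: refuse any non-nominal target.
        raise ValueError(message)
    # Lenient behaviour: accept a string target but tell the user about it.
    warnings.warn(message, UserWarning)
```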
