Be more lenient with feature type when determining class labels

For a supervised classification task, `task.class_labels` is determined automatically here:

https://github.com/openml/openml-python/blob/326bf0b877696cbb1004a173b0b2fe0e09557e24/openml/datasets/dataset.py#L911-L913

Sometimes datasets are created without much care, and the target's feature type is listed as `string` instead of `nominal`, which means that `task.class_labels` will be `None`.
A simple workaround would be to add a case for `feature.data_type == 'string'` that fetches the unique values from the column. It might be worth encouraging users to fix the feature type of the dataset, but unfortunately the only ways to do that are 1) being the dataset owner or 2) creating an entirely new version of the dataset (which then also requires a new task).
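
A rough sketch of what that workaround could look like, not the actual openml-python implementation. The `Feature` dataclass is a hypothetical stand-in for the library's feature object, and `lenient_class_labels` and its `target_column` parameter are names made up for illustration; the real method would have to load the target column from the dataset itself.

```python
import warnings
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Feature:
    """Hypothetical stand-in for an OpenML dataset feature."""

    name: str
    data_type: str
    nominal_values: Optional[List[str]] = None


def lenient_class_labels(
    features: Dict[int, Feature],
    target_name: str,
    target_column: Optional[Iterable[object]] = None,
) -> Optional[List[str]]:
    """Return class labels, falling back to unique values for `string` targets."""
    for feature in features.values():
        if feature.name != target_name:
            continue
        if feature.data_type == "nominal":
            # Current behaviour: trust the declared nominal values.
            return feature.nominal_values
        if feature.data_type == "string" and target_column is not None:
            # Assumed fallback: derive labels from the data and warn about it.
            warnings.warn(
                f"Target '{target_name}' has type 'string', not 'nominal'; "
                "class labels were inferred from the column values.",
                UserWarning,
            )
            return sorted({str(v) for v in target_column if v is not None})
    return None
```

Sorting the inferred labels keeps the result deterministic, but the order may differ from what a properly declared `nominal` feature would list.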

We should maybe consider emitting a warning, but honestly this should probably be fixed at task creation (i.e., reject the target as invalid for a classification task if its feature type is `string` rather than `nominal`).
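
For illustration only, a check at task creation might look like the sketch below. `check_classification_target` is a hypothetical helper, not an existing openml-python or server-side function, and it assumes the target's feature type is available as a plain string.

```python
def check_classification_target(target_name: str, data_type: str) -> None:
    """Raise if the target feature cannot be used for a classification task."""
    if data_type != "nominal":
        raise ValueError(
            f"Target '{target_name}' has feature type '{data_type}'; "
            "a classification task requires a 'nominal' target."
        )
```

Rejecting the target this early would keep `task.class_labels` from silently becoming `None` later on.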

Be more lenient with feature type when determining class labels #1311

    for feature in self.features.values():
        if (feature.name == target_name) and (feature.data_type == "nominal"):
            return feature.nominal_values
