@ArlindKadra
Member

Issue #431

@ArlindKadra ArlindKadra requested a review from mfeurer May 15, 2018 10:09
Collaborator

@mfeurer mfeurer left a comment

Is there a reason not to simply remove all the lines you touched?

np.array(repetitions[repetition][fold][sample][0], dtype=np.int32),
np.array(repetitions[repetition][fold][sample][1], dtype=np.int32))

'''
Member

PyCharm usually wants double quotes, right?

@ArlindKadra
Member Author

@mfeurer since you mentioned removing or fixing split pickling, I only commented out the code as a simple solution. I can remove it completely if you are OK with that. @janvanrijn we are using double quotes for docstrings.

@mfeurer
Collaborator

mfeurer commented May 16, 2018

I think removing it is fine. Reading those files isn't as costly as reading in the data.

@ArlindKadra ArlindKadra requested a review from mfeurer May 16, 2018 12:51
Collaborator

@mfeurer mfeurer left a comment

Could you please remove the if cache statements and the cache=True? I think they're not necessary and complicate the code quite a bit.

@ArlindKadra
Member Author

@mfeurer sure, but then we will not have a way to disable loading from the cache.

@mfeurer
Collaborator

mfeurer commented May 16, 2018

I don't think we have this for anything else. As a split should never change, I don't see a reason to keep this functionality.

@ArlindKadra
Member Author

@mfeurer while I was working in the Java repo, there was a case where the server was having problems and returned a bad value for a certain request. That value was cached, and a procedure we were running kept failing because it ran on the cached object. In that case the flag was necessary (apart from the fact that the files could be removed locally). However, given that we validate the object, this should not happen here.
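
To illustrate the point about validation, here is a minimal sketch of a cache loader that discards a bad cached value instead of reusing it. It is not the code from this PR: `load_split`, `is_valid_split`, and the `download` callback are hypothetical names, and the validation rule (a non-empty dict) is an assumption made only for the example.

```python
import os
import pickle


def is_valid_split(obj):
    # Hypothetical validation rule: a split is a non-empty dict.
    return isinstance(obj, dict) and len(obj) > 0


def load_split(cache_path, download):
    """Return the split from the cache if it validates, else fetch it again."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            cached = pickle.load(fh)
        if is_valid_split(cached):
            return cached
        # A bad cached value is removed rather than reused.
        os.remove(cache_path)

    fresh = download()
    if not is_valid_split(fresh):
        # Never write an invalid object to the cache.
        raise ValueError("server returned an invalid split")
    with open(cache_path, "wb") as fh:
        pickle.dump(fresh, fh)
    return fresh
```

With a check like this, a bad server response is never written to disk, so a flag to bypass the cache becomes less important.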

@mfeurer
Collaborator

mfeurer commented May 17, 2018

Yes, we have this piece of code for downloading the task:

        try:
            task = _get_task_description(task_id)
            dataset = get_dataset(task.dataset_id)
            class_labels = dataset.retrieve_class_labels(task.target_name)
            task.class_labels = class_labels
            task.download_split()
        except Exception as e:
            openml.utils._remove_cache_dir_for_id(TASKS_CACHE_DIR_NAME, tid_cache_dir)
            raise e

One last thing: could you please remove the try/except around the pickle file loading? We don't react to any exception, so the try/except is actually useless.
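
As a rough illustration of why such a handler is redundant, the sketch below compares a loader whose `except` clause only re-raises with one that has no handler at all. Both function names are hypothetical and the bodies are assumptions, not the code under review.

```python
import pickle


def load_with_reraise(path):
    # Hypothetical "before": the handler only re-raises, so it adds nothing.
    try:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    except Exception as e:
        raise e


def load_plain(path):
    # Hypothetical "after": same return value, same exception on failure.
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Callers see the same exception type from either version, for example `FileNotFoundError` for a missing file, so dropping the wrapper does not change behaviour.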

@mfeurer mfeurer merged commit 805059d into develop May 18, 2018
@mfeurer mfeurer deleted the fix431 branch May 18, 2018 07:15
4 participants