
Conversation

@Mirkazemi
Contributor

Reference Issue

Public delete methods #1028

What does this PR implement/fix? Explain your changes.

The new functions allow the user to delete runs, tasks, flows, and datasets via the API:
openml.runs.delete_run()
openml.tasks.delete_task()
openml.flows.delete_flow()
openml.datasets.delete_dataset()
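
For example, a minimal usage sketch (the IDs below are placeholders, and deletion only succeeds for entities owned by the authenticated user):

import openml

# Placeholder IDs, for illustration only; each call deletes the entity on the server.
openml.runs.delete_run(10593300)
openml.tasks.delete_task(12345)
openml.flows.delete_flow(54321)
openml.datasets.delete_dataset(4242)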

How should this PR be tested?

Unit tests are added to:
tests/test_runs/test_run_functions.py
tests/test_tasks/test_task_functions.py
tests/test_flows/test_flow_functions.py
tests/test_datasets/test_dataset_functions.py

Collaborator

@PGijsbers PGijsbers left a comment


@mfeurer Do we want additional error wrapping?

E.g., deleting a task that still has runs gives:

>>> openml.tasks.delete_task(4263)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\repositories\openml-python\openml\tasks\functions.py", line 537, in delete_task
    return openml.utils._delete_entity("task", task_id)
  File "E:\repositories\openml-python\openml\utils.py", line 175, in _delete_entity
    result_xml = openml._api_calls._perform_api_call(url_suffix, "delete")
  File "E:\repositories\openml-python\openml\_api_calls.py", line 61, in _perform_api_call
    response = __read_url(url, request_method, data)
  File "E:\repositories\openml-python\openml\_api_calls.py", line 160, in __read_url
    return _send_request(
  File "E:\repositories\openml-python\openml\_api_calls.py", line 192, in _send_request
    __check_response(response=response, url=url, file_elements=files)
  File "E:\repositories\openml-python\openml\_api_calls.py", line 230, in __check_response
    raise __parse_server_exception(response, url, file_elements=file_elements)
openml.exceptions.OpenMLServerException: https://www.openml.org/api/v1/xml/task/4263 returned code 454: Task is executed in some runs. Delete these first - None

Deleting something not owned by you:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\repositories\openml-python\openml\runs\functions.py", line 1208, in delete_run
    return openml.utils._delete_entity("run", run_id)
  File "E:\repositories\openml-python\openml\utils.py", line 175, in _delete_entity
    result_xml = openml._api_calls._perform_api_call(url_suffix, "delete")
  File "E:\repositories\openml-python\openml\_api_calls.py", line 61, in _perform_api_call
    response = __read_url(url, request_method, data)
  File "E:\repositories\openml-python\openml\_api_calls.py", line 160, in __read_url
    return _send_request(
  File "E:\repositories\openml-python\openml\_api_calls.py", line 192, in _send_request
    __check_response(response=response, url=url, file_elements=files)
  File "E:\repositories\openml-python\openml\_api_calls.py", line 230, in __check_response
    raise __parse_server_exception(response, url, file_elements=file_elements)
openml.exceptions.OpenMLServerException: https://www.openml.org/api/v1/xml/run/10559886 returned code 393: Run is not owned by you - None

The messages are pretty clear, though the stack trace is perhaps bigger than an end-user expects. We could also use e.g. PermissionError when the user tries to delete entities owned by someone else.

@Mirkazemi Please include tests for common errors (deleting something that is not yours, deleting an entity that has other entities attached, e.g. a task with runs). I suggest you hold off on this until Matthias has chimed in on whether or not the exceptions should be wrapped.
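
For illustration, one possible shape of that wrapping (a sketch only; the helper name is made up, and error code 393 is taken from the traceback above):

import openml
from openml.exceptions import OpenMLServerException

# Hypothetical helper, not part of this PR: re-raise the server's
# "not owned by you" error (code 393 above) as a PermissionError.
def delete_run_or_raise(run_id: int) -> bool:
    try:
        return openml.runs.delete_run(run_id)
    except OpenMLServerException as e:
        if e.code == 393:
            raise PermissionError(f"Run {run_id} is not owned by you.") from e
        raise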


run = openml.runs.run_model_on_task(model=clf, task=task, seed=rs)
run.publish()
TestBase._mark_entity_for_removal("run", run.run_id)
Collaborator


I suppose we want to mark the entities in other tests for removal as well, even though the tests should remove them themselves. That way, if we introduce an error in the delete functions, the entities will still be cleaned up from the test server.

@mfeurer
Collaborator

mfeurer commented Apr 21, 2021

I agree, we could add two new exceptions for the cases you mentioned above, to make it easier for the user to understand what's going on and to parse the exceptions automatically.
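
For example, something along these lines (the class names are placeholders, not necessarily what ends up in the PR):

from openml.exceptions import OpenMLServerException

# Placeholder classes; the exceptions eventually added may be named differently.
class OpenMLNotAuthorizedError(OpenMLServerException):
    """The authenticated user is not allowed to perform the requested operation."""

class OpenMLDependentEntitiesError(OpenMLServerException):
    """The entity cannot be deleted because other entities still depend on it."""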

@PGijsbers
Collaborator

Hi @Mirkazemi! I hope you are well :) If you're planning to continue working on this PR please let us know.

@PGijsbers PGijsbers self-assigned this Feb 23, 2023
@PGijsbers
Collaborator

@mfeurer Can you have a quick look at the progress? In particular at the testing against cached xml responses from the server. See for example the flow tests, which feature both the integration test that Mirkazemi had added and, below it, multiple new tests against cached responses.
The downsides of the integration tests are plenty:

  • they can be flaky as they depend on server stability
  • they are slower (by nature of having to communicate with the server)
  • they require either extensive setup to stay independent from other tests (e.g., creating both a new task and a run to test that a task cannot be deleted because of associated runs), or they create dependencies between tests (e.g., trying to delete already existing tasks or tasks generated by other tests, which is both hard to orchestrate and, in case of server bugs, might even result in unintended loss of data).

With the new server implementation, the server tests themselves will be responsible for ensuring the right behaviour for provided input. However, this will take a while to roll out (and for openml-python to adopt), so there is also a pro to actually doing integration tests (the openml-python tests also serve as server tests for now). My questions are thus:

  1. To what extent do you think we should keep/add integration tests (if any)?
  2. What do you think about the mocked xml response tests, and how they are currently structured? (See the sketch below.)
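
For reference, a minimal sketch of such a test (illustrative only: instead of replaying a cached XML body it raises the server exception directly, and the exact constructor arguments of OpenMLServerException may differ):

from unittest import mock

import pytest

import openml
from openml.exceptions import OpenMLServerException

# Illustrative sketch, not one of the PR's tests: the low-level API call is
# patched to raise the "task has runs" error (code 454 from the traceback above).
def test_delete_task_with_runs_raises():
    error = OpenMLServerException(
        message="Task is executed in some runs. Delete these first", code=454
    )
    with mock.patch("openml._api_calls._perform_api_call", side_effect=error):
        with pytest.raises(OpenMLServerException, match="Delete these first"):
            openml.tasks.delete_task(4263)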

Collaborator

@mfeurer mfeurer left a comment


A few minor things, but besides that, this PR looks good to me.

@PGijsbers
Collaborator

Thanks! I'll process the feedback later and add similar tests for the other entities. I'll ping you when I'm done :)

@codecov-commenter

Codecov Report

Patch coverage: 97.61% and project coverage change: +0.07% 🎉

Comparison is base (24cbc5e) 85.16% compared to head (54cfcda) 85.24%.

❗ Current head 54cfcda differs from pull request most recent head d725c1a. Consider uploading reports for the commit d725c1a to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1060      +/-   ##
===========================================
+ Coverage    85.16%   85.24%   +0.07%     
===========================================
  Files           38       38              
  Lines         4981     5008      +27     
===========================================
+ Hits          4242     4269      +27     
  Misses         739      739              
Impacted Files                   Coverage Δ
openml/datasets/__init__.py      100.00% <ø> (ø)
openml/runs/__init__.py          100.00% <ø> (ø)
openml/tasks/__init__.py         100.00% <ø> (ø)
openml/utils.py                  91.25% <93.33%> (+0.03%) ⬆️
openml/_api_calls.py             86.44% <100.00%> (-0.08%) ⬇️
openml/datasets/functions.py     90.16% <100.00%> (+0.05%) ⬆️
openml/exceptions.py             96.66% <100.00%> (-0.11%) ⬇️
openml/flows/__init__.py         100.00% <100.00%> (ø)
openml/flows/functions.py        84.74% <100.00%> (+0.17%) ⬆️
openml/runs/functions.py         84.26% <100.00%> (+0.31%) ⬆️
... and 2 more


@PGijsbers PGijsbers merged commit 3c00d7b into openml:develop Mar 21, 2023
@PGijsbers
Collaborator

Thanks @Mirkazemi for starting this PR 🎉
