Comparing changes

base repository: openml/openml-python
base: v0.14.2
head repository: openml/openml-python
compare: v0.15.0
  • 14 commits
  • 27 files changed
  • 4 contributors

Commits on Jan 22, 2024

  1. [pre-commit.ci] pre-commit autoupdate (#1325)

    updates:
    - [github.com/astral-sh/ruff-pre-commit: v0.1.13 → v0.1.14](astral-sh/ruff-pre-commit@v0.1.13...v0.1.14)
    
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    pre-commit-ci[bot] authored Jan 22, 2024
    Commit 90380d4

Commits on May 15, 2024

  1. read file in read mode (#1338)

    * read file in read mode
    
    * cast parameters to expected types
    
    * Following PGijsbers proposal to ensure that avoid_duplicate_runs is a boolean after reading it from config_file
    
    * Add a test, move parsing of avoid_duplicate_runs
    
    ---------
    
    Co-authored-by: PGijsbers
    BrunoBelucci and PGijsbers authored May 15, 2024
    Commit 923b49d
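
The boolean handling described in #1338 can be sketched as follows. This is a minimal illustration, not the library's actual code: the `parse_bool` helper, the `FAKE_SECTION` section name, and the inline config text are assumptions made for the example. Only the `avoid_duplicate_runs` key comes from the commit message.

```python
import configparser

# Hypothetical config contents; a real config file would be read from disk.
CONFIG_TEXT = """
[FAKE_SECTION]
avoid_duplicate_runs = False
"""

_TRUE = {"true", "1", "yes", "on"}
_FALSE = {"false", "0", "no", "off"}


def parse_bool(value):
    """Convert a config value to a real bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError(f"Cannot interpret {value!r} as a boolean")


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Values read from a config file are always strings.
raw = parser["FAKE_SECTION"]["avoid_duplicate_runs"]
avoid_duplicate_runs = parse_bool(raw)
```

The explicit cast matters because any non-empty string is truthy in Python, so `bool("False")` evaluates to `True`.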

Commits on Jul 5, 2024

  1. Fix/sklearn test compatibility (#1340)

    * Update 'sparse' parameter for OHE for sklearn >= 1.4
    
    * Add compatibility or skips for sklearn >= 1.4
    
    * Change 'auto' to 'sqrt' for sklearn>1.3 as 'auto' is deprecated
    
    * Skip flaky test
    
    It is unclear how a condition where the test is supposed to pass
    is created. Even after running the test suite 2-3 times, it does
    not yet seem to pass.
    
    * Fix typo
    
    * Ignore description comparison for newer scikit-learn
    
    There are some minor changes to the docstrings. I do not know
    that it is useful to keep testing it this way, so for now I will
    disable the test on newer versions.
    
    * Adjust for scikit-learn 1.3
    
    The loss has been renamed.
    The performance of the model also seems to have changed slightly
    for the same seed. So I decided to compare with the lower fidelity
    that was already used on Windows systems.
    
    * Remove timeout and reruns to better investigate CI failures
    
    * Fix typo in parameter name
    
    * Add jobs for more recent scikit-learns
    
    * Expand the matrix with all scikit-learn 1.x versions
    
    * Fix for numpy2.0 compatibility (#1341)
    
    Numpy2.0 cleaned up their namespace.
    
    * Rewrite matrix and update numpy compatibility
    
    * Move comment in-line
    
    * Stringify name of new step to see if that prevented the action
    
    * Fix unspecified os for included jobs
    
    * Fix typo in version pinning for numpy
    
    * Fix version specification for sklearn skips
    
    * Output final list of installed packages for debugging purposes
    
    * Cap scipy version for older versions of scikit-learn
    
    There is a breaking change to the way 'mode' works, that
    breaks scikit-learn internals.
    
    * Update parameter base_estimator to estimator for sklearn>=1.4
    
    * Account for changes to sklearn interface in 1.4 and 1.5
    
    * Non-strict reinstantiation requires different scikit-learn version
    
    * Parameters were already changed in 1.4
    
    * Fix race condition (I think)
    
    It seems to me that run.evaluations is set only when the run is
    fetched. Whether it has evaluations depends on server state.
    So if the server has resolved the traces between the initial
    fetch and the trace-check, you could be checking
    len(run.evaluations) where evaluations is None.
    
    * Use latest patch version of each minor release
    
    * Convert numpy types back to builtin types
    
    Scikit-learn or numpy changed the typing of the parameters
    (seen in a masked array, not sure if also outside of that).
    Convert these values back to Python builtins.
    
    * Specify versions with * instead to allow for specific patches
    
    * Flow_exists does not return None but False if the flow does not exist
    
    * Update new version definitions also installation step
    
    * Fix bug introduced in refactoring for np.generic support
    
    We don't want to serialize as the value np.nan, we want to
    include the nan directly. It is an indication that the
    parameter was left unset.
    
    * Add back the single-test timeout of 600s
    
    * [skip ci] Add note to changelog
    
    * Check that evaluations are present with None-check instead
    
    The default behavior if no evaluation is present is for it
    to be None. So it makes sense to check for that instead.
    As far as I can tell, run.evaluations should always contain
    some items if it is not None. But I added an assert just in
    case.
    
    * Remove timeouts again
    
    I suspect they "crash" workers. This of course introduces the risk of hanging processes... But I cannot reproduce the issue locally.
    PGijsbers authored Jul 5, 2024
    Commit 532be7b
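
The None-check for `run.evaluations` mentioned in #1340 could look roughly like the sketch below. `FakeRun` and `has_evaluations` are hypothetical stand-ins for illustration; the real run class and test code may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FakeRun:
    """Hypothetical stand-in for a fetched run, not the real run class."""

    # Assumed to stay None until the server has processed the run.
    evaluations: Optional[Dict[str, float]] = None


def has_evaluations(run: FakeRun) -> bool:
    """Check for evaluations without calling len() on None."""
    if run.evaluations is None:
        return False
    # Per the commit message, a non-None value is expected to be non-empty.
    assert len(run.evaluations) > 0
    return True
```

Checking `is None` first avoids the `TypeError` that `len(None)` would raise when the server state changes between the fetch and the check.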

Commits on Jul 10, 2024

  1. Add HTTP headers to all requests (#1342)

    * Add HTTP headers to all requests
    
    This allows us to better understand the traffic we see to our API.
    It is not identifiable to a person.
    
    * Update unit test to pass even with user-agent in header
    PGijsbers authored Jul 10, 2024
    Commit e4e6f50
  2. Commit de983ac
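
As a rough illustration of the idea in #1342 (attaching an identifying but non-personal header to every request), the sketch below builds a request with a `User-Agent` header using only the standard library. The client name, version string, and header format are assumptions for the example and are not taken from openml-python.

```python
import platform
import urllib.request

# Hypothetical client identity; not the actual value sent by openml-python.
CLIENT_NAME = "example-client"
CLIENT_VERSION = "0.0.1"


def build_user_agent() -> str:
    """Describe the client software and platform, without any personal data."""
    return (
        f"{CLIENT_NAME}/{CLIENT_VERSION} "
        f"(Python {platform.python_version()}; {platform.system()})"
    )


def make_request(url: str) -> urllib.request.Request:
    """Create a request object that carries the User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": build_user_agent()})


# No network traffic happens here; the request is only constructed.
request = make_request("https://example.org/api")
```

Building the header in one place means every request carries the same value.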

Commits on Sep 16, 2024

  1. Fix/1349 (#1350)

    * Add packaging dependency
    
    * Change use of distutils to packaging
    
    * Update missed usage of distutils to packaging
    
    * Inline comparison to clear up confusion
    PGijsbers authored Sep 16, 2024
    Commit fa7e9db
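
The switch from `distutils` to `packaging` in #1350 concerns version comparison. A minimal sketch of such a comparison with `packaging.version.Version` follows; the `is_at_least` helper is a hypothetical name used only for this example.

```python
from packaging.version import Version


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare version strings numerically rather than lexicographically."""
    return Version(installed) >= Version(minimum)


# Plain string comparison gets this wrong because "1" sorts before "9".
string_result = "1.10.0" >= "1.9.0"
version_result = is_at_least("1.10.0", "1.9.0")
```

Here `string_result` is `False` while `version_result` is `True`, which is why parsed versions are preferable for compatibility checks.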
  2. Lazy arff (#1346)

    * Prefer parquet over arff, do not load arff if not needed
    
    * Only download arff if needed
    
    * Test arff file is not set when downloading parquet from prod
    PGijsbers authored Sep 16, 2024
    Commit b4d038f
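
The "prefer parquet, only download arff if needed" behavior from #1346 can be illustrated with a small sketch. The function name, file names, and callback below are hypothetical; the snippet only shows the deferred fallback and is not the library's implementation.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def resolve_data_file(
    parquet_file: Optional[Path], fetch_arff: Callable[[], Path]
) -> Path:
    """Return the parquet file if present; otherwise fetch the ARFF file on demand."""
    if parquet_file is not None and parquet_file.exists():
        return parquet_file
    # The (potentially expensive) ARFF download only happens on this path.
    return fetch_arff()


with tempfile.TemporaryDirectory() as tmp:
    parquet = Path(tmp) / "dataset.pq"
    parquet.write_bytes(b"")
    arff_requests = []

    def fetch_arff() -> Path:
        arff_requests.append("requested")
        return Path(tmp) / "dataset.arff"

    chosen = resolve_data_file(parquet, fetch_arff)
    fallback = resolve_data_file(None, fetch_arff)
```

Passing the ARFF download as a callable keeps it from running unless the parquet file is missing.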
  3. Feat/progress (#1335)

    * Add progress bar to downloading minio files
    
    * Do not redownload cached files
    
    There is now a way to force a cache clear, so always redownloading
    is not useful anymore.
    
    * Set typed values on dictionary to avoid TypeError from Config
    
    * Add regression test for parsing booleans
    PGijsbers authored Sep 16, 2024
    Commit 1d707e6
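
To illustrate the two ideas in #1335 (reporting download progress and skipping files that are already cached), here is a standard-library sketch. The `fetch_to_cache` function, its `force` flag, and the in-memory source are assumptions for the example; the actual change concerns minio downloads and a real progress bar.

```python
import io
import tempfile
from pathlib import Path
from typing import BinaryIO, Callable


def fetch_to_cache(
    source: BinaryIO,
    destination: Path,
    report: Callable[[int], None],
    chunk_size: int = 8192,
    force: bool = False,
) -> bool:
    """Copy source to destination in chunks; return False if the cached copy was reused."""
    if destination.exists() and not force:
        return False
    written = 0
    with destination.open("wb") as out:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
            report(written)  # a progress bar would be updated here
    return True


progress = []
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file.bin"
    first = fetch_to_cache(io.BytesIO(b"0123456789"), target, progress.append, chunk_size=4)
    second = fetch_to_cache(io.BytesIO(b"0123456789"), target, progress.append, chunk_size=4)
```

The second call returns early because the file already exists; passing `force=True` would correspond to an explicit cache refresh.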

Commits on Sep 22, 2024

  1. Add/1034 (#1352) dataset lazy loading default

    * Towards lazy-by-default for dataset loading
    
    * Isolate lazy behavior to pytest function outside of class
    
    * Solve concurrency issue where test would use same cache
    
    * Ensure metadata is downloaded to verify dataset is processed
    
    * Clean up to reflect new defaults and tests
    
    * Fix oversight from 1335
    
    * Download data as was 0.14 behavior
    
    * Restore test
    
    * Formatting
    
    * Test obsolete, replaced by test_get_dataset_lazy_behavior
    PGijsbers authored Sep 22, 2024
    Commit 07e9b9c
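
The lazy-by-default loading introduced in #1352 follows a common pattern: defer the expensive load until the data is first requested. The sketch below shows that pattern with a hypothetical `LazyDataset` class and loader callback. It is not the openml-python dataset class, and only the `download_data` flag name is borrowed from the library's warning text.

```python
from typing import Callable, List, Optional


class LazyDataset:
    """Hypothetical stand-in that loads its rows only when they are needed."""

    def __init__(self, loader: Callable[[], List[dict]], download_data: bool = False):
        self._loader = loader
        self._rows: Optional[List[dict]] = None
        if download_data:
            # Eager behavior: load immediately on construction.
            self._rows = loader()

    def get_data(self) -> List[dict]:
        # Lazy behavior: load on first access, then reuse the result.
        if self._rows is None:
            self._rows = self._loader()
        return self._rows


load_calls = []


def fake_loader() -> List[dict]:
    load_calls.append("load")
    return [{"sepal_length": 5.1}]


dataset = LazyDataset(fake_loader)  # nothing loaded yet
calls_after_init = len(load_calls)
rows = dataset.get_data()  # triggers the load
rows_again = dataset.get_data()  # served from memory
```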

Commits on Sep 27, 2024

  1. Make test insensitive to OrderedDict stringification (#1353)

    Sometime between Python 3.9 and 3.12 the stringification of
    ordered dicts changed from using a list of tuples to
    a dictionary.
    PGijsbers authored Sep 27, 2024
    Commit 7764ddb
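
One way to make a test insensitive to how `OrderedDict` is stringified, as #1353 aims to do, is to compare contents instead of string representations. The helper below is a hypothetical sketch of that approach, not the test code from the commit.

```python
from collections import OrderedDict
from typing import Mapping


def same_ordered_items(left: Mapping, right: Mapping) -> bool:
    """Compare keys, values, and order without relying on str() or repr()."""
    return list(left.items()) == list(right.items())


params = OrderedDict([("a", 1), ("b", 2)])
same = same_ordered_items(params, OrderedDict(a=1, b=2))
reordered = same_ordered_items(params, OrderedDict([("b", 2), ("a", 1)]))
```

Because the comparison never touches the textual form, it gives the same result regardless of which representation a given Python version prints.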

Commits on Sep 29, 2024

  1. Remove archive after it is extracted to save disk space (#1351)

    * Remove archive after it is extracted to save disk space
    
    * Leave a marker after removing archive to avoid redownload
    
    * Automatic refresh if expected marker is absent
    
    * Be consistent about syntax use for path construction
    PGijsbers authored Sep 29, 2024
    Commit a3e57bb
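
The extract, delete, and marker steps from #1351 can be sketched as below. The marker file name and function names are hypothetical choices for the example, and the real implementation may record different information or use a different archive format.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical marker name; the library may use something else.
MARKER_NAME = "archive_removed.txt"


def extract_and_remove(archive: Path, target_dir: Path) -> None:
    """Extract the archive, leave a marker, then delete the archive to save space."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
    (target_dir / MARKER_NAME).write_text(f"Extracted from {archive.name}\n")
    archive.unlink()


def needs_refresh(target_dir: Path) -> bool:
    """A missing marker means the contents must be downloaded and extracted again."""
    return not (target_dir / MARKER_NAME).exists()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive = root / "data.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("part-0.txt", "hello")
    target = root / "extracted"

    before = needs_refresh(target)
    extract_and_remove(archive, target)
    after = needs_refresh(target)
    archive_still_exists = archive.exists()
    extracted_text = (target / "part-0.txt").read_text()
```

Writing the marker before deleting the archive ensures that an interrupted run leaves either the archive or the marker behind, never neither.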
  2. Pass kwargs through task to get_dataset (#1345)

    * Pass kwargs through task to ```get_dataset```
    
    Allows users to follow the directions in the warning ```Starting from Version 0.15 `download_data`, `download_qualities`, and `download_features_meta_data` will all be ``False`` instead of ``True`` by default to enable lazy loading.```
    
    * docs: explain that ```task.get_dataset``` passes kwargs
    
    * Update openml/tasks/task.py
    
    Remove Py3.8+ feature for backwards compatibility
    
    ---------
    
    Co-authored-by: Pieter Gijsbers
    knyazer and PGijsbers authored Sep 29, 2024
    Commit d37542b
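
The keyword forwarding described in #1345 amounts to passing `**kwargs` from the task method through to the dataset function. The sketch below uses hypothetical stand-ins (`get_dataset` and `FakeTask` defined locally) so that it runs without the library. The flag names mirror those in the warning quoted above, but the function bodies are illustrative only.

```python
from typing import Any, Dict


def get_dataset(
    dataset_id: int,
    download_data: bool = False,
    download_qualities: bool = False,
    download_features_meta_data: bool = False,
) -> Dict[str, Any]:
    """Hypothetical stand-in that just records which flags it received."""
    return {
        "dataset_id": dataset_id,
        "download_data": download_data,
        "download_qualities": download_qualities,
        "download_features_meta_data": download_features_meta_data,
    }


class FakeTask:
    """Hypothetical stand-in for a task that refers to a dataset."""

    def __init__(self, dataset_id: int) -> None:
        self.dataset_id = dataset_id

    def get_dataset(self, **kwargs: Any) -> Dict[str, Any]:
        # Forward caller-supplied options unchanged to the dataset function.
        return get_dataset(self.dataset_id, **kwargs)


task = FakeTask(dataset_id=61)
lazy = task.get_dataset()
eager = task.get_dataset(download_data=True)
```

Forwarding `**kwargs` lets callers opt in to eager loading per call without the task method having to repeat every option in its own signature.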

Commits on Oct 1, 2024

  1. Change defaults for get_task to be lazy (#1354)

    * Change defaults for `get_task`
    
    * [pre-commit.ci] auto fixes from pre-commit.com hooks
    
    for more information, see https://pre-commit.ci
    
    * Fix linting errors
    
    * Add missing type annotation
    
    ---------
    
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    PGijsbers and pre-commit-ci[bot] authored Oct 1, 2024
    Commit a55a3fc

Commits on Oct 4, 2024

  1. Release/0.15.0 (#1355)

    * Expand 0.15.0 changelog with other PRs not yet added
    
    * Bump version number
    
    * Add newer Python versions since we are compatible
    
    * Revert "Add newer Python versions since we are compatible"
    
    This reverts commit 5088c80.
    
    * Add newer compatible versions of Python
    PGijsbers authored Oct 4, 2024
    Commit dea8724