Auto ML assets #25466
5a2d3b5 to
6f56cfe
Compare
|
Errors :( |
cf3d1c1 to
93ef028
Compare
|
Rebased to account for the Flask 2.2 errors fixed yesterday. |
tests/system/providers/google/cloud/automl/example_automl_dataset.py
fdb3996 to
7f0305d
Compare
|
There is a 45K-char CSV file. Is that normal? |
|
Yeah, a 45K-line .csv file is NOT something we want. A few options:
|
This .csv is needed for training an AutoML model. In order to start the training, the .csv should contain more than 1000 rows. For our test I can reduce the file to 2100 rows. @potiuk what do you think about reducing the file size? |
107f390 to
89c2f7c
Compare
@potiuk Catching attention :) I think 2100 is okayish (not the best, but certainly better than 50k). Please comment if you still think it should be stored in external storage. |
|
Can we compress it (and dynamically decompress it during the test)? Just zipping it gives 20K instead of 160K. This file is unlikely to ever change, and it is completely uninteresting to see what's in it when you review the code, so there is no particular reason to keep a text file in Git. It's not only the size that matters in this case. Keeping it as plain text has the really nasty effect that when you search for something in the source code in your IDE, you will likely find some matching words here, so keeping the file uncompressed makes it very prone to falling victim to search & replace. |
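A minimal sketch of the compress-then-decompress-in-test idea, assuming a gzip-compressed copy of the CSV is what gets committed. The helper names (`compress_csv`, `decompress_csv`, `demo`) and the file names are hypothetical and are not taken from this PR; the sample rows are synthetic stand-ins for the real dataset.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def compress_csv(src: Path, dst: Path) -> None:
    """Write a gzip-compressed copy of src to dst (done once, before committing)."""
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)


def decompress_csv(src: Path, dst: Path) -> None:
    """Restore the plain CSV from the gzip file (done at test time)."""
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)


def demo():
    """Round-trip a synthetic 2100-row CSV and report the sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "dataset.csv"          # hypothetical file name
        packed = Path(tmp) / "dataset.csv.gz"    # what would live in Git
        restored = Path(tmp) / "restored.csv"    # what the test would upload

        rows = [f"{i},some sample text for row {i},label_{i % 3}" for i in range(2100)]
        raw.write_text("\n".join(rows) + "\n")

        compress_csv(raw, packed)
        decompress_csv(packed, restored)

        return (
            raw.stat().st_size,
            packed.stat().st_size,
            raw.read_bytes() == restored.read_bytes(),
        )


if __name__ == "__main__":
    raw_size, gz_size, roundtrip_ok = demo()
    print(f"raw={raw_size} bytes, gzip={gz_size} bytes, identical={roundtrip_ok}")
```

In a system test, the decompress step could run in a setup task before the file is uploaded, so the plain-text CSV never has to be stored in the repository.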
89c2f7c to
f498f06
Compare
@potiuk I have done it |
|
Sorry for the delay - been a bit busy. No, it's not compressed - it's just bundled in a .tar now, not zipped (.tar-ing a single file kinda makes no sense). It still takes 170K instead of 20K (and this PR needs a rebase anyway). |
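To illustrate the point about .tar versus a compressed archive, here is a small sketch using Python's standard `tarfile` module. Mode `"w"` only bundles the file, while `"w:gz"` bundles and gzip-compresses it. The `archive_size` helper, the `dataset.csv` member name, and the synthetic payload are assumptions made for illustration only.

```python
import io
import tarfile


def archive_size(payload: bytes, mode: str) -> int:
    """Build an in-memory tar archive holding one file and return its size in bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        info = tarfile.TarInfo(name="dataset.csv")  # hypothetical member name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return len(buffer.getvalue())


# Synthetic stand-in for the CSV discussed in this thread.
payload = "\n".join(
    f"{i},some sample text for row {i},label_{i % 3}" for i in range(2100)
).encode()

plain_tar = archive_size(payload, "w")       # bundling only, no compression
gzipped_tar = archive_size(payload, "w:gz")  # bundling plus gzip compression

if __name__ == "__main__":
    print(f"payload={len(payload)} plain_tar={plain_tar} tar_gz={gzipped_tar}")
```

A plain tar archive adds headers and padding, so it ends up slightly larger than the file it wraps. Only a compressed mode (or gzipping the single file directly) reduces the size.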
6adb962 to
076d91a
Compare
|
conflicts need to be resolved after string normalisation |
b1dbfa0 to
3616fd2
Compare
3616fd2 to
a0c5ee8
Compare
|
Rebased to rebuild. |
cfb9f8b to
9b863c6
Compare
9b863c6 to
4c1abb2
Compare
|
Tests failing. |
c1bc806 to
0c94ca8
Compare
|
static check failures. |
This reverts commit 7f0305d80ad162ee4e17a85870e88bdad5f27b18.
|
Rebased - static checks fixed in main (the mysql python connector release was breaking mypy). |
|
@potiuk I think this PR can be merged. I can't do that because I am not the author of the PR and I don't have write access. |
I have created links and updated system tests for Auto ML operators.
Co-authored-by: Wojciech Januszek
Co-authored-by: Lukasz Wyszomirski
Co-authored-by: Maksim Yermakou
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.