Skip to content
This repository was archived by the owner on May 22, 2025. It is now read-only.

Conversation

@feluelle
Copy link
Member

Description

What is the current behavior?

Currently, we don't have any logging for native nor pandas load_file. This makes it harder for users to follow the operations.

related: #1263

What is the new behavior?

We now have log statements for native, pandas as well as fallback indication.

Does this introduce a breaking change?

No.

Checklist

  • Created tests which fail without the change (if possible)
  • Extended the README / documentation, if necessary

Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feluelle would it make sense to have a test covering this change? Approving it regardless, since the change already helps users.

@codecov
Copy link

codecov bot commented Nov 28, 2022

Codecov Report

Base: 94.91% // Head: 95.71% // Increases project coverage by +0.79% 🎉

Coverage data is based on head (da7b0d3) compared to base (0f07504).
Patch has no changes to coverable lines.

❗ Current head da7b0d3 differs from pull request most recent head 8574062. Consider uploading reports for the commit 8574062 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1312      +/-   ##
==========================================
+ Coverage   94.91%   95.71%   +0.79%     
==========================================
  Files          52       19      -33     
  Lines        2617      677    -1940     
  Branches      311       68     -243     
==========================================
- Hits         2484      648    -1836     
+ Misses         82       18      -64     
+ Partials       51       11      -40     
Impacted Files Coverage Δ
python-sdk/src/astro/databases/base.py
python-sdk/src/astro/exceptions.py
python-sdk/src/astro/files/types/__init__.py
python-sdk/src/astro/utils/table.py
python-sdk/src/astro/constants.py
python-sdk/src/astro/files/locations/base.py
python-sdk/src/astro/files/locations/__init__.py
python-sdk/src/astro/sql/operators/merge.py
python-sdk/src/astro/files/types/ndjson.py
python-sdk/src/astro/files/__init__.py
... and 61 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Collaborator

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One test failed for Airflow 2.4 and 2.2.5:

google.api_core.exceptions.BadRequest: 400 Provided Schema does not match Table astronomer-dag-authoring:first_table_schema._tmp_gcuqy9jbp3wxal8q0uqwb8jrk2a2zr6zage4t0g. Field id has changed mode from REQUIRED to NULLABLE

@pankajkoti
Copy link
Contributor

pankajkoti commented Nov 29, 2022

One test failed for Airflow 2.4 and 2.2.5:

google.api_core.exceptions.BadRequest: 400 Provided Schema does not match Table astronomer-dag-authoring:first_table_schema._tmp_gcuqy9jbp3wxal8q0uqwb8jrk2a2zr6zage4t0g. Field id has changed mode from REQUIRED to NULLABLE

I checked this and seems that it is because we explicitly specify ID is primary key in the target_table_1 & target_table_2 and hence it becomes REQUIRED. Whereas in the source_table, we directly load data from file without specifying the schema.

Screenshot 2022-11-29 at 1 38 23 PM

As a result, the schema for the target tables have ID as REQUIRED:
Screenshot 2022-11-29 at 1 40 22 PM

whereas, the source_table has ID as NULLABLE:
Screenshot 2022-11-29 at 1 40 59 PM

So perhaps should we add the columns key in the source Table to specify that ID is a primary key?

@pankajkoti pankajkoti force-pushed the feature/improve-logging-load-file branch from da7b0d3 to 8574062 Compare November 29, 2022 08:42
@pankajkoti
Copy link
Contributor

I tried doing that(specifying columns), but got the following error, so reverting my commit.

Screenshot 2022-11-29 at 2 12 42 PM

@phanikumv phanikumv merged commit 279c7d1 into main Nov 29, 2022
@phanikumv phanikumv deleted the feature/improve-logging-load-file branch November 29, 2022 10:04
kaxil pushed a commit that referenced this pull request Nov 29, 2022
#1324)

# Description
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying. -->
- Add the `NATIVE_LOAD_EXCEPTIONS` to BaseDatabase as it is being
overloaded.

<!--
Issues are required for both bug fixes and features.
Reference it using one of the following:

closes: #ISSUE
related: #ISSUE
-->
related to #1312 as change
was not correct


## What is the new behavior?
<!-- Please describe the behavior or changes that are being added by
this PR. -->

- Add the `NATIVE_LOAD_EXCEPTIONS` to BaseDatabase as it is being
overloaded.

## Does this introduce a breaking change?
No

### Checklist
- [x] Created tests which fail without the change (if possible)
- [x] Extended the README / documentation, if necessary
sunank200 pushed a commit that referenced this pull request Dec 1, 2022
# Description

## What is the current behavior?

Currently, we don't have any logging for native nor pandas _load_file_.
This makes it harder for users to follow the operations.

related: #1263

## What is the new behavior?

We now have log statements for native, pandas as well as fallback
indication.

## Does this introduce a breaking change?

No.

### Checklist
- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary

(cherry picked from commit 279c7d1)
sunank200 added a commit that referenced this pull request Dec 1, 2022
#1324)

# Description
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying. -->
- Add the `NATIVE_LOAD_EXCEPTIONS` to BaseDatabase as it is being
overloaded.

<!--
Issues are required for both bug fixes and features.
Reference it using one of the following:

closes: #ISSUE
related: #ISSUE
-->
related to #1312 as change
was not correct

## What is the new behavior?
<!-- Please describe the behavior or changes that are being added by
this PR. -->

- Add the `NATIVE_LOAD_EXCEPTIONS` to BaseDatabase as it is being
overloaded.

## Does this introduce a breaking change?
No

### Checklist
- [x] Created tests which fail without the change (if possible)
- [x] Extended the README / documentation, if necessary

(cherry picked from commit 0c64f48)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants