Collaborator

@tswast tswast commented Nov 11, 2021

deps: require pandas 0.24+ and db-dtypes for TIME/DATE extension dtypes

Review #420 first! This PR is based on changes to the system tests introduced there.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #421 🦕

@google-cla google-cla bot added the cla: yes label Nov 11, 2021
@product-auto-label product-auto-label bot added the api: bigquery label Nov 11, 2021
@tswast tswast marked this pull request as ready for review November 11, 2021 22:33
@tswast tswast requested a review from a team as a code owner November 11, 2021 22:33
Collaborator Author

tswast commented Nov 11, 2021

Need to package db-dtypes for conda in order to pass the conda session.

@plamut plamut left a comment

Found a few things worth double-checking, but the general picture looks good.

Collaborator Author

tswast commented Nov 17, 2021

nox > Running session system-3.7
nox > Creating virtual environment (virtualenv) using python3.7 in .nox/system-3-7
nox > python -m pip install --pre grpcio
nox > python -m pip install mock pytest google-cloud-testutils -c /tmpfs/src/github/python-bigquery-pandas/testing/constraints-3.7.txt
nox > python -m pip install -e .[tqdm] -c /tmpfs/src/github/python-bigquery-pandas/testing/constraints-3.7.txt
nox > py.test --quiet --junitxml=system_3.7_sponge_log.xml tests/system
.sss......F.FF.........................................................s [ 73%]
..........................                                               [100%]
=================================== FAILURES ===================================
_____ TestReadGBQIntegration.test_should_properly_handle_nullable_integers _____

self = <system.test_gbq.TestReadGBQIntegration object at 0x7f550a5fe910>
project_id = 'precise-truck-742'

    def test_should_properly_handle_nullable_integers(self, project_id):
        if PANDAS_VERSION < NULLABLE_INT_PANDAS_VERSION:
            pytest.skip(msg=NULLABLE_INT_MESSAGE)
    
        query = """SELECT * FROM
                    UNNEST([1, NULL]) AS nullable_integer
                """
        df = gbq.read_gbq(
            query,
            project_id=project_id,
            credentials=self.credentials,
            dialect="standard",
            dtypes={"nullable_integer": "Int64"},
        )
        tm.assert_frame_equal(
            df,
            DataFrame(
>               {"nullable_integer": pandas.Series([1, pandas.NA], dtype="Int64")}
            ),
        )
E       AttributeError: module 'pandas' has no attribute 'NA'

tests/system/test_gbq.py:192: AttributeError
----------------------------- Captured stderr call -----------------------------

Downloading: 0rows [00:00, ?rows/s]
Downloading: 100%|██████████| 2/2 [00:00<00:00,  5.88rows/s]
______ TestReadGBQIntegration.test_should_properly_handle_nullable_longs _______

self = <system.test_gbq.TestReadGBQIntegration object at 0x7f550a580bd0>
project_id = 'precise-truck-742'

    def test_should_properly_handle_nullable_longs(self, project_id):
        if PANDAS_VERSION < NULLABLE_INT_PANDAS_VERSION:
            pytest.skip(msg=NULLABLE_INT_MESSAGE)
    
        query = """SELECT * FROM
                    UNNEST([1 << 62, NULL]) AS nullable_long
                """
        df = gbq.read_gbq(
            query,
            project_id=project_id,
            credentials=self.credentials,
            dialect="standard",
            dtypes={"nullable_long": "Int64"},
        )
        tm.assert_frame_equal(
            df,
            DataFrame(
>               {"nullable_long": pandas.Series([1 << 62, pandas.NA], dtype="Int64")}
            ),
        )
E       AttributeError: module 'pandas' has no attribute 'NA'

tests/system/test_gbq.py:223: AttributeError
----------------------------- Captured stderr call -----------------------------

Downloading: 0rows [00:00, ?rows/s]
Downloading: 100%|██████████| 2/2 [00:00<00:00, 10.43rows/s]
_______ TestReadGBQIntegration.test_should_properly_handle_null_integers _______

self = <system.test_gbq.TestReadGBQIntegration object at 0x7f550a5c5850>
project_id = 'precise-truck-742'

    def test_should_properly_handle_null_integers(self, project_id):
        if PANDAS_VERSION < NULLABLE_INT_PANDAS_VERSION:
            pytest.skip(msg=NULLABLE_INT_MESSAGE)
    
        query = "SELECT CAST(NULL AS INT64) AS null_integer"
        df = gbq.read_gbq(
            query,
            project_id=project_id,
            credentials=self.credentials,
            dialect="standard",
            dtypes={"null_integer": "Int64"},
        )
        tm.assert_frame_equal(
>           df, DataFrame({"null_integer": pandas.Series([pandas.NA], dtype="Int64")}),
        )
E       AttributeError: module 'pandas' has no attribute 'NA'

tests/system/test_gbq.py:240: AttributeError
----------------------------- Captured stderr call -----------------------------

It's possible we need to adjust NULLABLE_INT_PANDAS_VERSION: these tests were not skipped, yet the installed pandas has no `pandas.NA`.
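A minimal sketch of the kind of version guard this failure points at. It assumes, from pandas release history rather than from this repository, that the nullable `Int64` dtype arrived in pandas 0.24 while `pandas.NA` only arrived in pandas 1.0. The helper names and threshold constants are hypothetical and are not the actual value of `NULLABLE_INT_PANDAS_VERSION` in the test suite.

```python
import re

# Assumed thresholds (hypothetical constants, not copied from the test suite):
# nullable Int64 was added in pandas 0.24, pandas.NA in pandas 1.0.
NULLABLE_INT_MIN = (0, 24)
PANDAS_NA_MIN = (1, 0)


def release_tuple(version):
    """Return (major, minor) parsed from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_nullable_int(version):
    """True if this pandas version is assumed to support the Int64 dtype."""
    return release_tuple(version) >= NULLABLE_INT_MIN


def can_use_pandas_na(version):
    """True if this pandas version is assumed to expose pandas.NA."""
    return release_tuple(version) >= PANDAS_NA_MIN
```

Under these assumptions, a pandas release between 0.24 and 1.0 passes the `Int64` check but fails the `pandas.NA` check, which matches the `AttributeError` in the log. A test that builds its expected frame with `pandas.NA` would need to skip on the stricter threshold, or build the expected values with `None` instead.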

@tswast tswast changed the title from "fix: allow strings when writing to DATE and floats when writing to NUMERIC" to "fix: to_gbq allows strings for DATE and floats for NUMERIC, require pandas 0.24+ and db-dtypes" Nov 17, 2021
@tswast tswast requested a review from plamut November 17, 2021 20:40
@tswast tswast added the do not merge label Nov 17, 2021
Collaborator Author

tswast commented Nov 17, 2021

Marking as DO NOT MERGE as a reminder to make sure

deps: require pandas 0.24+ and db-dtypes for TIME/DATE extension dtypes
  Committer: @tswast
  Source-Link: googleapis/python-bigquery-pandas@19df618d9728eef07a9d70bca6d9600dc440ac63

is included as a footer to the PR, as I'd like to see if we can use this feature: googleapis/release-please#686

Collaborator Author

tswast commented Nov 17, 2021

Per googleapis/release-please#821, the extra metadata isn't important. What matters is that the extra commits are listed as a footer to the commit message, with nothing in between.

@tswast tswast requested a review from loferris November 18, 2021 22:47
Collaborator Author

tswast commented Nov 18, 2021

Ready for (re)review.

@plamut plamut left a comment

LGTM now, thanks for the cleanup!


plamut commented Nov 20, 2021

Is the nightly test failure just flakiness? The logs say "killed".

Collaborator Author

tswast commented Nov 22, 2021

The nightly failure isn't even flaky: it always fails because the conda package installation times out (see #424).

Switching to mamba for installs might help.

@tswast tswast merged commit 2180836 into main Nov 22, 2021
@tswast tswast deleted the issue421-numeric branch November 22, 2021 15:28
@tswast tswast mentioned this pull request Dec 1, 2021
gcf-merge-on-green bot pushed a commit that referenced this pull request Jan 19, 2022
🤖 I have created a release *beep* *boop*
---


## [0.17.0](v0.16.0...v0.17.0) (2022-01-19)


### ⚠ BREAKING CHANGES

* use nullable Int64 and boolean dtypes if available (#445)

### Features

* accepts a table ID, which downloads the table without a query ([#443](#443)) ([bf0e863](bf0e863))
* use nullable Int64 and boolean dtypes if available ([#445](#445)) ([89078f8](89078f8))


### Bug Fixes

* `read_gbq` supports extreme DATETIME values such as `0001-01-01 00:00:00` ([#444](#444)) ([d120f8f](d120f8f))
* `to_gbq` allows strings for DATE and floats for NUMERIC with `api_method="load_parquet"` ([#423](#423)) ([2180836](2180836))
* allow extreme DATE values such as `datetime.date(1, 1, 1)` in `load_gbq` ([#442](#442)) ([e13abaf](e13abaf))
* avoid iteritems deprecation in pandas prerelease ([#469](#469)) ([7379cdc](7379cdc))
* use data project for destination in `to_gbq` ([#455](#455)) ([891a00c](891a00c))
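To illustrate the `to_gbq` fix for DATE and NUMERIC columns, here is a minimal standard-library sketch of the kind of coercion that lets strings stand in for dates and floats for decimals before a Parquet load. It is an assumption about the general idea, not the library's implementation. The function names `to_date` and `to_numeric` are hypothetical.

```python
import datetime
import decimal


def to_date(value):
    """Coerce an ISO 'YYYY-MM-DD' string, date, or datetime to a date."""
    if value is None:
        return None
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    return datetime.date.fromisoformat(value)


def to_numeric(value):
    """Coerce a float, int, or string to a Decimal."""
    if value is None:
        return None
    if isinstance(value, decimal.Decimal):
        return value
    # Going through str() keeps 1.1 as Decimal('1.1') rather than the
    # long binary expansion that Decimal(1.1) would produce.
    return decimal.Decimal(str(value))


dates = [to_date(v) for v in ["2021-11-11", datetime.date(2021, 11, 22), None]]
amounts = [to_numeric(v) for v in [1.1, "2.50", None]]
```

Converting values up front like this gives every element of a column the same Python type, which is what a typed columnar format generally expects.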


### Miscellaneous Chores

* release 0.17.0 ([#470](#470)) ([29ac8c3](29ac8c3))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
parthea pushed a commit to googleapis/google-cloud-python that referenced this pull request Sep 18, 2025
Labels

api: bigquery, cla: yes, do not merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ConversionError: Could not convert DataFrame to Parquet. | After upgrate to 0.16.0

3 participants