
@pastanton
Contributor

@pastanton pastanton commented Jul 15, 2022

When doing the type check in SqlToS3Operator, convert DataFrame columns of dtype object to str.
This avoids errors when converting the DataFrame to Parquet, so this simple SQL->S3 operator can be used with most queries without having to modify the SQL to avoid types that pandas doesn't like by default.
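
For illustration, the conversion described above can be sketched roughly as follows. This is a minimal standalone example, not the operator's actual code: the helper name `convert_object_columns_to_str` and the sample data are hypothetical, and it assumes only that pandas is installed.

```python
import uuid

import pandas as pd


def convert_object_columns_to_str(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every object-dtype column cast to str.

    Hypothetical helper for illustration; not the method added in this PR.
    """
    result = df.copy()
    for col in result.columns:
        # Only touch columns pandas stores as generic Python objects.
        if result[col].dtype == object:
            result[col] = result[col].astype(str)
    return result


# A column mixing UUID objects and plain strings is the kind of
# object column that may fail when written to Parquet.
df = pd.DataFrame({"id": [1, 2], "mixed": [uuid.UUID(int=1), "plain text"]})
fixed = convert_object_columns_to_str(df)
```

One caveat with this approach: depending on the pandas version, missing values in an object column may end up as literal strings such as "None" or "nan" after the cast, so downstream consumers may need to account for that.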

@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Jul 15, 2022
@boring-cyborg

boring-cyborg bot commented Jul 15, 2022

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: [email protected]
Slack: https://s.apache.org/airflow-slack

@pastanton pastanton changed the title Update sql_to_s3.py For SqlToS3Operator, change column type from object to str in dataframe Jul 15, 2022
@pastanton pastanton changed the title For SqlToS3Operator, change column type from object to str in dataframe SqlToS3Operator: change column type from object to str in dataframe Jul 15, 2022
@pastanton pastanton marked this pull request as draft July 16, 2022 21:00
@pastanton
Contributor Author

Some tests failed due to my renaming the private method. That should be fixed now.

@pastanton pastanton marked this pull request as ready for review July 16, 2022 23:30
@pastanton pastanton requested a review from potiuk July 17, 2022 00:02
Convert dataframe object columns to str, to avoid errors when converting from df to parquet.

Renamed methods to remove old name:
_fix_int_dtypes -> _fix_dtypes
test_fix_int_dtypes -> test_fix_dtypes
@pastanton
Contributor Author

Shortened character length of a comment to comply with PEP8.

@potiuk potiuk merged commit 693fe60 into apache:main Jul 18, 2022
@boring-cyborg

boring-cyborg bot commented Jul 18, 2022

Awesome work, congrats on your first merged pull request!

@potiuk
Member

potiuk commented Jul 18, 2022

🎉
