@jorisvandenbossche
Member

@jorisvandenbossche jorisvandenbossche commented Jan 9, 2024

Rationale for this change

Removing usage of np.core, as that is deprecated and will be removed in numpy 2.0.

For this specific case, we can just hardcode the list of data types instead of using a NumPy API (this list doesn't typically change).
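A minimal sketch of what hardcoding the list could look like, assuming the goal is a fixed set of dtype names to check membership against. The names `SUPPORTED_NUMPY_TYPE_NAMES` and `is_supported_numpy_type` are hypothetical and not taken from pyarrow, and the entries are only the ones visible in the diff excerpt in this conversation, so the real list may be longer.

```python
# Hypothetical constant: a hardcoded set of NumPy dtype names, replacing a
# list that was previously built at import time from a NumPy internal API.
# Only the names visible in the review excerpt are included here.
SUPPORTED_NUMPY_TYPE_NAMES = frozenset([
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
    "float16", "float32", "float64", "float128",
    "object", "bool",
])


def is_supported_numpy_type(name: str) -> bool:
    """Return True if the dtype name is in the hardcoded set."""
    return name in SUPPORTED_NUMPY_TYPE_NAMES
```

Because the set is a plain literal, it does not depend on which NumPy version is installed, at the cost of needing a manual update if new dtype names ever have to be recognized.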

@github-actions github-actions bot added the awaiting committer review label Jan 9, 2024
@github-actions

github-actions bot commented Jan 9, 2024

⚠️ GitHub issue #39533 has been automatically assigned in GitHub to PR creator.

@jorisvandenbossche
Member Author

@github-actions crossbow submit pandas

@github-actions

github-actions bot commented Jan 9, 2024

Revision: 4e403be

Submitted crossbow builds: ursacomputing/crossbow @ actions-fbef62db1c

| Task | Status |
|------|--------|
| test-conda-python-3.10-pandas-latest | GitHub Actions |
| test-conda-python-3.10-pandas-nightly | GitHub Actions |
| test-conda-python-3.11-pandas-upstream_devel | GitHub Actions |
| test-conda-python-3.8-pandas-1.0 | GitHub Actions |
| test-conda-python-3.9-pandas-latest | GitHub Actions |

['object', 'bool'])
"int8", "int16", "int32", "int64",
"uint8", "uint16", "uint32", "uint64",
"float16", "float32", "float64", "float128",
Member

Do we actually support conversion to/from float128?

Member Author

That's a good point ;) No, I don't think we even have float128 in the arrow spec, right? I just hardcoded the current dynamic content, but that can indeed be removed.

Member Author

On second thought: those types are not arrow types, but the numpy dtype stored in the pandas metadata, i.e. the original dtype of the pandas DataFrame column that was converted to a pyarrow.Table.

So in theory you can have a pandas DataFrame with a float128 column and get that dtype in the metadata, in which case having it included in the list above is fine. In practice this is currently not possible either: we haven't implemented the conversion of numpy float128 to a pyarrow float array, so the conversion of such a DataFrame currently fails.
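To illustrate the distinction between Arrow types and the numpy dtype recorded in the pandas metadata, here is a small sketch using only the standard library. The `raw_metadata` string is a hand-written, heavily trimmed stand-in for the JSON that pyarrow stores under the `pandas` key of the schema metadata; the column names and the helper `numpy_types_by_column` are assumptions made for this example.

```python
import json

# Hand-written stand-in for the pandas metadata JSON (trimmed to the fields
# relevant here). Each column entry records the original numpy dtype name.
raw_metadata = json.dumps({
    "columns": [
        {"name": "a", "pandas_type": "int64", "numpy_type": "int64"},
        {"name": "b", "pandas_type": "float64", "numpy_type": "float64"},
    ]
})


def numpy_types_by_column(raw: str) -> dict:
    """Map each column name to the numpy dtype name stored in the metadata."""
    meta = json.loads(raw)
    return {col["name"]: col["numpy_type"] for col in meta["columns"]}
```

The `numpy_type` strings are what the hardcoded list is compared against, which is why a name such as float128 can appear in that list even though Arrow itself has no such type.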

@github-actions github-actions bot added the awaiting changes label and removed the awaiting committer review label Jan 9, 2024
@github-actions github-actions bot added the awaiting change review and awaiting changes labels and removed the awaiting changes and awaiting change review labels Jan 9, 2024
@jorisvandenbossche jorisvandenbossche merged commit 72ed584 into apache:main Jan 10, 2024
@jorisvandenbossche jorisvandenbossche removed the awaiting changes label Jan 10, 2024
@jorisvandenbossche jorisvandenbossche deleted the gh-39533-np-core branch January 10, 2024 08:13
raulcd pushed a commit that referenced this pull request Jan 10, 2024
* Closes: #39533

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 72ed584.

There were 5 benchmark results indicating a performance regression:

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…pache#39535)

Development

Successfully merging this pull request may close these issues.

[Python] NumPy 2.0 compat: remove usage of np.core