uros-db commented Nov 4, 2025

What changes were proposed in this pull request?

Introduce GeographyType and GeometryType to PySpark Connect. The geospatial data types themselves were already introduced in PySpark as part of: #52627.

Also, introduce classes to represent a Geography and a Geometry value in Python. The corresponding classes were already introduced on the Scala side as part of: #52804.
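
For readers unfamiliar with what a class that "represents a Geometry value" might carry, here is a minimal standalone sketch. It is not the PySpark API and is not taken from this PR: `GeoValueSketch`, `point_wkb`, and `parse_point` are hypothetical names, and the choice of Well-Known Binary (WKB) bytes plus an SRID as the payload is an assumption made for illustration.

```python
import struct
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GeoValueSketch:
    """Hypothetical stand-in for a geospatial value (NOT the PySpark class).

    Assumption: the value is carried as raw WKB bytes plus an SRID.
    """
    wkb: bytes
    srid: int


def point_wkb(x: float, y: float) -> bytes:
    # WKB layout for a 2D point: byte-order flag (1 = little-endian),
    # geometry type as uint32 (1 = Point), then x and y as float64.
    return struct.pack("<BIdd", 1, 1, x, y)


def parse_point(value: GeoValueSketch) -> Tuple[float, float]:
    # Decode the same layout back into coordinates.
    byte_order, geom_type, x, y = struct.unpack("<BIdd", value.wkb)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian WKB points")
    return x, y


# Example value: a point tagged with SRID 4326 (WGS 84).
value = GeoValueSketch(wkb=point_wkb(1.0, 2.0), srid=4326)
```

The actual constructors and accessors of the Python `Geography` and `Geometry` classes are defined in the PR diff, which should be treated as authoritative.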

Why are the changes needed?

Enabling geospatial types in Spark Connect.

Does this PR introduce any user-facing change?

Yes, GeographyType and GeometryType are now available in PySpark Connect.

How was this patch tested?

Added new Python Connect tests:

  • test_parity_geographytype
  • test_parity_geometrytype

Was this patch authored or co-authored using generative AI tooling?

No.

uros-db changed the title [DRAFT] [SPARK-54176][Geo][PYTHON] Introduce Geography and Geometry data types to PySpark Connect Nov 4, 2025
uros-db marked this pull request as ready for review November 5, 2025 07:15
@cloud-fan

The failure in RpcIntegrationSuite is unrelated, thanks, merging to master/4.1!

cloud-fan closed this in e7b90e7 Nov 5, 2025
cloud-fan pushed a commit that referenced this pull request Nov 5, 2025
…s to PySpark Connect

### What changes were proposed in this pull request?
Introduce `GeographyType` and `GeometryType` to PySpark Connect. The geospatial data types themselves were already introduced in PySpark as part of: #52627.

Also, introduce classes to represent a `Geography` and a `Geometry` value in Python. The corresponding classes were already introduced on the Scala side as part of: #52804.

### Why are the changes needed?
Enabling geospatial types in Spark Connect.

### Does this PR introduce _any_ user-facing change?
Yes, `GeographyType` and `GeometryType` are now available in PySpark Connect.

### How was this patch tested?
Added new Python Connect tests:
- `test_parity_geographytype`
- `test_parity_geometrytype`

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #52871 from uros-db/geo-spark-connect.

Authored-by: Uros Bojanic
Signed-off-by: Wenchen Fan
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025