@SemyonSinchenko
Collaborator

What changes were proposed in this pull request?

Introduced a new configuration option `spark.graphframes.useLabelsAsComponents` to allow connected components to use labels as component IDs. When the option is set to false, the component IDs are longs (`LongType`). Updated the logic in `ConnectedComponents` and added test cases for both the enabled and disabled configurations.

Why are the changes needed?

Close #620

@SemyonSinchenko
Collaborator Author

cc: @Kimahriman This PR aims to fix the problem you faced in apache/sedona#1919

I added a test for the case of string labels: with the config set to false, CC returns `LongType` IDs.
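
To make the two modes concrete, here is a toy model in plain Python. It is not GraphFrames code and does not use Spark: the function name `connected_components`, the input-order indexing of labels, and the choice of the smallest integer ID as the component representative are all assumptions made for illustration. It only mirrors the behaviour described in this PR: components are either reported as labels or left as integer IDs.

```python
from typing import Dict, List, Tuple, Union


def connected_components(
    vertices: List[str],
    edges: List[Tuple[str, str]],
    use_labels_as_components: bool,
) -> Dict[str, Union[str, int]]:
    """Toy model (hypothetical helper): map each vertex label to a component value."""
    # Assumption: string labels are indexed to integer IDs in input order.
    index = {label: i for i, label in enumerate(vertices)}
    parent = list(range(len(vertices)))

    def find(i: int) -> int:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for src, dst in edges:
        a, b = find(index[src]), find(index[dst])
        if a != b:
            # Assumption: the smaller integer ID represents the merged component.
            parent[max(a, b)] = min(a, b)

    result: Dict[str, Union[str, int]] = {}
    for label in vertices:
        root = find(index[label])
        # Flag on: report the representative's label; flag off: keep the integer ID.
        result[label] = vertices[root] if use_labels_as_components else root
    return result
```

Calling it with `["a", "b", "c", "d"]` and the edges `[("a", "b"), ("c", "d")]` yields string components when the flag is true and integer components when it is false, which is the distinction the new test checks for `LongType`.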

@Kimahriman
Contributor

LGTM, maybe just mention the components are longs when the config is false in the description?

@SemyonSinchenko
Collaborator Author

> LGTM, maybe just mention the components are longs when the config is false in the description?

Done!

@SemyonSinchenko merged commit 90fecf6 into graphframes:master Jul 3, 2025
5 checks passed
@SemyonSinchenko deleted the 620-cc-labels-config branch July 19, 2025 11:56

Development

Successfully merging this pull request may close these issues.

feat: introduce a config to not overwrite component ID from long to the label

3 participants