@goungoun
Contributor

@goungoun goungoun commented Oct 9, 2025

What changes were proposed in this pull request?

This PR improves the result formatting for variable-length and undirected motif patterns.

  1. Add informative columns: _hop, _pattern, _direction
  2. Fix the name of the first edge column for variable-length patterns
    For both the fixed-length pattern (u)-[e*1]-(v) and variable-length patterns, the first edge column should be named _e1, not e, so that it matches the naming of the other edge columns.

Case 1: undirected pattern
g.find("(u)-[]-(v)").where("u.id == 0").show()

Before:
It took me quite some time to understand why this returns two rows rather than three. With the anonymous edge pattern, the bidirectional edge between vertices 0 and 1 produces only one row.

+---------+---------+
|        u|        v|  
+---------+---------+
|{0, a, f}|{1, b, m}|
|{0, a, f}|{2, c, m}|
+---------+---------+

After:
It now returns three rows with the direction made explicit: two incoming edges and one outgoing edge.

+---------+---------+-----------+----------+
|        u|        v|   _pattern|_direction|
+---------+---------+-----------+----------+
|{0, a, f}|{1, b, m}|(u)<-[]-(v)|        in|
|{0, a, f}|{2, c, m}|(u)<-[]-(v)|        in|
|{0, a, f}|{1, b, m}|(u)-[]->(v)|       out|
+---------+---------+-----------+----------+
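
The row counts above can be reproduced with a minimal pure-Python sketch. This is not the GraphFrames implementation: the edge list is an assumption reconstructed from the result tables in this PR, and `undirected_neighbors` is a hypothetical helper used only for illustration.

```python
# Toy edge list (src, dst, relationship). Reconstructed from the tables in
# this PR, so treat it as an assumption rather than the real test graph.
EDGES = [
    (2, 0, "unknown"),
    (0, 1, "friend"),
    (1, 2, "friend"),
    (1, 0, "follow"),
    (2, 3, "follow"),
]


def undirected_neighbors(edges, u_id):
    """Return (v_id, pattern, direction) rows for every edge touching u_id."""
    rows = []
    for src, dst, _relationship in edges:
        if dst == u_id:
            # The edge points at u, so it is an incoming edge.
            rows.append((src, "(u)<-[]-(v)", "in"))
        if src == u_id:
            # The edge leaves u, so it is an outgoing edge.
            rows.append((dst, "(u)-[]->(v)", "out"))
    # Ordering chosen only to mirror the "After" table: direction, then v.
    return sorted(rows, key=lambda row: (row[2], row[0]))


rows = undirected_neighbors(EDGES, 0)
distinct_neighbors = {v_id for v_id, _pattern, _direction in rows}

print(len(distinct_neighbors))  # neighbors when direction is ignored
print(len(rows))                # rows once direction is tracked
for row in rows:
    print(row)
```

Ignoring direction leaves one row per neighbor, which is why the old output was hard to interpret; keeping the direction yields one row per matching edge.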

Case 2: undirected variable-length pattern
g.find("(v)-[e*1..3]-(u)").where("u.id == 2").show()

Before:
It is unclear why the e column is placed at the end of the DataFrame, and the result is hard to interpret.

+---------+---------------+---------+--------------+---------+--------------+---------+---------------+
|        u|            _e1|      _v1|           _e2|      _v2|           _e3|        v|              e|
+---------+---------------+---------+--------------+---------+--------------+---------+---------------+
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 2, friend}|{2, c, m}|           NULL|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 0, follow}|{0, a, f}|           NULL|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|     NULL|          NULL|{1, b, m}|           NULL|
|{2, c, m}|           NULL|     NULL|          NULL|     NULL|          NULL|{3, d, f}| {2, 3, follow}|
|{2, c, m}|           NULL|     NULL|          NULL|     NULL|          NULL|{0, a, f}|{2, 0, unknown}|
|{2, c, m}| {1, 0, follow}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 2, friend}|{1, b, m}|           NULL|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 2, friend}|{2, c, m}|           NULL|
|{2, c, m}| {0, 1, friend}|{1, b, m}|{1, 2, friend}|     NULL|          NULL|{0, a, f}|           NULL|
|{2, c, m}|           NULL|     NULL|          NULL|     NULL|          NULL|{1, b, m}| {1, 2, friend}|
+---------+---------------+---------+--------------+---------+--------------+---------+---------------+

After:
The informative columns _hop, _pattern, and _direction are added to make the results easier to understand. The rows are also ordered by _hop and _direction.

+---------+---------------+---------+--------------+---------+--------------+---------+----+--------------+----------+
|        u|            _e1|      _v1|           _e2|      _v2|           _e3|        v|_hop|      _pattern|_direction|
+---------+---------------+---------+--------------+---------+--------------+---------+----+--------------+----------+
|{2, c, m}| {1, 2, friend}|     NULL|          NULL|     NULL|          NULL|{1, b, m}|   1|(u)<-[e*1]-(v)|        in|
|{2, c, m}| {2, 3, follow}|     NULL|          NULL|     NULL|          NULL|{3, d, f}|   1|(u)-[e*1]->(v)|       out|
|{2, c, m}|{2, 0, unknown}|     NULL|          NULL|     NULL|          NULL|{0, a, f}|   1|(u)-[e*1]->(v)|       out|
|{2, c, m}| {0, 1, friend}|{1, b, m}|{1, 2, friend}|     NULL|          NULL|{0, a, f}|   2|(u)<-[e*2]-(v)|        in|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|     NULL|          NULL|{1, b, m}|   2|(u)-[e*2]->(v)|       out|
|{2, c, m}| {1, 0, follow}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 2, friend}|{1, b, m}|   3|(u)<-[e*3]-(v)|        in|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 2, friend}|{2, c, m}|   3|(u)<-[e*3]-(v)|        in|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 2, friend}|{2, c, m}|   3|(u)-[e*3]->(v)|       out|
|{2, c, m}|{2, 0, unknown}|{0, a, f}|{0, 1, friend}|{1, b, m}|{1, 0, follow}|{0, a, f}|   3|(u)-[e*3]->(v)|       out|
+---------+---------------+---------+--------------+---------+--------------+---------+----+--------------+----------+
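
As a rough illustration of how the `_pattern` label and the row ordering relate to `_hop` and `_direction`, here is a small pure-Python sketch. The helpers `motif_label` and `annotate_and_sort` are hypothetical names, not part of GraphFrames, and the label format is simply copied from the table above.

```python
def motif_label(hop, direction):
    """Build a label such as '(u)<-[e*2]-(v)' for a hop count and direction."""
    if direction == "in":
        return f"(u)<-[e*{hop}]-(v)"
    if direction == "out":
        return f"(u)-[e*{hop}]->(v)"
    raise ValueError(f"unknown direction: {direction!r}")


def annotate_and_sort(results):
    """Attach the informative columns and order rows by _hop, then _direction.

    `results` is an iterable of (hop, direction) pairs, a simplified
    stand-in for the per-hop match results.
    """
    rows = [
        {"_hop": hop, "_pattern": motif_label(hop, direction), "_direction": direction}
        for hop, direction in results
    ]
    # "in" sorts before "out", matching the order shown in the table.
    return sorted(rows, key=lambda row: (row["_hop"], row["_direction"]))


unordered = [(3, "out"), (1, "out"), (2, "in"), (1, "in")]
annotated = annotate_and_sort(unordered)
for row in annotated:
    print(row)
```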

@goungoun goungoun changed the title from "Improve result formatting for variable length pattern and undirected pattern" to "[WIP] Improve result formatting for variable length pattern and undirected pattern" Oct 9, 2025
@goungoun goungoun changed the title from "[WIP] Improve result formatting for variable length pattern and undirected pattern" to "Improve result formatting for variable length pattern and undirected pattern" Oct 9, 2025
@goungoun goungoun marked this pull request as ready for review October 9, 2025 12:20
@goungoun
Contributor Author

goungoun commented Oct 9, 2025

Now it makes sense to me. Please merge this improvement before releasing the new version.
@SemyonSinchenko and @rjurney, I am waiting for your review. :)

Collaborator

@SemyonSinchenko SemyonSinchenko left a comment

Thanks for this @goungoun

@goungoun
Contributor Author

While checking the examples in the manual, I found a bug and fixed it. The named edge pattern (u)<-[e]->(v) should be transformed to (u)-[e1]->(v);(v)-[e2]->(u). The previous expansion, (u)-[e]->(v);(v)-[e]->(u), is wrong because it reuses the name e for two different edges.
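
The intended rewrite can be sketched with a regular expression. This is only an illustration of the transformation described above, not the code in this PR; `expand_bidirectional` is a hypothetical helper name, and it assumes vertex and edge names consist of word characters.

```python
import re

# Matches a named bidirectional edge such as "(u)<-[e]->(v)".
_BIDIRECTIONAL = re.compile(r"\((\w+)\)<-\[(\w+)\]->\((\w+)\)")


def expand_bidirectional(pattern):
    """Rewrite each named bidirectional edge as two directed edges.

    The two directed edges get distinct names (e1, e2) so that each one can
    bind to a different edge of the graph.
    """
    def _rewrite(match):
        src, edge, dst = match.groups()
        return f"({src})-[{edge}1]->({dst});({dst})-[{edge}2]->({src})"

    return _BIDIRECTIONAL.sub(_rewrite, pattern)


expanded = expand_bidirectional("(u)<-[e]->(v)")
print(expanded)
```

Patterns that contain no named bidirectional edge pass through this sketch unchanged.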

@goungoun
Contributor Author

@SemyonSinchenko, I pushed the documentation changes in 04-motif-finding.md.

Collaborator

@SemyonSinchenko SemyonSinchenko left a comment

Nice! Thanks @goungoun

@SemyonSinchenko SemyonSinchenko merged commit f5de10f into graphframes:main Oct 12, 2025
5 checks passed
@goungoun
Contributor Author

Thank you, @SemyonSinchenko.
