Skip to content

fix(cypher): variable-length path re-traverses bound relationship from previous MATCH clause #4006

@robfrank

Description

@robfrank

Summary

When a relationship variable r is bound in one MATCH clause and then referenced explicitly inside a path pattern in a subsequent MATCH, variable-length path segments ([*0..1], [*], etc.) within that same path incorrectly allow re-traversal of r. This violates OpenCypher's path isomorphism rule, which applies within a single path, not within a single MATCH clause.

Steps to Reproduce

Create a simple linear chain:

CREATE (n0:Node)-[:EDGE]->(n1:Node)-[:EDGE]->(n2:Node)-[:EDGE]->(n3:Node)

Then query:

MATCH ()-[r:EDGE]-()
MATCH p = (n)-[*0..1]-()-[r]-()-[*0..1]-(m)
RETURN count(p) AS c

Expected: 32
Actual: 52

This is TCK scenario Match4 [7]:

Feature "Match4 - Match patterns with variable length relationships"
Scenario "Matching variable length patterns including a bound relationship"

Root Cause

ExpandPathStep.hasEdgeConflict() (file: engine/src/main/java/com/arcadedb/query/opencypher/executor/steps/ExpandPathStep.java) skips conflict-checking for any variable listed in previousStepVariables - a snapshot of variables bound by prior MATCH clauses:

// Skip variables bound by previous MATCH clauses or WITH: Cypher's relationship
// uniqueness only applies within the current MATCH clause
if (previousStepVariables != null && previousStepVariables.contains(prop))
    continue;

This logic is based on an incorrect rule. OpenCypher relationship uniqueness (path isomorphism) applies within a single path pattern, not within a single MATCH clause. In the query above, r is bound in MATCH 1 but is also an explicit element of path p in MATCH 2. The [*0..1] segments in p must not traverse r again regardless of which MATCH clause bound r.

Expected Behavior (per OpenCypher spec)

A relationship variable that appears explicitly in the same path pattern as a variable-length segment must be excluded from that segment's traversal, even if the variable was introduced in a different MATCH clause.

The distinction is:

  • r bound in MATCH 1, not referenced in the current path - can be reused (existing skip is correct for this case)
  • r bound in MATCH 1, explicitly part of the same path p being expanded - must be treated as a forbidden edge for the [*0..1] traversal

Suggested Fix

The planner/executor needs to distinguish between "same-path co-participants" and "independent prior bindings" when populating the exclusion set passed to the traverser. Relationship variables that appear in the same path pattern as the variable-length segment being expanded should always be excluded from that traversal, regardless of their MATCH clause of origin.

Impact

Affects queries that:

  1. Bind a relationship in one MATCH clause
  2. Reference that same relationship inside a path pattern with variable-length segments in a subsequent MATCH clause

Results are over-counted because extra paths are generated by re-traversing the bound relationship.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions