Skip to content

Comments

feat: Introduce compatibility mode & fix feature whitespace#349

Merged
acoulton merged 3 commits intoBehat:masterfrom
acoulton:feature-description-whitespace-parity
May 26, 2025
Merged

feat: Introduce compatibility mode & fix feature whitespace#349
acoulton merged 3 commits intoBehat:masterfrom
acoulton:feature-description-whitespace-parity

Conversation

@acoulton
Copy link
Contributor

@acoulton acoulton commented May 23, 2025

This is the beginning of a concept for providing a "parser mode" to allow users to opt-in to newer parser behaviour that is compatible with cucumber/gherkin but may not be 100% BC. We will in due course expose this as a Behat configuration option.

I have started by using the flag to fix #209 - subsequent PRs will then tackle the other remaining cucumber-parity issues.

Very open to feedback on both the concept and naming of things!

@codecov
Copy link

codecov bot commented May 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.82%. Comparing base (4012f26) to head (2a921a4).
Report is 6 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #349      +/-   ##
============================================
+ Coverage     96.79%   96.82%   +0.02%     
- Complexity      575      581       +6     
============================================
  Files            36       37       +1     
  Lines          1716     1730      +14     
============================================
+ Hits           1661     1675      +14     
  Misses           55       55              
Flag Coverage Δ
php8.1 96.82% <100.00%> (+0.02%) ⬆️
php8.1--with=symfony/yaml:^5.4 96.82% <100.00%> (+0.02%) ⬆️
php8.1--with=symfony/yaml:^6.4 96.82% <100.00%> (+0.02%) ⬆️
php8.2 96.82% <100.00%> (+0.02%) ⬆️
php8.3 96.82% <100.00%> (+0.02%) ⬆️
php8.4 96.82% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

{
case LEGACY = 'legacy';

case GHERKIN_32 = 'gherkin-32';
Copy link
Contributor Author

@acoulton acoulton May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tied this to a gherkin version, because we may in future want to use similar logic to support new gherkin majors.

The downside is that unless we solve all the cucumber parity issues in the first release, the behaviour in gherkin-32 mode will change in subsequent releases. That may be acceptable if we advertise to users that the gherkin-32 mode is expected to be unstable for a few releases - they can always stick with the default legacy mode.

I don't want to expose individual feature flags to end users as that will hugely complicate our ability to test / maintain the parser in different configurations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a note that behaviour will change.

'spaces_in_language.feature' => 'Whitespace not supported around language selector',
'rule_without_name_and_description.feature' => 'Rule is wrongly parsed as Description',
'incomplete_background_2.feature' => 'Background descriptions not supported',
'legacy' => [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to directly use the enum case values as the array keys, but they're not supported in constant expressions in PHP 8.1.

Comment on lines +241 to +244
$description .= $node;
if ($node !== "\n") {
// Text nodes do not end with a newline, add one. The final trailing newline is rtrimmed below.
$description .= "\n";
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little convoluted because for a feature file like:

Feature: Extra table content
  Tables are delimited by pipes on both sides.
  Anything that isn't enclosed is not part of
  the table.

  It is not recommended to use this feature, but
  it is how the implementation currently works.

The Lexer gives us the following tokens (note that the empty line is represented by "\n", slightly inconsistently to the lines with text).

"  Tables are delimited by pipes on both sides."
"  Anything that isn't enclosed is not part of"
"  the table."
"\n"
"  It is not recommended to use this feature, but"
"  it is how the implementation currently works."
"\n"

@acoulton acoulton force-pushed the feature-description-whitespace-parity branch from 94a1a6a to c5bd8b5 Compare May 23, 2025 11:38
@acoulton acoulton requested review from carlos-granados and stof May 23, 2025 11:38
Copy link
Member

@stof stof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the compatibility be exposed (through an @internal getter) in the FeatureNode, to all checking it in the TagFilter for instance when solving #336 ?


case GHERKIN_32 = 'gherkin-32';

public function shouldRemoveFeatureDescriptionPadding(): bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we mark all those methods as internal ?

To me, the only public API of the enum should be the list of cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion.

I'd thought of leaving it public as it may be useful e.g. for Behat to know whether padding was removed or not (to then know whether to add it back).

But actually I think we should try to solve any of these downstream without reference to the parsing mode (e.g. we can just look at whether there is / is not whitespace at the start of the first line).

Then it will be easier to remove these if & when they're no longer required.

Copy link
Contributor

@carlos-granados carlos-granados left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acoulton I think this is a good idea in the right direction, just added a small comment

continue;
}

if (is_string($node)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code inside this if statement only runs if we are not in legacy compatibility mode but I had to think twice to realize this. I think the logic would be clearer if we did:

if (is_string($node)) {
    if ($this->compatibilityMode->shouldRemoveFeatureDescriptionPadding()) {
        ...
        continue;
   }     
   if ($node === "\n" && $description === null) {
       ...
   }
   ....
   continue;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - I wondered about that but leaned towards making it obvious that the legacy implementation hadn't changed by avoiding any diff to those lines.

However I think you're right, it's better to be clear about the conditions. Updated.

acoulton added 3 commits May 24, 2025 21:36
We will need to vary the logic for that depending on the compatibility
mode - a custom comparator will make that easier.
The GherkinCompatibilityMode enum is a feature flag that will allow us
to change / introduce new Parser behaviour without breaking BC for
end-users.

This commit just introduces the concept by adding the (default) "legacy"
mode and refactoring the CompatibilityTest so that:

* it runs each example for all defined GherkinCompatibilityMode values
* known issues are defined separately for each parsing mode, so that we
  can continue to test the legacy behaviour alongside the newer mode(s).
The `gherkin-32` compatibility mode will eventually include all the
current known cucumber parity issues.

To begin, fix the parsing of whitespace in feature descriptions when
parsing in gherkin-32 mode.
@acoulton acoulton force-pushed the feature-description-whitespace-parity branch from c5bd8b5 to 2a921a4 Compare May 24, 2025 21:01
@acoulton
Copy link
Contributor Author

should the compatibility be exposed (through an @internal getter) in the FeatureNode, to all checking it in the TagFilter for instance when solving #336 ?

That's a good idea, it could be useful - but we might be able to solve things other ways (e.g. for that one just by looking for the @ or not at the start of the tag string).

I think I'd prefer to wait until we definitely need it, rather than adding it now?

Copy link
Contributor

@carlos-granados carlos-granados left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good for me now, thanks @acoulton

@acoulton acoulton merged commit fad95fa into Behat:master May 26, 2025
10 checks passed
@acoulton acoulton deleted the feature-description-whitespace-parity branch May 26, 2025 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature description whitespace does not match Cucumber

3 participants