feat: Introduce compatibility mode & fix feature whitespace#349
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #349 +/- ##
============================================
+ Coverage 96.79% 96.82% +0.02%
- Complexity 575 581 +6
============================================
Files 36 37 +1
Lines 1716 1730 +14
============================================
+ Hits 1661 1675 +14
Misses 55 55
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
| { | ||
| case LEGACY = 'legacy'; | ||
|
|
||
| case GHERKIN_32 = 'gherkin-32'; |
There was a problem hiding this comment.
I have tied this to a gherkin version, because we may in future want to use similar logic to support new gherkin majors.
The downside is that unless we solve all the cucumber parity issues in the first release, the behaviour in gherkin-32 mode will change in subsequent releases. That may be acceptable if we advertise to users that the gherkin-32 mode is expected to be unstable for a few releases - they can always stick with the default legacy mode.
I don't want to expose individual feature flags to end users as that will hugely complicate our ability to test / maintain the parser in different configurations.
There was a problem hiding this comment.
I've added a note that behaviour will change.
| 'spaces_in_language.feature' => 'Whitespace not supported around language selector', | ||
| 'rule_without_name_and_description.feature' => 'Rule is wrongly parsed as Description', | ||
| 'incomplete_background_2.feature' => 'Background descriptions not supported', | ||
| 'legacy' => [ |
There was a problem hiding this comment.
It would be nice to directly use the enum case values as the array keys, but they're not supported in constant expressions in PHP 8.1.
| $description .= $node; | ||
| if ($node !== "\n") { | ||
| // Text nodes do not end with a newline, add one. The final trailing newline is rtrimmed below. | ||
| $description .= "\n"; | ||
| } |
There was a problem hiding this comment.
This is a little convoluted because for a feature file like:
Feature: Extra table content
Tables are delimited by pipes on both sides.
Anything that isn't enclosed is not part of
the table.
It is not recommended to use this feature, but
it is how the implementation currently works.The Lexer gives us the following tokens (note that the empty line is represented by "\n", slightly inconsistently to the lines with text).
" Tables are delimited by pipes on both sides."
" Anything that isn't enclosed is not part of"
" the table."
"\n"
" It is not recommended to use this feature, but"
" it is how the implementation currently works."
"\n"
94a1a6a to
c5bd8b5
Compare
|
|
||
| case GHERKIN_32 = 'gherkin-32'; | ||
|
|
||
| public function shouldRemoveFeatureDescriptionPadding(): bool |
There was a problem hiding this comment.
should we mark all those methods as internal ?
To me, the only public API of the enum should be the list of cases.
There was a problem hiding this comment.
Good suggestion.
I'd thought of leaving it public as it may be useful e.g. for Behat to know whether padding was removed or not (to then know whether to add it back).
But actually I think we should try to solve any of these downstream without reference to the parsing mode (e.g. we can just look at whether there is / is not whitespace at the start of the first line).
Then it will be easier to remove these if & when they're no longer required.
carlos-granados
left a comment
There was a problem hiding this comment.
@acoulton I think this is a good idea in the right direction, just added a small comment
| continue; | ||
| } | ||
|
|
||
| if (is_string($node)) { |
There was a problem hiding this comment.
The code inside this if statement only runs if we are not in legacy compatibility mode but I had to think twice to realize this. I think the logic would be clearer if we did:
if (is_string($node)) {
if ($this->compatibilityMode->shouldRemoveFeatureDescriptionPadding()) {
...
continue;
}
if ($node === "\n" && $description === null) {
...
}
....
continue;
}
There was a problem hiding this comment.
Thanks - I wondered about that but leaned towards making it obvious that the legacy implementation hadn't changed by avoiding any diff to those lines.
However I think you're right, it's better to be clear about the conditions. Updated.
We will need to vary the logic for that depending on the compatibility mode - a custom comparator will make that easier.
The GherkinCompatibilityMode enum is a feature flag that will allow us to change / introduce new Parser behaviour without breaking BC for end-users. This commit just introduces the concept by adding the (default) "legacy" mode and refactoring the CompatibilityTest so that: * it runs each example for all defined GherkinCompatibilityMode values * known issues are defined separately for each parsing mode, so that we can continue to test the legacy behaviour alongside the newer mode(s).
The `gherkin-32` compatibility mode will eventually include all the current known cucumber parity issues. To begin, fix the parsing of whitespace in feature descriptions when parsing in gherkin-32 mode.
c5bd8b5 to
2a921a4
Compare
That's a good idea, it could be useful - but we might be able to solve things other ways (e.g. for that one just by looking for the I think I'd prefer to wait until we definitely need it, rather than adding it now? |
carlos-granados
left a comment
There was a problem hiding this comment.
Looking good for me now, thanks @acoulton
This is the beginning of a concept for providing a "parser mode" to allow users to opt-in to newer parser behaviour that is compatible with cucumber/gherkin but may not be 100% BC. We will in due course expose this as a Behat configuration option.
I have started by using the flag to fix #209 - subsequent PRs will then tackle the other remaining cucumber-parity issues.
Very open to feedback on both the concept and naming of things!