Skip to content

feat(compiler): Parse and recover on incomplete opening HTML tags#38681

Closed
ayazhafiz wants to merge 4 commits intoangular:masterfrom
ayazhafiz:i/38596-el-tag
Closed

feat(compiler): Parse and recover on incomplete opening HTML tags#38681
ayazhafiz wants to merge 4 commits intoangular:masterfrom
ayazhafiz:i/38596-el-tag

Conversation

@ayazhafiz
Copy link
Copy Markdown
Contributor

@ayazhafiz ayazhafiz commented Sep 2, 2020

Let's say we have a code like

<div<span>123</span>

Currently this gets parsed into a tree with the element tag div<span.
This has at least two downsides:

  • An incorrect diagnostic that </span> doesn't close an element is
    emitted.
  • A consumer of the parse tree using it for editor services is unable to
    provide correct completions for the opening <span> tag.

This patch attempts to fix both issues by instead parsing the code into
the same tree that would be parsed for <div></div><span>123</span>.

In particular, we do this by optimistically scanning an open tag as
usual, but if we do not notice a terminating '>', we mark the tag as
"incomplete". A parser then emits an error for the incomplete tag and
adds a synthetic (recovered) element node to the tree with the
incomplete open tag's name.

What's the downside of this? For one, a breaking change.

  1. The first breaking change is that < symbols that are ambiguously text
    or opening tags will be parsed as opening tags instead of text in
    element bodies. Take the code

    <p>a<b</p>

    Clearly we cannot have the best of both worlds, and this patch chooses
    to swap the parsing strategy to support the new feature. Of course, <
    can still be inserted as text via the &lt; entity.

Part of #38596

PR Checklist

Please check if your PR fulfills the following requirements:

PR Type

What kind of change does this PR introduce?

  • Feature

Does this PR introduce a breaking change?

  • Yes

@pullapprove pullapprove Bot requested a review from JoostK September 2, 2020 15:18
@ayazhafiz ayazhafiz added area: compiler Issues related to `ngc`, Angular's template compiler target: major This PR is targeted for the next major release labels Sep 2, 2020
@ngbot ngbot Bot modified the milestone: needsTriage Sep 2, 2020
Comment thread packages/compiler/src/ml_parser/lexer.ts Outdated
Comment thread packages/compiler/src/ml_parser/lexer.ts Outdated
Comment thread packages/compiler/src/ml_parser/lexer.ts Outdated
Comment thread packages/compiler/src/ml_parser/parser.ts Outdated
Comment thread packages/compiler/test/ml_parser/lexer_spec.ts Outdated
Comment thread packages/compiler/test/ml_parser/lexer_spec.ts Outdated
@ayazhafiz ayazhafiz force-pushed the i/38596-el-tag branch 2 times, most recently from 8668491 to e68d8c3 Compare September 4, 2020 02:48
@pullapprove pullapprove Bot requested a review from JoostK September 10, 2020 21:31
@ayazhafiz ayazhafiz requested a review from alxhub September 10, 2020 21:35
Copy link
Copy Markdown
Member

@JoostK JoostK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things. In general I think this is a high-risk change, so a global presubmit may be in order.

Comment thread packages/compiler/src/ml_parser/parser.ts Outdated
Comment thread packages/compiler/src/ml_parser/parser.ts Outdated
Comment thread packages/compiler/src/ml_parser/parser.ts Outdated
@ayazhafiz ayazhafiz requested a review from JoostK September 12, 2020 14:26
Let's say we have a code like

```html
<div<span>123</span>
```

Currently this gets parsed into a tree with the element tag `div<span`.
This has at least two downsides:

- An incorrect diagnostic that `</span>` doesn't close an element is
  emitted.
- A consumer of the parse tree using it for editor services is unable to
  provide correct completions for the opening `<span>` tag.

This patch attempts to fix both issues by instead parsing the code into
the same tree that would be parsed for `<div></div><span>123</span>`.

In particular, we do this by optimistically scanning an open tag as
usual, but if we do not notice a terminating '>', we mark the tag as
"incomplete". A parser then emits an error for the incomplete tag and
adds a synthetic (recovered) element node to the tree with the
incomplete open tag's name.

What's the downside of this? For one, a breaking change.

<ol>
<li>

The first breaking change is that `<` symbols that are ambiguously text
or opening tags will be parsed as opening tags instead of text in
element bodies. Take the code

```html
<p>a<b</p>
```

Clearly we cannot have the best of both worlds, and this patch chooses
to swap the parsing strategy to support the new feature. Of course, `<`
can still be inserted as text via the `&lt;` entity.

</li>
</ol>

Part of angular#38596
@kyliau
Copy link
Copy Markdown
Contributor

kyliau commented Sep 17, 2020

@ayazhafiz ayazhafiz removed the request for review from alxhub September 21, 2020 19:19
@ayazhafiz ayazhafiz added the action: merge The PR is ready for merge by the caretaker label Sep 21, 2020
@mhevery mhevery closed this in 6ae3b68 Sep 21, 2020
@angular-automatic-lock-bot
Copy link
Copy Markdown

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot Bot locked and limited conversation to collaborators Oct 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

action: merge The PR is ready for merge by the caretaker area: compiler Issues related to `ngc`, Angular's template compiler cla: yes target: major This PR is targeted for the next major release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants