Skip to content

feat(har): record remote IP:PORT and SSL details#6631

Merged
mxschmitt merged 2 commits intomicrosoft:masterfrom
rwoll:feature/har-ip-and-ssl
Jun 15, 2021
Merged

feat(har): record remote IP:PORT and SSL details#6631
mxschmitt merged 2 commits intomicrosoft:masterfrom
rwoll:feature/har-ip-and-ssl

Conversation

@rwoll
Copy link
Member

@rwoll rwoll commented May 18, 2021

Updates in the HAR capture Log Entries:

  • serverIPAddress: (string) This now gets populated.

  • _serverPort: (number) This now gets populated. There isn't an
    official field according to this w3 doc (and I didn't see
    a suitable field in a HAR download from Chromium DevTools).

    I used the field name from WK WebInspector HARBuilder.js.

  • _securityDetails: non-standard info that is helpful in making
    assertions about cert expiries.

This also introduces new APIs on network.Response exposing the
new info.

NB: At first these APIs were synchronous and populated with info that is
readily available at the beginning of the Response creation, but WK does
not emit IP address info until _loadingFinished, so the API's were all
turned into Promise-based approaches.

Closes #6624.

Copy link
Member Author

@rwoll rwoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @JoelEinbinder 🎉

@rwoll rwoll marked this pull request as ready for review May 19, 2021 00:44
@rwoll
Copy link
Member Author

rwoll commented May 19, 2021

@JoelEinbinder I cleaned up the types (including deleting some of the awkward code dealing with the non-standard zero values.) I also added a comment to the WebKit address parsing code with some examples from my testing.

I haven't yet figured out the appropriate changes to emit the Response at the correct time after _onLoadingFinished where the IP address details are available for WebKit so I left the signal barrier in place in the HAR code. I'll remove that once we sort out dealing Response emission timing (directly or indirectly through guarding the newly added fields behind a Promise api). If we can avoid adding more external Promises to Response, I agree that would be much nicer.

@rwoll rwoll force-pushed the feature/har-ip-and-ssl branch 2 times, most recently from 21c91c9 to 9ce3a01 Compare May 20, 2021 21:13
@rwoll
Copy link
Member Author

rwoll commented May 20, 2021

@JoelEinbinder Here's the latest pass! Promise-based APIs because of WK. See #6631 (comment) with updated description.

@rwoll rwoll force-pushed the feature/har-ip-and-ssl branch 3 times, most recently from d2d1c15 to c7f30ee Compare May 21, 2021 00:31
@mxschmitt mxschmitt added the CQ1 label May 21, 2021
@rwoll rwoll force-pushed the feature/har-ip-and-ssl branch 2 times, most recently from 38e1f04 to b0342f5 Compare May 21, 2021 07:14
@mxschmitt mxschmitt added CQ1 and removed CQ1 labels May 21, 2021
@rwoll
Copy link
Member Author

rwoll commented May 21, 2021

@JoelEinbinder: This should now pass on all the browser+platform variants. (Thanks platform-dependent WK networking! 😉 )

Let me know your feedback and I will make any necessary adjustments.

It's also worth skimming through #6695 to what additional information will be coming in next.

@mxschmitt mxschmitt added CQ1 and removed CQ1 labels May 23, 2021
gnomesysadmins pushed a commit to GNOME/glib that referenced this pull request Jun 1, 2021
This changeset exposes

* `not-valid-before`
* `not-valid-after`
* `subject-name`
* `issuer-name`

on GTlsCertificate provided by the underlying TLS Backend.

In order to make use of these changes,
see the related [glib-networking MR][glib-networking].

This change aims to help populate more of the [`Certificate`][wk-cert]
info in the WebKit Inspector Protocol on Linux.

This changeset stems from work in Microsoft Playwright to [add more info
into its HAR capture][pw] generated from the Inspector Protocol events
and will bring feature parity across WebKit platforms.

[wk-cert]: https://github.com/WebKit/WebKit/blob/8afe31a018b11741abdf9b4d5bb973d7c1d9ff05/Source/JavaScriptCore/inspector/protocol/Security.json
[pw]: microsoft/playwright#6631
[glib-networking]: https://gitlab.gnome.org/GNOME/glib-networking/-/merge_requests/156
@rwoll
Copy link
Member Author

rwoll commented Jun 2, 2021

@JoelEinbinder I took a quick pass to refactor to defer populating the information across all browser impls in _onRequestFinished. Please let me know if the general approach looks good. (i.e. save off intermediate request data in a map until we need it later in request finished.) If all looks kosher, I will clean up (i.e. ensure we're deleting out of the map not to leak/bloat mem; additionally, we likely need to do something in some of the other terminal request handlers (i.e. loading failed)—I'll add some more test cases, too.)

Unfortunately, this brings me back to why I ended up implementing this and then needing to change to the hybrid approach you saw in the last review: should include postData and should include binary postData now hang because we never fire request finished. Do you have tips to ensure we don't fail on these? (relates: #6631 (comment))

requestStart: timingPayload ? wkMillisToRoundishMillis(timingPayload.requestStart) : -1,
responseStart: timingPayload ? wkMillisToRoundishMillis(timingPayload.responseStart) : -1,
};

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of errant extra newlines that I'll clean up. They are an artifact of refactoring within this MR.

@rwoll
Copy link
Member Author

rwoll commented Jun 2, 2021

@JoelEinbinder I took a quick pass to refactor to defer populating the information across all browser impls in _onRequestFinished. Please let me know if the general approach looks good. (i.e. save off intermediate request data in a map until we need it later in request finished.) If all looks kosher, I will clean up (i.e. ensure we're deleting out of the map not to leak/bloat mem; additionally, we likely need to do something in some of the other terminal request handlers (i.e. loading failed)—I'll add some more test cases, too.)

Unfortunately, this brings me back to why I ended up implementing this and then needing to change to the hybrid approach you saw in the last review: should include postData and should include binary postData now hang because we never fire request finished. Do you have tips to ensure we don't fail on these? (relates: #6631 (comment))

I suppose one approach would be to add response._connectionDetailsFinished(serverIPAddressAndPort, securityDetails) that FF and CR can call immediately after createResponse, and WK can call this in _requestFinished(…).

@rwoll
Copy link
Member Author

rwoll commented Jun 2, 2021

Unfortunately, this brings me back to why I ended up implementing this and then needing to change to the hybrid approach you saw in the last review: should include postData and should include binary postData now hang because we never fire request finished. Do you have tips to ensure we don't fail on these? (relates: #6631 (comment))

I suppose one approach would be to add response._connectionDetailsFinished(serverIPAddressAndPort, securityDetails) that FF and CR can call immediately after createResponse, and WK can call this [before] _requestFinished(…).

Here's what that might look like: rwoll#2. @JoelEinbinder Let me know which approach you think is cleaner:

  1. Everything in loading finished modulo fixing Chromium breakage of the postData test cases: e29de4c
  2. Using callbacks like WIP: feat(har): experiment with finish callbacks rwoll/playwright#2 which are invoked next to createResponse for CR/FF and then _loadingFinished for WK).

I like doing everything in request finished for simplicity/unity, but I'm not sure how to compromise between Chrome and WebKit. I don't love rwoll#2 since—while it's simple for Chrome and FF—you can accidentally hang the HAR capture in WK if the finish callbacks aren't invoked in all terminal response paths (although I've added tests to make sure we are good here).

@rwoll
Copy link
Member Author

rwoll commented Jun 10, 2021

Updated with option 2 since fixing Chromium not firing loading finished is out of scope.

}

_serverIPAddressAndPortFinished(serverIPAddress?: IPAddressAndPort) {
this._serverIPAddressAndPortPromiseCallback(helper.filterEmpties(serverIPAddress));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need filterEmpties here and below. Everything is eventually going to be JSON.stringified into the har.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this, but we will get some empty objects for securityDetails in the HAR. (Tests updated accordingly.)

Copy link
Contributor

@mxschmitt mxschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! (don't forget to rebase)

Updates in the HAR capture Log Entries:

* `serverIPAddress`: (string) This now gets populated.
* `_serverPort`: (number) This now gets populated. There isn't an
  official field according to this [w3 doc][w3] (and I didn't see
  a suitable field in a HAR download from Chromium DevTools).

  I used the field name from [WK WebInspector HARBuilder.js][wk].
* _securityDetails: non-standard info that is helpful in making
  assertions about cert expiries.

This also introduces new APIs on `network.Response` exposing the
new info.

NB: At first these APIs were synchonous and populated with info that is
readily available at the beginning of the Response creation, but WK does
not emit IP address info until _loadingFinished, so the API's were all
turned into Promise-based approaches.

Closes microsoft#6624.

[w3]: https://www.w3.org/community/bigdata-tools/files/2017/10/HAR_Spec_TO_HAR_Vocabulary.pdf
[wk]: https://opensource.apple.com/source/WebInspectorUI/WebInspectorUI-7608.4.9.1.3/UserInterface/Controllers/HARBuilder.js.auto.html
@rwoll rwoll force-pushed the feature/har-ip-and-ssl branch 2 times, most recently from 17eca1b to 2de1397 Compare June 14, 2021 18:58
@rwoll
Copy link
Member Author

rwoll commented Jun 14, 2021

All feedback addressed! Thanks for the reviews @JoelEinbinder and @mxschmitt!

@mxschmitt mxschmitt added CQ1 and removed CQ1 labels Jun 14, 2021
@rwoll rwoll force-pushed the feature/har-ip-and-ssl branch from 2de1397 to 782de28 Compare June 14, 2021 19:45
@mxschmitt mxschmitt added CQ1 and removed CQ1 labels Jun 14, 2021
expect(port).toBe(httpsServer.PORT);
if (browserName === 'webkit' && platform === 'win32')
expect(securityDetails).toEqual({subjectName: 'puppeteer-tests', validFrom: 1550084863});
expect(securityDetails).toEqual({subjectName: 'puppeteer-tests', validFrom: 1550084863, validTo: -1});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the best approach is here…should I just mark this test as fully failing on Win32? filterEmpties used to filter out the -1's we were getting from WK.

@mxschmitt mxschmitt added CQ1 and removed CQ1 labels Jun 14, 2021
@mxschmitt mxschmitt merged commit 195eab8 into microsoft:master Jun 15, 2021
rwoll added a commit to rwoll/playwright that referenced this pull request Jun 15, 2021
This allows users to access the connection information outside of the
HAR file in realtime.

Closes microsoft#7147.
Relates microsoft#6631.
rwoll added a commit to rwoll/playwright that referenced this pull request Jun 15, 2021
This allows users to access the connection information outside of the
HAR file in realtime.

Closes microsoft#7147.
Relates microsoft#6631.
rwoll added a commit to rwoll/playwright that referenced this pull request Jun 17, 2021
This allows users to access the connection information outside of the
HAR file in realtime.

Closes microsoft#7147.
Relates microsoft#6631.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] HAR: IP Address and SSL Info

3 participants