Skip to content

Conversation

@derekblank
Copy link
Contributor

@derekblank derekblank commented May 2, 2024

Resolves Automattic/dotcom-forge#6701

Proposed Changes

Prevents local userpaths from being logged to error reporting.

  • Adds a new function sanitizeUserpath to sanitize-for-logging utilities
  • Updates user-data console.log events to remove matching userpaths from logged events

Currently, the userpath pattern matcher matches /Users/username/Library/..., which is the expected userpath on macOS. When release support for other operating systems is added, this userpath matcher can be expanded to cover the new OS userpath patterns.

Testing Instructions

To test, observe that usernames are not included in userpaths when logging user-data functions:

Sentry

These functions can be observed on any Sentry session sending these events:

Screenshot 2024-05-02 at 10 30 34 pm

Local Development

Another more straightforward way to test the changes is to observe the same events when logged while running Studio locally:
Screenshot 2024-05-02 at 10 31 28 pm

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

@derekblank derekblank self-assigned this May 2, 2024
@derekblank derekblank requested review from a team and dcalhoun May 2, 2024 12:41
// Matches expected user path on macOS
if ( typeof path === 'string' && /\/Users\/[^/]+\/Library/.test( path ) ) {
return path.replace( /\/Users\/[^/]+(\/Library)/, '~$1' );
}
Copy link
Contributor Author

@derekblank derekblank May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the userpath pattern matcher matches /Users/{username}/../.., which is the expected userpath on macOS (presently, the only OS that has been released).

When support for other operating systems is added, this userpath matcher can be expanded to cover the new OS userpath patterns. Also noting that this util will return the original string if not matching the pattern.

Open to feedback if we want to include other OS userpath matchers now, however (Windows, Linux).

Copy link
Member

@dcalhoun dcalhoun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test plan succeeded for me. Thank you for address this. 🙇🏻

I note that logged server details include usernames in the projectPath and wpContentPath attributes. However, I was unable to find an instance of those values in Sentry. So, we may not need to scrub those. I'm not sure. WDYT?

console.log( 'Starting server with options', sanitizeForLogging( options ) );

export function sanitizeUserpath( path: string ): string {
const userPathRegex = /^\/Users\/([^/]+)\//;
if ( typeof path === 'string' && userPathRegex.test( path ) ) {
return path.replace( userPathRegex, '~/' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think replacing with a tilde is sensible.

I wondered if we should instead rely upon the existing practice of inserting REDACTED instead, e.g. /Users/REDACTED/. That might more clearly communicate an action was taken on the string.

Also, when we support other OSs, I supposed need to update it to use the equivalent for macOS' tilde usage — e.g., my understanding is that Windows might use C:/.

Not necessarily requesting a change. Merely sharing thoughts.

// Returns the original path if it does not match the pattern or if it is not a string.
export function sanitizeUserpath( path: string ): string {
const userPathRegex = /^\/Users\/([^/]+)\//;
if ( typeof path === 'string' && userPathRegex.test( path ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using both test and replace may be unnecessary. If replace doesn't find a pattern match, it would have no impact. Are we using test to guard against something in particular?

Suggested change
if ( typeof path === 'string' && userPathRegex.test( path ) ) {
if ( typeof path === 'string' ) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I suppose our usage of TypeScript should, in theory, negate a need to check with typeof, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I had this regex matcher as part of sanitizeForLogging function itself, but decided it was too convoluted to accept both key/value objects and strings. Using a dedicated function for sanitizeUserpath is more readable, and these extra typechecks are no longer needed. Updated below, and thanks for catching.

@derekblank
Copy link
Contributor Author

derekblank commented May 3, 2024

I wondered if we should instead rely upon the existing practice of inserting REDACTED instead, e.g. /Users/REDACTED/. That might more clearly communicate an action was taken on the string.

Also, when we support other OSs, I supposed need to update it to use the equivalent for macOS' tilde usage — e.g., my understanding is that Windows might use C:/.

@dcalhoun Thanks for the feedback - I think these are good suggestions.

Given that Windows support is under active development and release is intended to be pending soon-ish, I think it's a good idea to go ahead and accommodate Windows and Unix together in this PR.

The test cases probably make the intent clearer, but I've made the following changes:

  • Updated the regex pattern matcher to account for forward/backslash differences Windows and macOS userpaths, as well as the Windows drive letter.
  • Used [REDACTED] in place of ~ to match the spirit of our other sanitize utils.
  • Updated sanitizeUserpath to not sanitize the userpath in development mode -- It's redacted when sent to Sentry, but perhaps users may want to see their unredacted userpath when developing locally? (Open to feedback.)
  • Removed unneeded typechecks.

Let me know if you have further thoughts on this approach.

@derekblank
Copy link
Contributor Author

I note that logged server details include usernames in the projectPath and wpContentPath attributes. However, I was unable to find an instance of those values in Sentry. So, we may not need to scrub those. I'm not sure. WDYT?

I looked at these as well, and my understanding is that the filtering here is handled either by sanitizeForLogging, or by sanitizeUnstructuredData.

Initially, I attempted to leverage either of these functions to also handle the regex matching for the userpath, but ultimately decided that sanitizeUserpath was best served with its own function as it did not utilize a key/value pair, unlike the other two sanitization functions. (This is also where the extra typechecks in the initial function came from -- thanks for calling out that those were no longer needed.)

@derekblank derekblank requested a review from dcalhoun May 3, 2024 03:00
Copy link
Contributor

@katinthehatsite katinthehatsite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The testing steps succeeded for me. Here is what I did:

  • First, I tested on trunk to confirm the issue - the userpath was not redacted there:
Screenshot 2024-05-03 at 10 14 53 AM
  • Next, I switched to the branch and observed that the userpath is not redacted either since I am running the path in the development mode.

  • I removed the following condition:

if ( process.env.NODE_ENV !== 'development' ) {

And then ran nvm use && npm install && npm start and the userpath was redacted as expected:

Screenshot 2024-05-03 at 10 19 20 AM

The changes seem to work on my end on macOs. Please note that I have not tested this on Windows so I am only confirming the functionality for macOS.

@katinthehatsite
Copy link
Contributor

Updated sanitizeUserpath to not sanitize the userpath in development mode -- It's redacted when sent to Sentry, but perhaps users may want to see their unredacted userpath when developing locally?

I don't have a very strong preference for this one. Actually, I like seeing the path when the app is in the dev mode in case I need to locate something quickly e.g. appdata-v1.json (especially when testing many PRs and needing to change many things). 👍

@dcalhoun dcalhoun merged commit 380d324 into trunk May 3, 2024
@dcalhoun dcalhoun deleted the fix/username-paths-in-sentry branch May 3, 2024 12:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants