Skip to content

feat: Treat NonAsciiOrNonPrintableCharNotice as a warning instead of error#728

Merged
barbeau merged 1 commit intoMobilityData:masterfrom
aababilov:errors-in-row-parser
Feb 9, 2021
Merged

feat: Treat NonAsciiOrNonPrintableCharNotice as a warning instead of error#728
barbeau merged 1 commit intoMobilityData:masterfrom
aababilov:errors-in-row-parser

Conversation

@aababilov
Copy link
Copy Markdown
Collaborator

Issue #549

An error in RowParser stops validation for the whole feed. However, if
ID has non-printable or non-ASCII characters, the feed may be still
successfully parsed and served to the end user.

We are lowering priority of NonAsciiOrNonPrintableCharNotice to WARNING
and modify RowParser so that it checks severity level of its notices.

We also simplify the code of asId function and reduce amount of calls to
row.asString from 3 to 1.

@aababilov aababilov requested a review from barbeau February 8, 2021 10:04
@lionel-nj lionel-nj mentioned this pull request Feb 8, 2021
3 tasks
@lionel-nj lionel-nj linked an issue Feb 8, 2021 that may be closed by this pull request
@aababilov
Copy link
Copy Markdown
Collaborator Author

Friendly ping here! That's a blocker because right now our upstream validator rejects valid feeds that have non-ASCII characters in IDs.

Copy link
Copy Markdown
Member

@barbeau barbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aababilov! Converting this to a warning LGTM, as the spec says it's recommended (but not required) to use printable ASCII:

Using only printable ASCII characters is recommended.

A few comments in-line

@@ -0,0 +1,58 @@
/*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this has a dependency on #729 as both commits and included here (although I believe the new test for NonAsciiOrNonPrintableCharNotice is missing). Can that dependency be removed so we can merge this separately?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. My bad, GitHub is a bit challenging when you have dependent pull requests :)

"csvRowNumber", csvRowNumber,
"columnName", columnName));
"columnName", columnName),
SeverityLevel.INFO);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be SeverityLevel.WARNING?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, done.

row.getFileName(), row.getRowNumber(), row.getColumnName(columnIndex)));
}
if (s != null) {
final String trimmed = s.strip();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - given that there is also a Java method trim(), this variable is probably better named stripped.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

By the way, can you tell the difference between "strip" and "trim" as a native English speaker? To me, these words are interchangeable and one must check the documentation to understand the difference :)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that change belongs to another pull request. Fixed.

#729

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, can you tell the difference between "strip" and "trim" as a native English speaker?

Not in the context of Java :). In general use "strip" has somewhat of a more aggressive and coarse connotation than "trim". "trim" implies a more precise and neat action.

@barbeau barbeau changed the title feat: Do not treat NonAsciiOrNonPrintableCharNotice as an error feat: Treat NonAsciiOrNonPrintableCharNotice as a warning instead of error Feb 9, 2021
Issue MobilityData#549

An error in RowParser stops validation for the whole feed. However, if
ID has non-printable or non-ASCII characters, the feed may be still
successfully parsed and served to the end user.

We are lowering priority of NonAsciiOrNonPrintableCharNotice to WARNING
and modify RowParser so that it checks severity level of its notices.

We also simplify the code of asId function and reduce amount of calls to
row.asString from 3 to 1.
@aababilov aababilov force-pushed the errors-in-row-parser branch from 163fe26 to e9a5462 Compare February 9, 2021 21:47
Copy link
Copy Markdown
Member

@barbeau barbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aababilov!

@barbeau barbeau merged commit d48c507 into MobilityData:master Feb 9, 2021
@aababilov
Copy link
Copy Markdown
Collaborator Author

Thank you for review!

@lionel-nj lionel-nj mentioned this pull request Feb 10, 2021
2 tasks
@aababilov aababilov deleted the errors-in-row-parser branch April 20, 2021 02:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Non ascii or non printable char in id

2 participants