Set Flake8 encoding to UTF-8 by JulienCochuyt · Pull Request #11108 · nvaccess/nvda

JulienCochuyt · 2020-05-04T16:19:43Z

Link to issue number:

Follow-up of PR #11081

Summary of the issue:

Flake8 encodes its output with the default encoding.
Without further instruction, Python uses the system's default encoding for this.
When a line of source code contains both a linting error and a non-ASCII character, the result is currently either a wrongly encoded line or, worse, the line being replaced by the stack trace of a UnicodeEncodeError.

For an example of this behavior, edit source/globalCommands.py and set the copyright year to 2020, then run scons lint.
Flake8 should complain about both the comment not starting with "# " and the line being too long.
If, as with my Windows default encoding (CP1252), the character "Ł" cannot be represented, you'll end up with the following output:

Traceback (most recent call last):
  File "C:\dev\Python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\dev\Python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\__main__.py", line 4, in <module>
    cli.main()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\cli.py", line 18, in main
    app.run(argv)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 393, in run
    self._run(argv)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 382, in _run
    self.report()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 373, in report
    self.report_errors()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 334, in report_errors
    results = self.file_checker_manager.report()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\checker.py", line 265, in report
    results_reported += self._handle_results(filename, results)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\checker.py", line 167, in _handle_results
    physical_line=physical_line,
  File "C:\dev\venv\nvda\lib\site-packages\flake8\style_guide.py", line 418, in handle_error
    code, filename, line_number, column_number, text, physical_line
  File "C:\dev\venv\nvda\lib\site-packages\flake8\style_guide.py", line 565, in handle_error
    self.formatter.handle(error)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\formatting\base.py", line 93, in handle
    self.write(line, source)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\formatting\base.py", line 203, in write
    self._write(source)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\formatting\base.py", line 182, in _write
    self.output_fd.write(output + self.newline)
  File "C:\dev\Python37-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0141' in position 176: character maps to <undefined>
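
The failure at the bottom of this traceback can be reproduced in isolation, without Flake8. The following is only a minimal sketch: the helper name try_encode is made up for illustration, and it simply contrasts encoding the reported character with the cp1252 codec and with UTF-8.

```python
# The character reported in the traceback: LATIN CAPITAL LETTER L WITH STROKE.
char = "\u0141"  # "Ł"


def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError that was raised.

    Hypothetical helper, used only to show both outcomes side by side.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


cp1252_result = try_encode(char, "cp1252")  # cp1252 has no mapping for U+0141
utf8_result = try_encode(char, "utf-8")     # UTF-8 can encode any code point

print(type(cp1252_result).__name__)  # UnicodeEncodeError
print(utf8_result)                   # b'\xc5\x81'
```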

By the way, all apologies to @lukaszgo1: it seems I might have misspelled your first name in the past as I did not notice earlier its first letter wasn't "L", but rather "Ł".

Description of how this pull request fixes the issue:

Take advantage of the new UTF-8 Mode introduced by PEP 540 in Python 3.7: Add the -Xutf8 command-line argument.
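
As a rough illustration of what the option changes, the sketch below starts a child interpreter with -X utf8 and inspects its output. It is not the code from this pull request: the helper run_child is hypothetical, and the two environment variables are removed only so that they cannot influence the child's stream encoding in this demonstration.

```python
import os
import subprocess
import sys


def run_child(code, interpreter_args=()):
    """Run `code` in a fresh interpreter and return its raw stdout bytes.

    Hypothetical helper for this sketch only.
    """
    # Drop variables that could otherwise change the child's stream encoding.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONIOENCODING", "PYTHONUTF8")
    }
    cmd = [sys.executable, *interpreter_args, "-c", code]
    return subprocess.run(cmd, capture_output=True, check=True, env=env).stdout


# sys.flags.utf8_mode is 1 when UTF-8 Mode is enabled (Python 3.7+).
utf8_flag = run_child("import sys; print(sys.flags.utf8_mode)", ["-X", "utf8"])

# The child prints "Ł"; in UTF-8 Mode its stdout is encoded as UTF-8,
# even when the output is piped rather than written to a console.
printed = run_child("print('\\u0141')", ["-X", "utf8"])

print(utf8_flag.strip())              # b'1'
print(printed.strip().decode("utf-8"))
```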

Testing performed:

Linted the above described edit with and without the proposed change.

Known issues with pull request:

Change log entry:

I don't think this deserves a change log entry.

JulienCochuyt · 2020-05-04T16:20:53Z

@feerrenrut, sorry I did not yet know how to solve this one when I filed PR #11081.

feerrenrut

Would it be less likely to bite us again if we added # -*- coding: utf-8 -*- to the top of the file? We could then add an encoding check (like the flake8 linter) to ensure that all files have the expected encoding (and no BOM marks, etc.).

tests/lint/sconscript

Re: nvaccess#11108 (comment)

JulienCochuyt · 2020-05-04T16:45:13Z

Would it be less likely to bite us again if we added # -*- coding: utf-8 -*- to the top of the file? We could then add an encoding check (like the flake8 linter) to ensure that all files have the expected encoding (and no BOM marks, etc.).

I really don't think it would make a difference.
Many encodings can be safely (even if wrongly) decoded as UTF-8, and BOM marks are, strictly speaking, valid in this encoding.
Nevertheless, testing all files to ensure that no BOM marks are present and that every character can be decoded sounds like a good idea.
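
Such a check could look roughly like the sketch below. It is an assumption about how the idea might be implemented, not something that exists in the repository: the function name encoding_problems is hypothetical, and it works on the raw bytes of one file so that the caller decides which files to read.

```python
import codecs


def encoding_problems(data):
    """Return a list of problems found in the raw bytes of one source file.

    Hypothetical helper: flags a leading UTF-8 BOM and invalid UTF-8 sequences.
    """
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("starts with a UTF-8 BOM")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded.
        problems.append(f"not valid UTF-8 at byte {exc.start}")
    return problems


clean = encoding_problems("\u0141".encode("utf-8"))
with_bom = encoding_problems(codecs.BOM_UTF8 + b"# comment\n")
legacy = encoding_problems(b"caf\xe9")  # "café" as encoded in CP1252
```

A caller would read each file in binary mode, for example with open(path, "rb"), and report every file for which the returned list is not empty.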

feerrenrut

Thanks @JulienCochuyt

lukaszgo1 · 2020-05-04T18:46:03Z

@feerrenrut What is the reason behind the AppVeyor script not invoking the tests through the SConscript? While you were aware of it, presumably because you wrote the code, it is not obvious, and having only one code path seems less likely to introduce differences between what developers use and what runs on AppVeyor.

feerrenrut · 2020-05-05T15:50:15Z

What is the reason behind the AppVeyor script not invoking the tests through the SConscript?

@lukaszgo1 I agree, it's not obvious or ideal when changing options! It's a trade-off with trying to reduce the build time. SCons takes a long time to initialize our build scripts, and we do many more builds than edits to these files. Happy to hear about proposals to improve this!

lukaszgo1 · 2020-06-21T13:36:19Z

@feerrenrut wrote:

What is the reason behind the AppVeyor script not invoking the tests through the SConscript?

@lukaszgo1 I agree, it's not obvious or ideal when changing options! It's a trade-off with trying to reduce the build time. SCons takes a long time to initialize our build scripts, and we do many more builds than edits to these files. Happy to hear about proposals to improve this!

My first thought would be not to use SCons for these tests at all. It seems like overkill. It should be possible to add additional parameters to scons.py to run specific tests.

feerrenrut · 2020-06-22T18:07:23Z

The support for running the tests via scons is to make it easier for developers to run the tests on their own machines. Hopefully this makes it more likely they do so.

Set Flake8 encoding to UTF-8

208eced

feerrenrut reviewed May 4, 2020

Set Flake8 encoding to UTF-8 when invoked by AppVeyor

3153c8d

Re: nvaccess#11108 (comment)

feerrenrut approved these changes May 4, 2020

feerrenrut merged commit cdd37c7 into nvaccess:master May 4, 2020

nvaccessAuto added this to the 2020.1 milestone May 4, 2020

JulienCochuyt mentioned this pull request May 4, 2020

Fix alpha updates and lint error reporting #10020

Merged

JulienCochuyt deleted the i11080-flake8 branch May 4, 2020 19:21

lukaszgo1 mentioned this pull request Nov 15, 2020

Allow linting without Visual studio installed #11774

Closed

feerrenrut merged 2 commits into nvaccess:master from accessolutions:i11080-flake8
