Skip to content

Set Flake8 encoding to UTF-8#11108

Merged
feerrenrut merged 2 commits intonvaccess:masterfrom
accessolutions:i11080-flake8
May 4, 2020
Merged

Set Flake8 encoding to UTF-8#11108
feerrenrut merged 2 commits intonvaccess:masterfrom
accessolutions:i11080-flake8

Conversation

@JulienCochuyt
Copy link
Contributor

Link to issue number:

Follow-up of PR #11081

Summary of the issue:

Flake8 encodes its output with the default encoding.
Without further instruction, Python chooses the default encoding of the system to do so.
When a line of source code contains both a linting error and a non-ASCII character, it currently leads to either the line being wrongly encoded, or worse the line being replaced by the stack trace of an UnicodeEncodeError.

For an example of this behavior, edit source/globalCommands.py and set the copyright year to 2020, then run scons lint.
Flake8 should complain about both the comment not starting with "# " and the line being too long.
If, like in my CP1252 Windows default encoding, the character "Ł" cannot be represented, you'll end up with the following output:

Traceback (most recent call last):
  File "C:\dev\Python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\dev\Python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\__main__.py", line 4, in <module>
    cli.main()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\cli.py", line 18, in main
    app.run(argv)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 393, in run
    self._run(argv)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 382, in _run
    self.report()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 373, in report
    self.report_errors()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\main\application.py", line 334, in report_errors
    results = self.file_checker_manager.report()
  File "C:\dev\venv\nvda\lib\site-packages\flake8\checker.py", line 265, in report
    results_reported += self._handle_results(filename, results)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\checker.py", line 167, in _handle_results
    physical_line=physical_line,
  File "C:\dev\venv\nvda\lib\site-packages\flake8\style_guide.py", line 418, in handle_error
    code, filename, line_number, column_number, text, physical_line
  File "C:\dev\venv\nvda\lib\site-packages\flake8\style_guide.py", line 565, in handle_error
    self.formatter.handle(error)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\formatting\base.py", line 93, in handle
    self.write(line, source)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\formatting\base.py", line 203, in write
    self._write(source)
  File "C:\dev\venv\nvda\lib\site-packages\flake8\formatting\base.py", line 182, in _write
    self.output_fd.write(output + self.newline)
  File "C:\dev\Python37-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0141' in position 176: character maps to <undefined>

By the way, all apologies to @lukaszgo1: it seems I might have misspelled your first name in the past as I did not notice earlier its first letter wasn't "L", but rather "Ł".

Description of how this pull request fixes the issue:

Take advantage of the new UTF-8 Mode introduced by PEP 540 in Python 3.7: Add the -Xutf8 command-line argument.

Testing performed:

Linted the above described edit with and without the proposed change.

Known issues with pull request:

Change log entry:

I don't think this deserves a change log entry.

@JulienCochuyt
Copy link
Contributor Author

@feerrenrut, sorry I did not yet know how to solve this one when I filed PR #11081.

Copy link
Contributor

@feerrenrut feerrenrut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be less likely to bite us again if we add the # -*- coding: utf-8 -*- to the top of the file, we could then add an encoding check (like the flake8 linter) to ensure that all files have the expected encoding (and no bom marks etc)

@JulienCochuyt
Copy link
Contributor Author

Would it be less likely to bite us again if we add the # -*- coding: utf-8 -*- to the top of the file, we could then add an encoding check (like the flake8 linter) to ensure that all files have the expected encoding (and no bom marks etc)

I really don't think it would make a difference.
Many encodings can safely (even if wrongly) decoded as UTF-8, and BOM marks are strictly-speaking valid in this encoding.
Nevertheless, testing all files to ensure no BOM marks are present and no character cannot be decoded sounds like a good idea.

Copy link
Contributor

@feerrenrut feerrenrut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feerrenrut feerrenrut merged commit cdd37c7 into nvaccess:master May 4, 2020
@nvaccessAuto nvaccessAuto added this to the 2020.1 milestone May 4, 2020
@lukaszgo1
Copy link
Contributor

@feerrenrut What is the reason behind ApVeyor script not invoking tests using Sconscript? While you were aware of it, presumably because you've written the code, it is not obvious, and having only one code path seems less likely to introduce differences between what developers are using and what is used on AppVeyor.

@feerrenrut
Copy link
Contributor

What is the reason behind ApVeyor script not invoking tests using Sconscript?

@lukaszgo1 I agree, it's not obvious or ideal when changing options! It's a trade-off with trying to reduce the build time. Scons takes a long time to initialize our buildscripts, and we do many more builds than edits to these files. Happy to hear about proposals to improve this!

@lukaszgo1
Copy link
Contributor

@feerrenrut wrote:

What is the reason behind ApVeyor script not invoking tests using Sconscript?

@lukaszgo1 I agree, it's not obvious or ideal when changing options! It's a trade-off with trying to reduce the build time. Scons takes a long time to initialize our buildscripts, and we do many more builds than edits to these files. Happy to hear about proposals to improve this!

My first thought would be not to use Scons for these tests at all. It seems overkill. It should be possible to add additional parameters to scons.py to run specific tests.

@feerrenrut
Copy link
Contributor

The support for running the tests via scons is to make it easier for developers to run the tests on their own machines. Hopefully this makes it more likely they do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants