ruff does not honor declaration of character coding #6791

@PeterSlickers

According to PEP 263, a character encoding can be declared in a Python source file. This is done with a specially formatted comment placed on the first or second line of the file:

#!/usr/bin/python
# -*- coding: latin-1 -*-
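For illustration, here is a minimal sketch of how such a declaration can be recognised. It uses the regular expression given in PEP 263; the helper name `declared_encoding` is hypothetical, and the sketch simplifies the rule by checking the first two lines without verifying that the first one is a comment.

```python
import re
from typing import Optional

# Regular expression from PEP 263 for recognising a coding declaration.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source: bytes) -> Optional[str]:
    """Return the encoding named on the first or second line, or None.

    Hypothetical helper: a simplified reading of PEP 263, not Ruff's or
    CPython's actual implementation.
    """
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


sample = b"#!/usr/bin/python\n# -*- coding: latin-1 -*-\nprint('hi')\n"
print(declared_encoding(sample))  # prints: latin-1
```

A declaration on the third line or later is ignored, which matches the "first or second line" rule above.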

It seems that Ruff (0.0.285) does not honor the coding declaration. Ruff seems to assume that input files are always encoded as UTF-8. The following Python program demonstrates the problem. It first generates three short Python program files with different encodings and then runs ruff and python3 on each of them.

#!/usr/bin/env python3
# -*- coding: us-ascii -*-

import subprocess


prog = """# -*- coding: {} -*-
print(\"\u00D8resund og Sj\u00E6lland\")
"""

## create artefacts with differing encodings
filenames = []

filenames.append("prog-utf8.py")
print(f"writing file '{filenames[-1]}'")
with open(filenames[-1], "wb") as outstream:
    outstream.write(prog.format("utf8").encode("utf8"))

filenames.append("prog-usascii.py")
print(f"writing file '{filenames[-1]}'")
with open(filenames[-1], "wb") as outstream:
    # declared encoding differs from the true encoding
    outstream.write(prog.format("us-ascii").encode("utf8"))

filenames.append("prog-latin1.py")
print(f"writing file '{filenames[-1]}'")
with open(filenames[-1], "wb") as outstream:
    outstream.write(prog.format("latin-1").encode("latin1"))

## re-check encodings of the artefacts
print("\nTrue encodings")
for filename in filenames:
    subprocess.call(["file", "-i", filename])

## run python3 and ruff on the artefacts
for filename in filenames:
    cmd = ["python3", filename]
    print("---\n" + " ".join(cmd))
    subprocess.call(cmd)
    cmd = ["ruff", filename]
    print("\n" + " ".join(cmd))
    subprocess.call(cmd)

The first file, encoded as UTF-8, passes both ruff and python3 without errors. This is the expected behaviour.

The second file contains UTF-8 encoded characters but wrongly declares us-ascii as its encoding. python3 raises an error on this file, but ruff accepts it. I would expect ruff to complain about this file.

The third file contains latin1-encoded characters and correctly declares its encoding. It runs successfully with python3, but ruff reports an error when checking it. I would expect ruff not to complain about this file.
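The behaviour I would expect can be sketched with the standard library alone. This is only an illustration, not a claim about how Ruff works internally: `tokenize.detect_encoding` reads the declaration, and the file is then decoded with that encoding. The helpers `build` and `check_decodes` are hypothetical names, and the three in-memory sources mirror the three files generated above.

```python
import io
import tokenize

# Same program body as in the reproduction script above.
BODY = 'print("\u00D8resund og Sj\u00E6lland")\n'


def build(declared: str, actual: str) -> bytes:
    """Return source bytes that declare one encoding but use another."""
    return f"# -*- coding: {declared} -*-\n{BODY}".encode(actual)


def check_decodes(source: bytes) -> str:
    """Decode the source with its declared encoding and report the outcome."""
    # detect_encoding reads at most the first two lines to find the declaration.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    try:
        source.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{encoding}: cannot decode ({exc.reason})"
    return f"{encoding}: ok"


for declared, actual in [("utf8", "utf8"), ("us-ascii", "utf8"), ("latin-1", "latin1")]:
    print(check_decodes(build(declared, actual)))
```

Under this approach the first and third sources decode cleanly and only the second one, whose declaration does not match its bytes, is rejected.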

Labels: needs-decision (Awaiting a decision from a maintainer)
