Closed
Labels
type-security: A security issue
Description
I just modified the C API function PyUnicode_AsUTF8() to raise an exception if a string contains an embedded null character, to reduce the risk of security vulnerabilities. A caller of PyUnicode_AsUTF8() expects a string terminated by a null byte. If the UTF-8 encoded string contains an embedded null byte, the caller is likely to truncate the string without knowing that more bytes follow the first null byte.
This is not only a security issue; it can also be seen as a plain bug: unwanted behavior.
Previous issues:
- [C API] Change PyUnicode_AsUTF8() to return NULL on embedded null characters #111089
- _winapi.LCMapStringEx fails when encountering a string containing null characters #106844
- os.path.normpath truncates input on null bytes in 3.11, but not 3.10 #106242 -- CVE-2023-41105
- Uncaught exception in http.server request handling (<=3.10) #103223
- embedded null byte when connecting to sqlite database using a bytes object #84335
- os.path.exists should not throw "Embedded NUL character" exception #73228
- "embedded NUL character" exceptions #66411
- sqlite3 doesn't complain if the request contains a null character #65346
- Reject embedded null characters in wchar* strings #57826
Discussions:
Example with Python 3.12:
import ctypes
libc = ctypes.cdll.LoadLibrary('libc.so.6')
printf = libc.printf
PyUnicode_AsUTF8 = ctypes.pythonapi.PyUnicode_AsUTF8
PyUnicode_AsUTF8.argtypes = (ctypes.py_object,)
PyUnicode_AsUTF8.restype = ctypes.c_char_p
my_string = "World\0truncated string"
printf(b"Hello %s\n", PyUnicode_AsUTF8(my_string))
Output:
Hello World
The truncated string part is silently ignored!
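The truncation above can also be reproduced without libc, and the kind of check being proposed can be sketched in pure Python. This is only an illustration: `c_string_view()` and `utf8_without_nulls()` are hypothetical helper names, not part of the C API, and the real check lives in C.

```python
import ctypes


def c_string_view(text: str) -> bytes:
    """Return the bytes a C caller sees when it stops at the first null byte."""
    buffer = ctypes.create_string_buffer(text.encode("utf-8"))
    # buffer.raw holds every byte; buffer.value stops at the first null,
    # which is what strlen()-based C code effectively does.
    return buffer.value


def utf8_without_nulls(text: str) -> bytes:
    """Hypothetical helper: encode to UTF-8, refusing embedded null characters."""
    encoded = text.encode("utf-8")
    if b"\0" in encoded:
        raise ValueError("embedded null character")
    return encoded


print(c_string_view("World\0truncated string"))  # b'World'
```

With such a check in place, the caller gets an exception instead of silently losing the part after the null byte.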
Multiple functions were modified in the past to prevent this problem. Examples:
- _dbm.open(): check filename
- _gdbm.open(): check filename
- PyBytes_AsStringAndSize(str, NULL)
- grp.getgrnam(): check name
- pwd.getpwnam(): check name
- _locale.strxfrm(): check argument
- path_converter() of the os module: basically any filename and path
- PyUnicode_AsWideCharString()
- os.putenv()
- _posixsubprocess.fork_exec(): executable_list
- _struct.Struct: check format
- _tkinter SetVar() and varname_converter()
- _winapi.CreateProcess() getenvironment()
- PyUnicode_EncodeLocale()
- PyUnicode_EncodeFSDefault()
- unicode_decode_locale()
- PyUnicode_FSConverter()
- PyUnicode_DecodeLocale()
- PyUnicode_DecodeLocaleAndSize()
- PyUnicode_FSDecoder()
- PyUnicode_AsUTF8() -- recently modified
- _Py_stat(): check path
- getargs.c: 's', 'y' and 'z' formats
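The effect of these checks is visible from Python code: path-taking functions reject embedded null characters with a ValueError before any system call is made. A minimal sketch, assuming a recent CPython; `rejects_embedded_null()` is a hypothetical helper written for this illustration.

```python
import os


def rejects_embedded_null(func, *args) -> bool:
    """Return True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        # Raised by the argument converters before the OS is involved.
        return True
    except OSError:
        # The call reached the OS, so the argument was not rejected up front.
        return False
    return False


# The path converter of the os module rejects both str and bytes paths.
print(rejects_embedded_null(os.stat, "file\0name"))
print(rejects_embedded_null(os.stat, b"file\0name"))
# A path without a null character reaches the OS instead.
print(rejects_embedded_null(os.stat, "surely-missing-file-name"))
```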
There are exceptions which accept embedded null bytes/characters:
- socket: AF_UNIX socket name
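The AF_UNIX exception exists because Linux supports abstract socket names, which start with a null byte and are length-delimited rather than null-terminated. A rough sketch, assuming a Linux host; `bind_abstract()` is a hypothetical helper and returns None on other platforms.

```python
import socket
import sys


def bind_abstract(address: bytes):
    """Bind an AF_UNIX socket to an abstract name and return the bound name.

    Abstract names are a Linux feature; return None elsewhere.
    """
    if not sys.platform.startswith("linux"):
        return None
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        # The leading null byte selects the abstract namespace; later null
        # bytes are part of the name and are not treated as terminators.
        sock.bind(address)
        return sock.getsockname()
```

For example, `bind_abstract(b"\0demo\0suffix")` is expected to return the full name, including the embedded null byte, on Linux.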