struct.unpack inference #2437

@sakgoyal

Question

Can the type checker introspect the format string passed to struct.unpack to infer the return type? Currently it returns Any, but I'd like it to return a specific type where possible (when the format is a literal string).

I am trying to add types to an untyped library, but it does a lot of struct unpacking, which leaves manual casts or Anys everywhere.
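To illustrate the kind of boilerplate meant here, a minimal sketch of one call site follows. The 6-byte header layout and the names `header`, `version`, and `length` are made up for the example and do not come from the library in question.

```python
import struct
from typing import cast

# Hypothetical 6-byte header: big-endian unsigned short, then unsigned int.
header = struct.pack(">HI", 7, 1024)

# Without format-string inference the unpacked elements are not precisely
# typed, so each call site needs an annotation or a cast like this one.
version, length = cast(tuple[int, int], struct.unpack(">HI", header))

print(version, length)  # 7 1024
```

With inference from the literal format string, the cast would be unnecessary because `">HI"` already determines `tuple[int, int]`.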

I already wrote some Python code that seems to compute the type at runtime. It's just a PoC to show that this is possible.

import re
from typing import cast

type TypeMap = type[int | bool | str | bytes | float]

_TYPE_MAP: dict[str, TypeMap] = {c: int for c in 'bBhHiIlLqQnNP'}
_TYPE_MAP.update(dict.fromkeys('efd', float))
_TYPE_MAP.update(dict.fromkeys('c', bytes))
_TYPE_MAP['?'] = bool

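def get_unpack_type(fmt: str):
    """Map a struct format string to the type its unpacked result would have."""
    # Drop the optional byte-order/size/alignment prefix.
    core = fmt.lstrip('@=<>!')

    types: list[TypeMap] = []
    for count, tag in cast(list[tuple[str, str]], re.findall(r'(\d*)([xcbB?hHiIlLqQnNefdspP])', core)):
        if tag == 'x':
            # Pad bytes produce no value.
            continue
        if tag in 'sp':
            # For 's' and 'p' the count is a byte length, so the result is one bytes object.
            types.append(bytes)
        else:
            # For every other code the count is a repeat count.
            t = _TYPE_MAP.get(tag, int)
            n = int(count) if count else 1
            types.extend([t] * n)
    # PoC shortcut: a lone value with no padding maps to the bare type.
    # Note that struct.unpack itself always returns a tuple, even for one value.
    if len(types) == 1 and 'x' not in core:
        return types[0]
    return tuple[tuple(types)]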


if __name__ == '__main__':
    test_cases = [
        ('2xH', tuple[int]),
        ('@i', int),
        ('=i', int),
        ('c3x', tuple[bytes]),
        ('2c', tuple[bytes, bytes]),
        ('5c', tuple[bytes, bytes, bytes, bytes, bytes]),
        ('0s', bytes),
        ('1s', bytes),
        ('255s', bytes),
        ('e', float),
        ('2e', tuple[float, float]),
        ('e4x', tuple[float]),
        ('3eH', tuple[float, float, float, int]),
        ('?x?', tuple[bool, bool]),
        ('2?', tuple[bool, bool]),
        ('?2xI', tuple[bool, int]),
        ('fd4x', tuple[float, float]),
        ('d2xH', tuple[float, int]),
        ('2i4x2h', tuple[int, int, int, int]),
        ('iP', tuple[int, int]),
        ('>n2xN', tuple[int, int]),
        ('!P4x', tuple[int]),
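    ]
    # Check every expected type against the PoC.
    for fmt, expected in test_cases:
        actual = get_unpack_type(fmt)
        assert actual == expected, f'{fmt!r}: expected {expected}, got {actual}'
    print(f'{len(test_cases)} cases passed')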
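For comparison with the expected types listed in the PoC, here is a small sketch of what the `struct` module itself does at runtime. It uses only `struct.pack`, `struct.unpack`, and `struct.calcsize`; the helper `is_valid_format` is a hypothetical name introduced for this example.

```python
import struct

# struct.unpack returns a tuple even when the format holds a single value.
single = struct.unpack("<i", struct.pack("<i", 42))
print(single)  # (42,)

# Pad bytes ('x') consume input but contribute no element.
padded = struct.unpack("<2xH", struct.pack("<2xH", 5))
print(padded)  # (5,)

# A count on 's' is a byte length (one bytes object); on 'c' it is a repeat count.
print(struct.unpack("<3s", b"abc"))  # (b'abc',)
print(struct.unpack("<3c", b"abc"))  # (b'a', b'b', b'c')


def is_valid_format(fmt: str) -> bool:
    """Hypothetical helper: True if the struct module accepts fmt."""
    try:
        struct.calcsize(fmt)
    except struct.error:
        return False
    return True


# 'n', 'N' and 'P' exist only in native mode, so an explicit byte order rejects them.
print(is_valid_format("@n"))    # True
print(is_valid_format(">n"))    # False
print(is_valid_format("!P4x"))  # False
```

Two things follow for an inference rule. The inferred type for a one-value format such as `'@i'` would presumably be `tuple[int]` rather than `int`. Formats such as `'>n2xN'` and `'!P4x'` from the test cases are rejected at runtime, so a checker could reasonably decline to infer a type for them.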

Version

No response

Labels: wish (Not on the current roadmap; maybe in the future)