This repository was archived by the owner on Apr 27, 2026. It is now read-only.

Fix LiteralUTF8Char lowering for non-ASCII UTF-8 chars #208

@tmdeveloper007

What is the problem?

LiteralUTF8Char crashes during lowering when given a valid non-ASCII UTF-8 character such as é.

Details on the problem

On the latest main, translating code that contains a variable initialized with astx.LiteralUTF8Char("é") fails with a UnicodeEncodeError.

This is a correctness bug in Unicode handling because valid UTF-8 character literals should lower successfully.
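For illustration only, here is one common way a UnicodeEncodeError like this can arise. It is an assumption, not a confirmed diagnosis of the irx lowering code: if a character literal is encoded with an ASCII-only codec, any code point above U+007F is rejected, while UTF-8 encodes é as two bytes. The helper `encode_char_literal` is hypothetical and not part of astx or irx.

```python
def encode_char_literal(ch: str, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper (not an irx API): encode one character
    and append a NUL terminator, as a C-style string constant would need."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ch.encode(encoding) + b"\x00"


E_ACUTE = "\u00e9"  # the precomposed character é (U+00E9)

# An ASCII-only encoding rejects é with UnicodeEncodeError.
try:
    encode_char_literal(E_ACUTE, encoding="ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 encodes é as the two bytes 0xC3 0xA9, followed here by the NUL.
utf8_bytes = encode_char_literal(E_ACUTE)
```

Under this assumption, a fix would size the lowered constant by the encoded byte length rather than by the character count, since é occupies two bytes in UTF-8.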

Confirmation and reproduction

Example:

import astx
from irx.builders.llvmliteir import LLVMLiteIR

builder = LLVMLiteIR()
module = builder.module()

# Declare a variable initialized with a non-ASCII UTF-8 character literal.
block = astx.Block()
block.append(
    astx.VariableDeclaration(
        name="tmp",
        type_=astx.String(),
        value=astx.LiteralUTF8Char("é"),
    )
)
block.append(astx.FunctionReturn(astx.LiteralInt32(0)))

# Wrap the block in a main function that returns an Int32.
proto = astx.FunctionPrototype(
    name="main",
    args=astx.Arguments(),
    return_type=astx.Int32(),
)
fn = astx.FunctionDef(prototype=proto, body=block)
module.block.append(fn)

# Translating the module fails with UnicodeEncodeError.
builder.translate(module)
