bug: 'in' / 'not_in' operator on strings is completely unwired — leaks to C compilation #1590

@SchoolyB

Description

The in, not_in, and !in operators have no support for strings. The typechecker silently accepts them, but the codegen treats the string as an array and emits ez_arrays_contains_int() calls — passing an EzString* where an EzArray* is expected. This causes C compiler warnings or errors depending on the left operand type.

This is fully unwired: no typechecker validation, no codegen path.

Reproduction

char in string (should work — most intuitive use case):

do main() {
    mut s string = "hello"
    if 'h' in s {
        println("found h")
    }
}

Expected: Prints found h

Actual: C warning + wrong behavior:

warning: incompatible pointer types passing 'EzString *' to parameter of type 'EzArray *'
    if (ez_arrays_contains_int(&s, 'h')) {

string in string (substring check):

do main() {
    mut s string = "hello world"
    if "world" in s {
        println("found world")
    }
}

Expected: Either works as a substring check, or is rejected with a compile error if substring checks are unsupported

Actual: C compilation fails:

error: passing 'EzString' to parameter of incompatible type 'int64_t'
    if (ez_arrays_contains_int(&s, ez_string_lit("world"))) {

The remaining variants leak into the generated C in the same way, as a warning or an error depending on the left operand:

// These all leak to C errors/warnings:
if 'z' not_in s { }     // char not_in string
if 'z' !in s { }        // char !in string
if 5 in s { }           // int in string (should be type error)
if true in s { }        // bool in string (should be type error)

// Variables too:
mut c char = 'e'
if c in s { }           // char var in string — same C warning

Root Cause

The codegen's in handler only has paths for arrays and maps. When the right operand is a string, it falls through to the array path and emits ez_arrays_contains_int(&string_var, value) — treating the EzString as an EzArray.
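The fall-through can be pictured with a small sketch. This is not the compiler's actual code: the `RhsKind` enum, the `pick_contains_helper` function, and the `ez_maps_contains` name are all hypothetical. Only `ez_arrays_contains_int` comes from the output shown above.

```c
#include <string.h>

/* Hypothetical classification of the right operand of `in`. */
typedef enum { RHS_ARRAY, RHS_MAP, RHS_STRING } RhsKind;

/* Mimics a dispatcher that only knows about maps and arrays.
   There is no string branch, so a string right operand reaches
   the array default. */
static const char *pick_contains_helper(RhsKind rhs) {
    if (rhs == RHS_MAP) {
        return "ez_maps_contains";      /* hypothetical helper name */
    }
    return "ez_arrays_contains_int";    /* arrays, and strings by accident */
}
```

With a dispatcher shaped like this, `pick_contains_helper(RHS_STRING)` yields the array helper, which matches the emitted `ez_arrays_contains_int(&s, ...)` calls in the reproduction.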

The typechecker also has no validation for in on strings — it doesn't reject invalid combinations (like int in string) and doesn't wire up valid ones (like char in string).

Fix

Two things need to happen:

1. Codegen — add string path for in / not_in

When the right operand is TK_STRING, emit a string-specific contains check:

  • char in string: Iterate bytes or use a runtime helper to check if the character exists in the string
  • string in string: Use a substring search (e.g., strstr() or a runtime helper like ez_string_contains())
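A minimal sketch of what the two runtime helpers could look like, under stated assumptions. The real `EzString` layout is not shown in this issue, so the struct below (a data pointer plus a byte length) is a stand-in. `ez_string_contains_char` is a hypothetical name, and `ez_string_contains` is only the name suggested above, not a confirmed runtime function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the runtime's string type. Assumed layout: the real
   EzString may differ. */
typedef struct {
    const char *data;
    size_t len;
} EzString;

/* Hypothetical helper for `char in string`: byte-wise membership test. */
static bool ez_string_contains_char(const EzString *s, char c) {
    return s->len > 0 && memchr(s->data, (unsigned char)c, s->len) != NULL;
}

/* Hypothetical helper for `string in string`: naive substring search.
   Uses explicit lengths, so it does not depend on NUL termination. */
static bool ez_string_contains(const EzString *haystack, const EzString *needle) {
    if (needle->len == 0) return true;             /* empty needle always matches */
    if (needle->len > haystack->len) return false;
    for (size_t i = 0; i + needle->len <= haystack->len; i++) {
        if (memcmp(haystack->data + i, needle->data, needle->len) == 0) {
            return true;
        }
    }
    return false;
}
```

The codegen would then emit something like `ez_string_contains_char(&s, 'h')` for `'h' in s` and negate the call for `not_in` / `!in`. If EZ's char can hold a multi-byte code point, the byte-wise comparison here would need to be replaced with a decoding loop.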

2. Typechecker — validate in operand types for strings

When the right operand is a string, the left operand must be either:

  • char — character membership check
  • string — substring check

Any other type (int, bool, float, array, etc.) should be rejected with a type error.
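The rule above reduces to a single predicate. The sketch below is illustrative only: `TK_STRING` is the only kind name taken from this issue, while the other enum members and the `in_on_string_is_valid` function are assumptions, not the typechecker's real API.

```c
#include <stdbool.h>

/* Illustrative type kinds. Only TK_STRING appears in this issue;
   the remaining names are assumed. */
typedef enum {
    TK_INT, TK_FLOAT, TK_BOOL, TK_CHAR, TK_STRING, TK_ARRAY, TK_MAP
} TypeKind;

/* Returns true when the left operand of `in` is acceptable for a
   string right operand: char (membership) or string (substring). */
static bool in_on_string_is_valid(TypeKind left) {
    return left == TK_CHAR || left == TK_STRING;
}
```

A typechecker using a check like this would report a type error for `5 in s` and `true in s` instead of letting them reach C compilation.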

Notes

  • for_each c in string already works correctly and yields char values — so char in string should be the natural companion
  • The strings stdlib module has strings.contains() and strings.index_of() which do substring checks at the library level — in should provide the same functionality as a language-level operator
  • This is related to, but separate from, #1589 ("bug: 'in' operator does not validate type compatibility between value and collection elements"), which covers type mismatch validation for in on arrays/maps

Labels

bug, codegen, crash, typechecker