Conversation

@koxudaxi
Contributor

@koxudaxi koxudaxi commented Jun 24, 2025

Implements PEP 750 template strings for MicroPython.

Started in discussion #17497. Template strings return Template objects instead of strings, allowing access to literal parts and expressions separately.

Changes:

  • Lexer/parser support for t-string syntax
  • Template and Interpolation types
  • Public type definitions in py/objtstring.h
  • Minimal builtin string.templatelib module
  • Conditional build with MICROPY_PY_TSTRINGS

Usage:

from string.templatelib import Template

name = "World"
t = t"Hello {name}!"
t.strings         # ('Hello ', '!')
t.values          # ('World',)
t.interpolations[0].expression  # 'name'

t"{x!r:>10}"      # Conversions: !r, !s, !a
rt"C:\{file}"     # Raw t-strings
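
A minimal sketch of how a consumer might combine the two tuples, assuming only the PEP 750 invariant that `strings` always has one more element than `values`. The `render_html` helper is hypothetical and not part of this PR. It takes plain tuples, so it runs on CPython without t-string syntax:

```python
from html import escape


def render_html(strings, values):
    """Interleave literal parts with HTML-escaped interpolated values."""
    # PEP 750: a template always has one more literal part than values.
    assert len(strings) == len(values) + 1
    parts = [strings[0]]
    for value, literal in zip(values, strings[1:]):
        parts.append(escape(str(value)))  # escape only the dynamic parts
        parts.append(literal)
    return "".join(parts)


# Stand-ins for t.strings and t.values of t"Hello {name}!" with name = "<World>".
print(render_html(("Hello ", "!"), ("<World>",)))  # Hello &lt;World&gt;!
```

Because the literal parts and the values arrive separately, the renderer can treat them differently, which a plain f-string cannot offer.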

Implementation:

  • Lexer (py/lexer.c) and parser (py/parse.c) handle tokenization
  • Template/Interpolation types in py/modtstring.c, py/objinterpolation.c
  • Public header py/objtstring.h eliminates code duplication
  • Minimal string module in py/modstring.c (no manifest.py changes)
  • Dynamic array doubling (4→8→16→...→2048), GC-safe
  • CPython 3.14 compatible error messages
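
The doubling schedule listed above can be sketched in Python as follows. This is only a model of the growth policy, not the C code from the PR. The names are hypothetical, and raising `MemoryError` at the cap is an assumption, since the description does not say what happens past 2048:

```python
INITIAL_CAPACITY = 4
MAX_CAPACITY = 2048


def next_capacity(current):
    """Return the next buffer capacity: start at 4, then double up to 2048."""
    if current == 0:
        return INITIAL_CAPACITY
    if current >= MAX_CAPACITY:
        # Assumption: growth past the cap is rejected.
        raise MemoryError("template has too many segments")
    return min(current * 2, MAX_CAPACITY)


capacity = 0
steps = []
while capacity < MAX_CAPACITY:
    capacity = next_capacity(capacity)
    steps.append(capacity)
print(steps)  # [4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]
```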

Testing:

Tested on unix (coverage, standard, minimal), windows, webassembly.

New tests in tests/basics/:

  • string_module_tstring.py
  • tstring_basic.py, tstring_parser.py, tstring_errors.py
  • tstring_format.py, tstring_operations.py, tstring_coverage.py

Feature detection in tests/feature_check/tstring.py - automatic skip when MICROPY_PY_TSTRINGS=0.

All existing tests pass.

Trade-offs:

Code size: ~10 KB when enabled, zero when disabled.

String module: the minimal builtin string module contains only templatelib, and it blocks micropython-lib's full string module from being used. Future work: coordinate with micropython-lib.

Config: enabled when MICROPY_CONFIG_ROM_LEVEL >= EXTRA_FEATURES, requires MICROPY_PY_FSTRINGS=1.

To disable: make CFLAGS_EXTRA=-DMICROPY_PY_TSTRINGS=0

@WebReflection
Contributor

For what it's worth, we'd love to have this available at least for the PyScript WASM variant, as this unlocks tons of UI-related use cases we'd like to deliver to our users.

/cc @dpgeorge @ntoll

@koxudaxi koxudaxi changed the title from "py: Add PEP 750 template strings support." to "py: Add PEP 750 template strings support" Jun 24, 2025
@codecov

codecov bot commented Jun 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.43%. Comparing base (ef567dc) to head (d0f0ff6).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #17557      +/-   ##
==========================================
+ Coverage   98.38%   98.43%   +0.05%     
==========================================
  Files         171      174       +3     
  Lines       22300    23038     +738     
==========================================
+ Hits        21939    22677     +738     
  Misses        361      361              

@github-actions

github-actions bot commented Jun 25, 2025

Code size report:

Reference:  tests/basics/string_fstring.py: Test fstring nested replacement fields. [ef567dc]
Comparison: Merge branch 'micropython:master' into feature/pep750-template-strings [merge of d0f0ff6]
  mpy-cross: +7144 +1.884% [incl +128(data)]
   bare-arm:   +16 +0.028% 
minimal x86:  +126 +0.067% 
   unix x64: +14592 +1.695% standard[incl +896(data)]
      stm32:   +72 +0.018% PYBV10
     mimxrt:   +64 +0.017% TEENSY40
        rp2:   +80 +0.009% RPI_PICO_W
       samd:   +72 +0.026% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:   +60 +0.013% VIRT_RV32

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Jun 25, 2025
Member

@dpgeorge dpgeorge left a comment

Thanks for the contribution! This will be nice to have for the webassembly port.

I didn't do a review yet, but there will need to be tests to get full coverage of the new code.

@koxudaxi
Contributor Author

@dpgeorge Thank you for the feedback and for taking time to look at this! I'm glad this will be useful for the webassembly port.

I'll add more tests to ensure full coverage when I find time.

@ntoll
Contributor

ntoll commented Jun 25, 2025

@koxudaxi slightly off topic - but I notice you'll be at EuroPython in Prague, as will I. We should look out for each other and have a coffee or lunch together! 🇪🇺 🐍

@koxudaxi
Contributor Author

@ntoll Sounds good! See you in Prague. ☕

@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch 10 times, most recently from 7a1dc11 to 50dd4d1 Compare July 2, 2025 15:42
@koxudaxi koxudaxi requested a review from dpgeorge July 2, 2025 15:59
@koxudaxi
Contributor Author

koxudaxi commented Jul 2, 2025

@dpgeorge

I didn't do a review yet, but there will need to be tests to get full coverage of the new code.

I tried hard to reach 100% coverage, but some code paths are never called and I cannot cover them. Do you know how to fix this?

@koxudaxi
Contributor Author

koxudaxi commented Jul 2, 2025

@dpgeorge
I'm not very familiar with the micropython codebase, so please let me know if you notice any issues with how I'm handling memory constraints. I think everything looks good, but I'd appreciate your review.

return MP_OBJ_FROM_PTR(result);
}

case MP_BINARY_OP_REVERSE_ADD: {
Member

You can just remove this case, it's not needed.

@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch 4 times, most recently from a567601 to a5ced29 Compare November 17, 2025 16:39
Comment on lines +20 to +22
# Whitespace in expression (MicroPython preserves, CPython strips trailing)
t_ws = t"{ 42 }"
print(f"Whitespace: {str(t_ws)}")
Contributor Author

This behaves differently from CPython, but I'm keeping it unchanged for now since the CPython behavior may be a bug.
I'm checking with the core developers to confirm whether it is.

@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch 5 times, most recently from 1782b36 to 4037aad Compare November 17, 2025 18:58
@koxudaxi
Contributor Author

@AJMansfield
I've addressed your feedback and reorganized the commits. Could you please review it again?

@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch from 29e8816 to bdf9e9e Compare November 26, 2025 14:09
@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch from bdf9e9e to e66ee9a Compare December 5, 2025 17:10
@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch from e66ee9a to 7292a59 Compare December 5, 2025 17:47
@koxudaxi koxudaxi force-pushed the feature/pep750-template-strings branch from 7292a59 to d0b2c33 Compare December 5, 2025 18:12
@koxudaxi
Contributor Author

koxudaxi commented Dec 5, 2025

After taking another look at this PR for the first time in a while, I noticed that coverage was insufficient due to some dead code that had been left behind, so I removed it. However, there now appear to be errors in the Windows environment and the esp32 port that seem unrelated to this change.
Please review when you have time.

@dpgeorge
Member

@koxudaxi I had a good look at this PR and how it parses t-strings.

From what I understand, it creates a new nested parser when it encounters a t-string, in order to parse the expression parts of the t-string.

That seems rather complicated and probably unnecessary, at least for MicroPython's implementation of t-strings.

As far as I understand, t-strings are syntactically equivalent to f-strings (except that the f prefix is a t). So I'm wondering if it would be possible to reuse the f-string parser that we already have?

The way f-strings work at the moment is that the lexer dynamically transforms the incoming f-string bytes into a new set of bytes for a corresponding str-format expression, like this:

f"ab{12}cd{34}"  -->  "ab{}cd{}".format(12, 34)

And then the lexer itself tokenizes the transformed bytes. So the parser only ever sees the RHS of the above, the parser never sees any f-strings.
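
The rewrite described above can be modeled in a few lines of Python. This is a simplified sketch, not the actual code in py/lexer.c: the `rewrite_fstring` name is hypothetical, it works on the text between the quotes, and it assumes fields contain no nested braces, conversions, or format specs.

```python
def rewrite_fstring(body):
    """Rewrite an f-string body into source text for a str.format call."""
    fmt = []   # characters of the new format string
    args = []  # expression texts pulled out of the {...} fields
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "{" and body[i + 1:i + 2] == "{":
            fmt.append("{{")  # escaped brace stays escaped for str.format
            i += 2
        elif ch == "{":
            end = body.index("}", i)  # simplification: no nested braces
            args.append(body[i + 1:end])
            fmt.append("{}")
            i = end + 1
        else:
            fmt.append(ch)
            i += 1
    return '"' + "".join(fmt) + '".format(' + ", ".join(args) + ")"


rewritten = rewrite_fstring("ab{12}cd{34}")
print(rewritten)        # "ab{}cd{}".format(12, 34)
print(eval(rewritten))  # ab12cd34
```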

Could we make it so t-strings worked the same way? Something like:

t"ab{12}cd{34}"  -->  __template__("ab", 12, "cd", 34, "")

That could reuse most of the code for f-string parsing, and make the implementation minimal and quite efficient.
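
A rough Python model of that idea, under the same simplifying assumptions as a delimiter-only scan (no nested braces, conversions, or format specs). Both `rewrite_tstring` and `template_from_args` are hypothetical names. The latter stands in for the proposed `__template__` builtin and only splits the alternating arguments, since the real return type is not specified here:

```python
def rewrite_tstring(body):
    """Rewrite a t-string body into source text for a __template__(...) call."""
    call_args = []
    literal = []
    i = 0
    while i < len(body):
        if body[i] == "{":
            end = body.index("}", i)  # simplification: no nested braces
            call_args.append(repr("".join(literal)))  # literal before the field
            call_args.append(body[i + 1:end])         # expression text, unquoted
            literal = []
            i = end + 1
        else:
            literal.append(body[i])
            i += 1
    call_args.append(repr("".join(literal)))  # trailing literal, possibly empty
    return "__template__(" + ", ".join(call_args) + ")"


def template_from_args(*args):
    """Stand-in for __template__: even positions are strings, odd are values."""
    return args[0::2], args[1::2]


call = rewrite_tstring("ab{12}cd{34}")
print(call)  # __template__('ab', 12, 'cd', 34, '')
strings, values = eval(call, {"__template__": template_from_args})
print(strings, values)  # ('ab', 'cd', '') (12, 34)
```

Note that this form carries the values but not the source text of each expression, so an `.expression` attribute would need extra arguments.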

That doesn't support nested t-strings, but I don't think we need that. We don't support nested f-strings.

@WebReflection
Contributor

@dpgeorge

That doesn't support nested t-strings, but I don't think we need that. We don't support nested f-strings.

... ouch ... for the most common/popular use case, which is X/HTML representation, does that mean that this code would not work either, because you cannot have an f-string within a t-string?

div = html(t"<div>{f"Hello {user}!"}</div>")

This might be a blocker in terms of ergonomics / capabilities, but maybe it won't be the end of the world (as in: one simply factors out all partial f- or t-strings via lambdas and calls it a day), thanks!

@koxudaxi
Contributor Author

@dpgeorge
Thanks for looking at this in detail!
A clarification on "nesting", based on recent discussion around HTML/tdom-style usage with @WebReflection.
There are (at least) two different things people call "nesting":

  1. using an f-string or t-string literal as a normal expression inside a t-string interpolation, e.g. t"<div>{f\"Hello {user}!\"}</div>" or t"outer { t'inner {x}' }" (tdom relies on this pattern)
  2. nesting inside format specs, e.g. t"{value:{width}.{precision}f}"

For tdom/HTML composition, (1) is the non‑negotiable case, while (2) is nice-to-have at best.

Even if we ignore (2) entirely, I don’t think a simple lexer-rewrite approach that only scans for delimiters (similar to the current f-string handling) is robust enough for (1), because inner {...} from the f/t-string literal can interfere with outer boundary detection. That’s why this PR treats the {...} portions as real expressions and parses them accordingly.
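
The boundary-detection concern can be illustrated with two deliberately naive scanners. Neither models MicroPython's actual lexer, and both function names are hypothetical. They only show how inner braces and quotes can mislead a scan that looks at delimiters alone:

```python
def first_close_scan(body):
    """Naive scan: the field ends at the first '}' after its '{'."""
    start = body.index("{")
    end = body.index("}", start)
    return body[start + 1:end]


def depth_scan(body):
    """Brace-counting scan: track nesting depth to find the matching '}'."""
    start = body.index("{")
    depth = 0
    for i in range(start, len(body)):
        if body[i] == "{":
            depth += 1
        elif body[i] == "}":
            depth -= 1
            if depth == 0:
                return body[start + 1:i]
    raise SyntaxError("unterminated field")


body = '<div>{f"Hello {user}!"}</div>'
print(first_close_scan(body))  # f"Hello {user       (cut off too early)
print(depth_scan(body))        # f"Hello {user}!"    (correct here)

# Counting braces is still fooled by a brace inside a plain string literal:
print(depth_scan('{"}"}'))     # "                   (should be "}")
```

Handling the last case requires the scanner to understand string literals inside the expression, which is where a delimiter-only approach starts to approximate a real tokenizer.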
I understand the RAM/flash concerns, especially around retaining Interpolation.expression. To keep this “micro”, I’m limiting enablement to larger ports (unix/windows/pyscript) for now. If we still want this on more constrained targets, we could also gate expression retention behind a build flag/config option.

@dpgeorge
Member

2. nesting inside format specs, e.g. t"{value:{width}.{precision}f}"

Actually, MicroPython does already support this form of nesting in f-strings! We just had a test added for that in #18495. So it's not out of the question that f-strings can support nested f-strings using the current lexer-rewrite approach. And by extension, nested t-strings could also work with that approach.

E.g., the lexer would have both an inject and an fstring_args vstr, where inject holds characters that should be injected into the stream, while fstring_args holds characters that are currently being extracted from the f-string. Then when the f-string parsing is done (by the lexer), it copies fstring_args over to inject (possibly inserting it at the current location in inject to handle the nesting scenario), and then clears fstring_args to be ready for the next one.
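
A small Python model of the two-buffer idea, written as an illustration rather than a port of the lexer. The `rewrite_source` function is hypothetical, and it assumes the nested f-string uses a different quote character than the outer one, that fields have no format specs or `{{` escapes, and that any `f` followed by a quote starts an f-string:

```python
from collections import deque


def rewrite_source(source):
    """Rewrite f-strings to str.format calls, re-lexing injected text."""
    src = deque(source)  # unread source characters
    inject = deque()     # characters to be read before src
    out = []

    def next_char():
        if inject:
            return inject.popleft()
        return src.popleft() if src else None

    def peek():
        if inject:
            return inject[0]
        return src[0] if src else None

    while (ch := next_char()) is not None:
        if ch == "f" and peek() in ("'", '"'):
            quote = next_char()
            out.append(quote)
            fstring_args = []  # expressions extracted from this f-string
            while (ch := next_char()) != quote:
                if ch is None:
                    raise SyntaxError("unterminated f-string")
                if ch != "{":
                    out.append(ch)
                    continue
                out.append("{}")
                depth, expr = 1, []
                while True:  # collect up to the matching '}'
                    ch = next_char()
                    if ch is None:
                        raise SyntaxError("unterminated field")
                    if ch == "{":
                        depth += 1
                    elif ch == "}":
                        depth -= 1
                        if depth == 0:
                            break
                    expr.append(ch)
                fstring_args.append("".join(expr))
            out.append(quote)
            # Insert at the front of inject, so the arguments are lexed next
            # and any nested f-string inside them is rewritten in turn.
            call = ".format(" + ", ".join(fstring_args) + ")"
            inject.extendleft(reversed(call))
        else:
            out.append(ch)
    return "".join(out)


print(rewrite_source("f\"a{f'b{x}c'}d\""))  # "a{}d".format('b{}c'.format(x))
```

In this model a single `fstring_args` buffer is enough, because the outer f-string is fully consumed before the injected arguments are lexed.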

I understand the RAM/flash concerns, especially around retaining Interpolation.expression. To keep this “micro”, I’m limiting enablement to larger ports (unix/windows/pyscript) for now. If we still want this on more constrained targets, we could also gate expression retention behind a build flag/config option.

IMO the .expression attribute has limited use, and would consume quite a bit of RAM. So might be good to have a compile-time option to turn that off.

@dpgeorge
Member

For the record, the f-string feature currently costs about 470 bytes (on ARM Cortex-M), and this t-string implementation costs about 7400 bytes (on ARM Cortex-M, including the frozen code for string.templatelib).

@dpgeorge
Member

In #18588 I modified the existing f-string parser to support nested f-strings, with nesting to arbitrary depth. I think the same algorithm could now be used for a t-string parser.

6 participants