Fix GIL=0 segfault and Add GIL=0 compat for regex paths #41329
base: main
Conversation
Rocketknight1 left a comment
This seems like a really cool idea! I made one comment about locked, but it makes a lot of sense otherwise.
One potential simplification, though, is that in a lot of cases we import regex when really we only need re; this is a leftover from a time when the built-in re was significantly behind the third-party regex lib. What's the thread-safety status of the built-in re?
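For context, a quick way to smoke-test the built-in `re` under free-threading is to hammer one shared compiled pattern from several threads, the same shape as the crash repro used against `regex` (a sketch only; surviving it is evidence, not proof, of safety):

```python
import re
import threading

# One compiled pattern shared by all threads.
PATTERN = re.compile(r"(Transformers|transformers|models|Models)")
TEXT = "Transformers by Hugging Face enable models across tasks."

def worker(results, slot, loops=10_000):
    count = 0
    for _ in range(loops):
        if PATTERN.search(TEXT) is not None:
            count += 1
    results[slot] = count

results = [0] * 8
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread should see a match on every iteration.
print(sum(results))  # 80000
```

Run under `PYTHON_GIL=0` on a free-threaded build, a crash or a wrong total would indicate the same class of problem `regex` exhibits.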
ArthurZucker left a comment
Hey! Nice PR, we are removing all of these files in favor of just using tokenizers as the backend, and having one sentencepiece_wrapper file, from which we will use your safe import!
But happy to review / merge in the meantime!
@Rocketknight1 @ArthurZucker Ready for re-review/review. Changes since last review:
Got it! It seems good now, but can we confirm the thread-safety status of the built-in `re`? Of course, if the built-in `re` turns out to be safe under GIL=0, we could migrate to it instead.
@Rocketknight1 I believe a move to the built-in `re` is viable. Based on the added test (the same one used in this PR), `re` did not crash under GIL=0.

The benchmark only tested 2 regex match patterns: simple + named-group matching. Conclusion: Transformers should move to `re` where patterns allow it.

(vm314t) root@gpu-base:~/transformers# PYTHON_GIL=0 python benchmark_re_vs_regex.py
Single-thread loops per matcher: 10000
Threaded loops: 8 threads x 10000 loops (PYTHON_GIL=0)
Pattern case: simple
expr: (Transformers|transformers|models|Models)
Single-thread
Library Time (s) Ops/s Matches Total Ops
----------- ---------- --------- --------- -----------
re 0.0057 1,752,847 10000 10000
regex 0.0592 169,017 10000 10000
safe.regex 0.0606 165,077 10000 10000
pcre2 0.0148 677,142 10000 10000
pcre2 (jit) 0.0146 682,972 10000 10000
Threaded
Library Time (s) Ops/s Matches Total Ops
----------- ---------- ------- --------- -----------
re 0.1749 457,282 80000 80000
safe.regex 1.8877 42,381 80000 80000
pcre2 0.3338 239,667 80000 80000
pcre2 (jit) 0.3201 249,944 80000 80000
Pattern case: named-groups
expr: (?P<head>\w+)\s+(?P<tail>\w+)
Single-thread
Library Time (s) Ops/s Matches Total Ops
----------- ---------- ------- --------- -----------
re 0.0125 802,368 10000 10000
regex 0.0793 126,103 10000 10000
safe.regex 0.0668 149,606 10000 10000
pcre2 0.0152 660,057 10000 10000
pcre2 (jit) 0.015 665,377 10000 10000
Threaded
Library Time (s) Ops/s Matches Total Ops
----------- ---------- ------- --------- -----------
re 0.2172 368,257 80000 80000
safe.regex 1.826 43,811 80000 80000
pcre2 0.2245 356,328 80000 80000
pcre2 (jit) 0.2467 324,233 80000 80000Benchmark code: #!/usr/bin/env python
"""Micro benchmark comparing stdlib re with transformers.utils.safe.regex and pcre2"""
import os
import re
import threading
import time
from typing import Callable, Sequence, Tuple
from tabulate import tabulate
from transformers.utils.safe import regex as safe_regex
import regex
import pcre2
SINGLE_THREAD_LOOPS = 10_000
THREAD_LOOPS = 10_000
THREAD_COUNT = 8
SIMPLE_PATTERN = r"(Transformers|transformers|models|Models)"
SIMPLE_TEXT = "Transformers by Hugging Face enable models across tasks and devices."
COMPLEX_PATTERN = r"(?P<head>\w+)\s+(?P<tail>\w+)"
COMPLEX_TEXT = "Transformers models"
def string_matcher_factory(
match_func: Callable[[str, str], object]
) -> Callable[[str, str], Callable[[], bool]]:
def factory(pattern: str, text: str) -> Callable[[], bool]:
def _match() -> bool:
return match_func(pattern, text) is not None
return _match
return factory
def pcre2_compiled_factory(jit: bool) -> Callable[[str, str], Callable[[], bool]]:
def factory(pattern: str, text: str) -> Callable[[], bool]:
compiled = _get_pcre2_pattern(pattern, jit=jit)
match = compiled.match
def _match() -> bool:
return match(text) is not None
return _match
return factory
def benchmark_single(label: str, match_callable: Callable[[], bool]):
start = time.perf_counter()
matches = 0
for _ in range(SINGLE_THREAD_LOOPS):
matches += match_callable()
duration = time.perf_counter() - start
ops_per_sec = SINGLE_THREAD_LOOPS / duration if duration else float("inf")
return duration, ops_per_sec, matches
def benchmark_threaded(label: str, match_callable: Callable[[], bool]):
total_matches = [0] * THREAD_COUNT
def worker(slot: int) -> None:
local_match = match_callable
count = 0
for _ in range(THREAD_LOOPS):
count += local_match()
total_matches[slot] = count
threads = [threading.Thread(target=worker, args=(idx,), name=f"{label}-worker-{idx}") for idx in range(THREAD_COUNT)]
start = time.perf_counter()
for thread in threads:
thread.start()
for thread in threads:
thread.join()
duration = time.perf_counter() - start
total_ops = THREAD_COUNT * THREAD_LOOPS
total = sum(total_matches)
ops_per_sec = total_ops / duration if duration else float("inf")
return duration, ops_per_sec, total
MatcherFactory = Callable[[str, str], Callable[[], bool]]
_PCRE2_CACHE: dict[tuple[str, bool], pcre2.Pattern] = {}
_PCRE2_CACHE_LOCK = threading.Lock()
def _get_pcre2_pattern(pattern: str, *, jit: bool) -> pcre2.Pattern:
key = (pattern, jit)
cached = _PCRE2_CACHE.get(key)
if cached is not None:
return cached
with _PCRE2_CACHE_LOCK:
cached = _PCRE2_CACHE.get(key)
if cached is None:
cached = pcre2.compile(pattern, jit=jit)
_PCRE2_CACHE[key] = cached
return cached
SINGLE_THREAD_LIBRARIES: Sequence[Tuple[str, MatcherFactory]] = (
("re", string_matcher_factory(re.match)),
("regex", string_matcher_factory(regex.match)),
("safe.regex", string_matcher_factory(safe_regex.match)),
("pcre2", pcre2_compiled_factory(jit=False)),
("pcre2 (jit)", pcre2_compiled_factory(jit=True)),
)
THREADED_LIBRARIES: Sequence[Tuple[str, MatcherFactory]] = (
("re", string_matcher_factory(re.match)),
("safe.regex", string_matcher_factory(safe_regex.match)),
("pcre2", pcre2_compiled_factory(jit=False)),
("pcre2 (jit)", pcre2_compiled_factory(jit=True)),
)
PATTERN_CASES: Sequence[Tuple[str, str, str]] = (
("simple", SIMPLE_PATTERN, SIMPLE_TEXT),
("named-groups", COMPLEX_PATTERN, COMPLEX_TEXT),
)
if __name__ == "__main__":
print(f"Single-thread loops per matcher: {SINGLE_THREAD_LOOPS}")
total_thread_ops = THREAD_COUNT * THREAD_LOOPS
os.environ["PYTHON_GIL"] = "0"
print(
f"Threaded loops: {THREAD_COUNT} threads x {THREAD_LOOPS} loops (PYTHON_GIL={os.environ['PYTHON_GIL']})"
)
for case_label, pattern, text in PATTERN_CASES:
print()
print(f"Pattern case: {case_label}")
print(f" expr: {pattern}")
single_rows = []
for label, matcher_factory in SINGLE_THREAD_LIBRARIES:
duration, ops_per_sec, matches = benchmark_single(
label, matcher_factory(pattern, text)
)
single_rows.append(
[label, f"{duration:.4f}", f"{ops_per_sec:,.0f}", matches, SINGLE_THREAD_LOOPS]
)
print("\n\nSingle-thread")
print(
tabulate(
single_rows,
headers=["Library", "Time (s)", "Ops/s", "Matches", "Total Ops"],
)
)
threaded_rows = []
for label, matcher_factory in THREADED_LIBRARIES:
duration, ops_per_sec, matches = benchmark_threaded(
label, matcher_factory(pattern, text)
)
threaded_rows.append(
[
label,
f"{duration:.4f}",
f"{ops_per_sec:,.0f}",
matches,
total_thread_ops,
]
)
print("\n\nThreaded")
print(
tabulate(
threaded_rows,
headers=["Library", "Time (s)", "Ops/s", "Matches", "Total Ops"],
)
)
@Rocketknight1 @ArthurZucker Existing tokenizer code uses `regex`-only syntax, e.g.:

# tokenization_qwen2.py
import regex as re
...
PRETOKENIZE_REGEX = r"""(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+"""

Files using `\p{...}` Unicode property escapes like this cannot be switched to the built-in `re` as-is.
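As a quick illustration (a hypothetical snippet, not from the PR): the stdlib `re` rejects `\p{...}` property escapes outright, so patterns like the Qwen2 pre-tokenizer regex fail at compile time rather than silently behaving differently:

```python
import re

# Stdlib re treats \p as an unknown escape and raises re.error.
try:
    re.compile(r"\p{L}+")
    print("compiled (unexpected)")
except re.error as exc:
    print(f"stdlib re rejected the pattern: {exc}")

# The third-party regex lib accepts the same pattern (if installed):
# import regex
# regex.match(r"\p{L}+", "Transformers")  # -> a Match object
```

This is why the migration to `re` can only cover files whose patterns avoid `regex`-specific syntax.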
@pytest.mark.xfail(strict=False, reason="Compiled regex still crashes under PYTHON_GIL=0")
def test_compiled_regex_thread_safety_crashes_under_gil0():
    script = _regex_thread_script(
        imports="import regex",
        setup_code="""
compiled_pattern = regex.compile(pattern_text)

def match_once():
    return compiled_pattern.match(text_to_match)
""",
    )

    result, message = _run_regex_thread_script("compiled regex", script)

    if result.returncode == 0:
        pytest.fail("compiled regex unexpectedly behaved thread-safely\n" + message)

    if result.returncode == -11:
        message += "\nProcess terminated with SIGSEGV (Segmentation fault)."

    if message:
        sys.stderr.write(message + "\n")
        sys.stderr.flush()

    pytest.fail(message)
Added a new test to show that regex crashes under GIL=0 even when the pattern does not go through the caching mechanism. This is strange and unexpected. It appears the regex lib is fully unsafe: each compiled pattern may be holding, or sharing with other patterns, package-level persistent buffers or a runtime stack.
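A standalone repro along these lines (a sketch of the scenario the added test exercises; run with `PYTHON_GIL=0` on a free-threaded build to observe the crash — on a regular GIL build it completes normally). The `re` fallback is only there so the sketch runs where `regex` is not installed:

```python
import threading

try:
    import regex
except ImportError:  # illustration only: fall back to stdlib re
    import re as regex

# One shared compiled pattern, hammered from many threads -- no caching involved.
compiled = regex.compile(r"(?P<head>\w+)\s+(?P<tail>\w+)")
text = "Transformers models"

def worker():
    for _ in range(10_000):
        compiled.match(text)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("completed without crash (GIL build)")
```

Under a GIL this finishes cleanly; under free-threading, the shared match state inside `regex` is what segfaults.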
Updated the benchmark to add `pcre` (python-pcre):

Single-thread loops per matcher: 10000
Threaded loops: 8 threads x 10000 loops (PYTHON_GIL=0)
Pattern case: simple
expr: (Transformers|transformers|models|Models)
Single-thread
Library Time (s) Ops/s Matches Total Ops
----------- ---------- --------- --------- -----------
re 0.0059 1,683,494 10000 10000
regex 0.069 144,929 10000 10000
safe.regex 0.0623 160,490 10000 10000
pcre 0.0055 1,827,192 10000 10000
pcre (jit) 0.0057 1,759,222 10000 10000
pcre2 0.0159 629,413 10000 10000
pcre2 (jit) 0.0145 690,844 10000 10000
Threaded
Library Time (s) Ops/s Matches Total Ops
----------- ---------- --------- --------- -----------
re 0.109 734,058 80000 80000
safe.regex 1.7284 46,286 80000 80000
pcre 0.0528 1,516,316 80000 80000
pcre (jit) 0.0544 1,471,912 80000 80000
pcre2 0.1472 543,591 80000 80000
pcre2 (jit) 0.1346 594,521 80000 80000
Pattern case: named-groups
expr: (?P<head>\w+)\s+(?P<tail>\w+)
Single-thread
Library Time (s) Ops/s Matches Total Ops
----------- ---------- --------- --------- -----------
re 0.0193 517,076 10000 10000
regex 0.0825 121,173 10000 10000
safe.regex 0.0619 161,673 10000 10000
pcre 0.0055 1,815,004 10000 10000
pcre (jit) 0.0064 1,571,113 10000 10000
pcre2 0.0167 598,186 10000 10000
pcre2 (jit) 0.0153 653,871 10000 10000
Threaded
Library Time (s) Ops/s Matches Total Ops
----------- ---------- --------- --------- -----------
re 0.1167 685,416 80000 80000
safe.regex 1.7527 45,643 80000 80000
pcre 0.0528 1,515,403 80000 80000
pcre (jit) 0.0517 1,546,954 80000 80000
pcre2 0.123 650,669 80000 80000
pcre2 (jit) 0.1326     603,108    80000       80000

Updated benchmark.py:

#!/usr/bin/env python
"""Micro benchmark comparing stdlib re with transformers.utils.safe.regex."""
import os
import re
import threading
import time
from typing import Callable, Sequence, Tuple
from tabulate import tabulate
from transformers.utils.safe import regex as safe_regex
import regex
import pcre2
try:
import pcre
except ImportError as exc: # pragma: no cover - optional dependency
raise ImportError(
"python-pcre is required for benchmark_re_vs_regex.py; install it with `pip install python-pcre`."
) from exc
SINGLE_THREAD_LOOPS = 10_000
THREAD_LOOPS = 10_000
THREAD_COUNT = 8
SIMPLE_PATTERN = r"(Transformers|transformers|models|Models)"
SIMPLE_TEXT = "Transformers by Hugging Face enable models across tasks and devices."
COMPLEX_PATTERN = r"(?P<head>\w+)\s+(?P<tail>\w+)"
COMPLEX_TEXT = "Transformers models"
def string_matcher_factory(
match_func: Callable[[str, str], object]
) -> Callable[[str, str], Callable[[], bool]]:
def factory(pattern: str, text: str) -> Callable[[], bool]:
def _match() -> bool:
return match_func(pattern, text) is not None
return _match
return factory
def pcre2_compiled_factory(jit: bool) -> Callable[[str, str], Callable[[], bool]]:
def factory(pattern: str, text: str) -> Callable[[], bool]:
compiled = _get_pcre2_pattern(pattern, jit=jit)
match = compiled.match
def _match() -> bool:
return match(text) is not None
return _match
return factory
def benchmark_single(label: str, match_callable: Callable[[], bool]):
start = time.perf_counter()
matches = 0
for _ in range(SINGLE_THREAD_LOOPS):
matches += match_callable()
duration = time.perf_counter() - start
ops_per_sec = SINGLE_THREAD_LOOPS / duration if duration else float("inf")
return duration, ops_per_sec, matches
def benchmark_threaded(label: str, match_callable: Callable[[], bool]):
total_matches = [0] * THREAD_COUNT
def worker(slot: int) -> None:
local_match = match_callable
count = 0
for _ in range(THREAD_LOOPS):
count += local_match()
total_matches[slot] = count
threads = [threading.Thread(target=worker, args=(idx,), name=f"{label}-worker-{idx}") for idx in range(THREAD_COUNT)]
start = time.perf_counter()
for thread in threads:
thread.start()
for thread in threads:
thread.join()
duration = time.perf_counter() - start
total_ops = THREAD_COUNT * THREAD_LOOPS
total = sum(total_matches)
ops_per_sec = total_ops / duration if duration else float("inf")
return duration, ops_per_sec, total
MatcherFactory = Callable[[str, str], Callable[[], bool]]
_PCRE_CACHE: dict[tuple[str, bool], pcre.Pattern] = {}
_PCRE_CACHE_LOCK = threading.Lock()
_PCRE2_CACHE: dict[tuple[str, bool], pcre2.Pattern] = {}
_PCRE2_CACHE_LOCK = threading.Lock()
def _get_pcre_pattern(pattern: str, *, jit: bool) -> pcre.Pattern:
key = (pattern, jit)
cached = _PCRE_CACHE.get(key)
if cached is not None:
return cached
with _PCRE_CACHE_LOCK:
cached = _PCRE_CACHE.get(key)
if cached is None:
cached = pcre.compile(pattern)
if jit:
cached.study(pcre.STUDY_JIT)
if hasattr(cached, "set_jit_stack"):
cached.set_jit_stack(32 * 1024, 512 * 1024)
_PCRE_CACHE[key] = cached
return cached
def _get_pcre2_pattern(pattern: str, *, jit: bool) -> pcre2.Pattern:
key = (pattern, jit)
cached = _PCRE2_CACHE.get(key)
if cached is not None:
return cached
with _PCRE2_CACHE_LOCK:
cached = _PCRE2_CACHE.get(key)
if cached is None:
cached = pcre2.compile(pattern, jit=jit)
_PCRE2_CACHE[key] = cached
return cached
def pcre_compiled_factory(jit: bool) -> Callable[[str, str], Callable[[], bool]]:
def factory(pattern: str, text: str) -> Callable[[], bool]:
compiled = _get_pcre_pattern(pattern, jit=jit)
match = compiled.match
def _match() -> bool:
return match(text) is not None
return _match
return factory
SINGLE_THREAD_LIBRARIES: Sequence[Tuple[str, MatcherFactory]] = (
("re", string_matcher_factory(re.match)),
("regex", string_matcher_factory(regex.match)),
("safe.regex", string_matcher_factory(safe_regex.match)),
("pcre", pcre_compiled_factory(jit=False)),
("pcre (jit)", pcre_compiled_factory(jit=True)),
("pcre2", pcre2_compiled_factory(jit=False)),
("pcre2 (jit)", pcre2_compiled_factory(jit=True)),
)
THREADED_LIBRARIES: Sequence[Tuple[str, MatcherFactory]] = (
("re", string_matcher_factory(re.match)),
("safe.regex", string_matcher_factory(safe_regex.match)),
("pcre", pcre_compiled_factory(jit=False)),
("pcre (jit)", pcre_compiled_factory(jit=True)),
("pcre2", pcre2_compiled_factory(jit=False)),
("pcre2 (jit)", pcre2_compiled_factory(jit=True)),
)
PATTERN_CASES: Sequence[Tuple[str, str, str]] = (
("simple", SIMPLE_PATTERN, SIMPLE_TEXT),
("named-groups", COMPLEX_PATTERN, COMPLEX_TEXT),
)
if __name__ == "__main__":
print(f"Single-thread loops per matcher: {SINGLE_THREAD_LOOPS}")
total_thread_ops = THREAD_COUNT * THREAD_LOOPS
os.environ["PYTHON_GIL"] = "0"
print(
f"Threaded loops: {THREAD_COUNT} threads x {THREAD_LOOPS} loops (PYTHON_GIL={os.environ['PYTHON_GIL']})"
)
for case_label, pattern, text in PATTERN_CASES:
print()
print(f"Pattern case: {case_label}")
print(f" expr: {pattern}")
single_rows = []
for label, matcher_factory in SINGLE_THREAD_LIBRARIES:
duration, ops_per_sec, matches = benchmark_single(
label, matcher_factory(pattern, text)
)
single_rows.append(
[label, f"{duration:.4f}", f"{ops_per_sec:,.0f}", matches, SINGLE_THREAD_LOOPS]
)
print("\n\nSingle-thread")
print(
tabulate(
single_rows,
headers=["Library", "Time (s)", "Ops/s", "Matches", "Total Ops"],
)
)
threaded_rows = []
for label, matcher_factory in THREADED_LIBRARIES:
duration, ops_per_sec, matches = benchmark_threaded(
label, matcher_factory(pattern, text)
)
threaded_rows.append(
[
label,
f"{duration:.4f}",
f"{ops_per_sec:,.0f}",
matches,
total_thread_ops,
]
)
print("\n\nThreaded")
print(
tabulate(
threaded_rows,
headers=["Library", "Time (s)", "Ops/s", "Matches", "Total Ops"],
)
)
Interesting, and thank you for the comprehensive benchmarks! I'd have to discuss with the team, but I suspect our stance is that we want to support free-threading, but not at the cost of other PRs that have a high chance of breaking things. If we can do a drop-in replacement for `regex`, that would be ideal. That said, regex performance is usually not a bottleneck for tokenization, let alone LLM inference in general, so some slowdown from the replacement is probably acceptable.
Agreed. That's for a future (potential) PR and can be discussed in more detail later. I actually went over all the PCRE wrappers and/or compat regex engines for Python, and each has problems or issues that I want to address in my own pcre pkg. They either are unmaintained, compile PCRE from source (ouch), and/or have usability/compat or GIL=0 issues. Let's return our focus to this simpler PR instead, which has no compat issues but inherits all the slowness of `regex`.
Yeah - based on your benchmarks (which were really helpful, thank you!), I think a good series of PRs would be:
I kind of changed my mind a bit - although I said initially that keeping existing patterns intact made more sense, big performance gains and removing the need for thread-safe wrappers might be worth altering a few regexes.
Either way, I think this is almost ready! @Qubitium can you check the CI errors? I think in some cases models have relative imports, but they're in deprecated tokenizers. Also, since this is a library-wide change that adds a wrapper to `regex`, it will need core maintainer review as well.
Relative-path imports for the 2 deprecated tokenizers are fixed.
@ArthurZucker @Cyrilvallez Ready for reviews. This PR is a stop-gap, short-term solution.
Hold your thoughts. Just released `pypcre`.
Very interesting, but we might be wary of depending on a library that new, until we're confident that it's going to be maintained over time! Still, if it's straightforward bindings to a well-maintained lib and it solves all the regex problems, we might consider it. Definitely leaving that one to the core maintainers, though 😅
Challenge accepted, lol. pypcre v0.2.0 just dropped. Not only is it fully GIL=0 compliant, it is also much faster than anything else out there. The more threads you use, the bigger our advantage. Btw, I know this is getting off-topic, but since it is technically relevant I will drop this: after looking, both …

However, despite …
Hey! After some internal discussion, people aren't ready to depend on `pypcre` yet. I think the plan we want is this series of PRs instead:
Even at the cost of performance in some cases, depending on the internal lib maximizes future support and minimizes the maintenance and dependency headaches we have to worry about later.
Challenge accepted. Now is not the time or place to fight this good fight, but Transformers will come back to pypcre. =) The time will come sooner rather than later. I will sit on the rock like Master Turtle and wait for the tea leaves to drop (they are about to).
Agreed. This is the best short term fix.
I would not recommend this. On another tangent, but GIL=0 related: ALL of the Transformers models that use triton kernels are not GIL=0 safe. I opened Fix autotune thread safety (crash) under GIL=0, triton PR #8437, so if you can help secondarily review/validate that PR for me, that would be great too, as it affects transformers as much as triton. Newer models like qwen3-next recommend flash linear attention and demand triton. Also, any model that uses triton kernels with autotune enabled is affected. @Rocketknight1 Can you do a final review on your end and check if there is anything else needed for this PR? I see that the GitHub status shows you have been requested as a reviewer and have completed your review of this PR. I know others also need to review this, but it is good to get as many reviewers onboard in the meantime.
Hmmn, I'm unsure! I think we'd like to pause this PR for the moment and start by moving regexes to the built-in `re` where possible. Would you be willing to leave this for the moment and make a separate PR to switch files to `re`?
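A mechanical first pass for such a migration PR could be a small helper that reports whether a pattern is usable with the stdlib as-is (an illustrative sketch; the helper name and the hint list are made up, not part of this PR):

```python
import re

# Constructs the third-party regex lib supports but stdlib re does not
# (an incomplete, illustrative list).
REGEX_ONLY_HINTS = (r"\p{", r"\P{", "(?V1)", "(?R)")

def is_stdlib_compatible(pattern: str) -> bool:
    """Return True if the stdlib `re` can compile the pattern unchanged."""
    if any(hint in pattern for hint in REGEX_ONLY_HINTS):
        return False
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

print(is_stdlib_compatible(r"(?P<head>\w+)\s+(?P<tail>\w+)"))  # True
print(is_stdlib_compatible(r"\p{L}+"))                         # False
```

Running this over each tokenizer's patterns would split the files into "switch to `re` now" and "needs a safe wrapper or rewritten pattern" buckets.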
[For maintainers] Suggested jobs to run (before merge) run-slow: bart, bertweet, blenderbot, blenderbot_small, clip, clvp, codegen, ctrl, deberta, deepseek_vl, deepseek_vl_hybrid, depth_pro, fastspeech2_conformer, got_ocr2, gpt2, gpt_oss |
Sure. Let's pause this PR until you guys decide. Please assign someone to do the review when ready.
What does this PR do?
Adds (Full) Regex and (Partial) Tokenization GIL=0 free-threading support. Tested up to Python 3.14T.
In simple terms, Transformers code that relies on regex will segfault under true concurrency. I have confirmed with the regex maintainer that the latest regex package, while importable under GIL=0, is not GIL=0 safe. This is corroborated by the unit test submitted in this PR, which demonstrates the segfault. regex caches and reuses compiled patterns internally while executing them, which makes it inherently unsafe without the GIL.
The core idea is not to make the entire Transformers library GIL=0 safe all at once, but rather to adapt it piece by piece until everything is as GIL=0 compatible as possible.
This PR wraps existing regex calls in a thread-locked, serialized execution pipeline. Files that import regex only need to adjust their import path to

from utils.safe import regex

to minimize migration pain. The current safe wrapper is modular and can be extended to cover additional modules in the future as needed or as issues are discovered.
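The locked-wrapper idea can be sketched as a thin module that serializes calls into `regex` behind a single lock (a simplified illustration of the approach, not the PR's actual `utils/safe.py`; the stdlib fallback is only so the sketch runs anywhere):

```python
import threading

try:
    import regex as _regex
except ImportError:  # illustration only: fall back to stdlib re
    import re as _regex

_LOCK = threading.RLock()

def _locked(func):
    def wrapper(*args, **kwargs):
        # Serialize every call so no two threads run regex code at once.
        with _LOCK:
            return func(*args, **kwargs)
    return wrapper

# Expose the common module-level entry points behind the lock.
match = _locked(_regex.match)
search = _locked(_regex.search)
findall = _locked(_regex.findall)
compile = _locked(_regex.compile)  # noqa: A001 - mirrors the regex API

print(match(r"(?P<head>\w+)", "Transformers").group("head"))
```

One caveat the real wrapper has to handle: `compile` here only locks the compilation call itself; the returned pattern's methods would also need locking (e.g. via a locked proxy object) to be safe.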
Since Transformers evolves more rapidly than most packages, introducing a local wrapper may be a pragmatic step while waiting for an upstream fix, which may be delayed. Additionally, many if not most Python APIs are not inherently thread-safe, so applying local wrappers is often necessary beyond just external packages.
Two new unit test files are included:
- A non-crashing test suite that proves the code works under GIL=0 with thread load.
- A crash-demonstration test that proves regex code paths segfault without the added protection under thread load.

Files added:
- utils/safe.py
- tests/utils/test_safe.py
- tests/utils/test_safe_crash.py
Who can review?
@ArthurZucker @itazap @SunMarc @Cyrilvallez @gante @ydshieh @stevhliu