
test(dialect): strengthen compression_stats and count_tokens assertions#150

Closed
mvalentsev wants to merge 1 commit into MemPalace:develop from mvalentsev:fix/dialect-test-assertions

Conversation

@mvalentsev
Contributor

@mvalentsev mvalentsev commented Apr 7, 2026

What does this PR do?

The main honest-stats test fix (ratio -> size_ratio, compressed_chars -> summary_chars, len // 3 -> word-based count_tokens) already landed in main through @igorls' PR series. This PR now just adds a couple of assertions on top of that fix so the regression surface is a bit smaller.

  • test_stats also asserts stats["original_tokens_est"] > stats["summary_tokens_est"]. That catches a class of regressions where the token estimator flattens to a constant, which would otherwise slip through the existing char-level assertion.
  • test_count_tokens gains edge cases for the empty string and the single-word input. Both exercise the max(1, ...) guard in Dialect.count_tokens, so a future refactor that drops the guard fails loudly instead of silently returning 0.
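As a sketch of the contract these edge cases pin down, here is a minimal stand-in reconstructed from the description above (the real `Dialect.count_tokens` lives in MemPalace and may differ in detail):

```python
# Minimal stand-in for Dialect.count_tokens, reconstructed from the
# contract described in this PR: a word-based estimate (~1.3 tokens
# per word) with a max(1, ...) floor. Hypothetical sketch only.

def count_tokens(text: str) -> int:
    word_count = len(text.split())
    # Floor of 1 so empty input never reports zero tokens.
    return max(1, int(word_count * 1.3))

# The new edge cases exercise the floor directly:
assert count_tokens("") == 1             # no words -> floor kicks in
assert count_tokens("one") == 1          # int(1 * 1.3) == 1
assert count_tokens("hello world") == 2  # int(2 * 1.3) == 2
```

A refactor that drops the `max(1, ...)` guard makes the first two assertions fail immediately, which is exactly the loud failure the PR aims for.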

Originally opened before I saw that the core rename fix was already in flight. After rebasing on the merged fix, the remaining diff is a small additive coverage improvement.

How to test

python -m pytest tests/test_dialect.py -v
ruff check tests/test_dialect.py
ruff format --check tests/test_dialect.py

18 tests pass locally.

Checklist

  • Tests pass
  • Linter passes
  • Python 3.9 compatible

@adv3nt3
Contributor

adv3nt3 commented Apr 7, 2026

@bensig @proxysoul PR #147 changed the compression_stats API (ratio → size_ratio, compressed_chars → summary_chars, word-based count_tokens) but the tests from PR #131 weren't updated to match — this broke CI on main (68e3414). PR #150 is a 6-line fix for exactly those 2 stale assertions. Currently blocking CI for all open PRs.

@mvalentsev mvalentsev force-pushed the fix/dialect-test-assertions branch from 79e1647 to 794dc81 on April 7, 2026 at 23:28
@mvalentsev mvalentsev changed the title from "test(dialect): update assertions for new honest-stats API" to "test(dialect): strengthen compression_stats and count_tokens assertions" on Apr 7, 2026
@mvalentsev
Contributor Author

Update: the core rename fix landed in main via @igorls' PR series, so this is no longer a CI blocker. Rebased on the merged fix, and the remaining diff is just two extra assertions — a strictness check on original_tokens_est > summary_tokens_est in test_stats, and empty / single-word edge cases for the max(1, ...) guard in test_count_tokens. Retitled and updated the description to reflect the new scope.

@mvalentsev mvalentsev force-pushed the fix/dialect-test-assertions branch 7 times, most recently from 43d0e18 to 25e691e on April 10, 2026 at 15:49
@mvalentsev mvalentsev force-pushed the fix/dialect-test-assertions branch from 25e691e to 9c0da51 on April 10, 2026 at 16:40

@web3guru888 web3guru888 left a comment


PR #150 — test(dialect): strengthen compression_stats and count_tokens assertions

This is a small, focused test hardening PR that adds meaningful coverage on top of the existing assertion suite. The context in the PR description is helpful: the ratio → size_ratio / compressed_chars → summary_chars rename landed via @igorls' series, and this PR bolts on the regression guards that should have accompanied that rename from the start.

What's done well:

The original_tokens_est > summary_tokens_est assertion in test_stats is the right catch. The risk it guards against — a token estimator that collapses to a constant (e.g. always returns 1 due to a bad max() guard or wrong heuristic multiplier) — would be completely invisible to the existing char-level assertion. It's a silent failure mode that could persist through several refactor cycles without this.

The three count_tokens edge cases ("", "one", "hello world") directly exercise the max(1, int(word_count * 1.3)) contract. The inline comment explaining the "hello world" → 2 calculation is a nice touch — it makes the expected value self-documenting for future readers rather than leaving them to mentally trace the heuristic. Not every test PR does this.

Issues / concerns:

One minor flakiness risk: test_stats creates a compression pair and then asserts original_tokens_est > summary_tokens_est. If Dialect.compress() ever produces a summary longer than the original (degenerate case on very short inputs — e.g., a 2-word input that gets a 5-word summary header prepended), the assertion fails for the wrong reason. It might be worth using a clearly long fixture string (50+ words) so the compressed form is unambiguously shorter. The current fixture text is not shown in the diff, so I can't verify it's safe — worth double-checking.
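The fixture-length check the reviewer suggests could look like this (a hypothetical sketch; the actual fixture text in test_stats is not shown in the diff):

```python
# Hypothetical fixture sketch, not the actual MemPalace test data:
# a clearly long input (50+ words) so any reasonable summary is
# unambiguously shorter than the original, making the strict
# original_tokens_est > summary_tokens_est assertion deterministic
# rather than dependent on degenerate short-input behavior.
LONG_FIXTURE = " ".join(f"word{i} appears here repeatedly" for i in range(15))

assert len(LONG_FIXTURE.split()) >= 50  # 15 * 4 = 60 words
```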

The count_tokens("") == 1 case is correct per the max(1, ...) guard, but it implicitly documents that empty strings always return 1. If the semantics ever change (e.g., returning 0 for empty input becomes intentional), the test will catch it — but it might be surprising. A one-line comment explaining "empty string returns minimum 1 by design" would clarify the intent vs. a floor that was accidentally left in.

Overall verdict:

LGTM — this is exactly the kind of regression guard that makes a codebase durable through refactors. The token estimator path is load-bearing for anything that uses compression_stats to make throttling or batching decisions, so having explicit assertions on it is worthwhile. The edge-case coverage on count_tokens is appropriate for a max(1, ...) guard. Minor suggestion: verify the fixture text in test_stats is long enough that summary_tokens_est < original_tokens_est is guaranteed, not probabilistic.


Reviewed by MemPalace-AGI — autonomous research system with perfect memory

@web3guru888

Small but correct. The two new assertions are the right additions:

  • assert stats["original_tokens_est"] > stats["summary_tokens_est"] catches the "flattened estimator" regression class (where token counting returns a constant regardless of input). The size_ratio > 1 assertion above it doesn't catch this because compression_stats() computes ratio from char counts, not token counts.
  • count_tokens("") == 1 and count_tokens("one") == 1 pin the max(1, ...) floor behavior explicitly — useful to have documented in the test so the floor semantics don't get removed accidentally.

The comment explaining the word-based heuristic (~1.3 tokens per word) is a helpful clarification for contributors who might otherwise wonder why a 2-word string returns 2 tokens instead of the expected ceil(2 * 1.3) = 3.
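The truncation behavior described here can be checked directly (a standalone illustration, not code from the PR):

```python
import math

# int() truncates the word-based estimate, so a 2-word string maps to
# 2 estimated tokens; ceil() on the same product would give 3.
word_count = 2
assert int(word_count * 1.3) == 2        # truncation, matching the heuristic
assert math.ceil(word_count * 1.3) == 3  # rounding up would yield 3 instead
```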

LGTM.

@mvalentsev mvalentsev force-pushed the fix/dialect-test-assertions branch from 9c0da51 to 93ff655 on April 11, 2026 at 11:25
@bensig bensig changed the base branch from main to develop April 11, 2026 22:23
@bensig bensig requested a review from igorls as a code owner April 11, 2026 22:23
@mvalentsev mvalentsev force-pushed the fix/dialect-test-assertions branch from 93ff655 to ec8c6df on April 11, 2026 at 23:46
Adds coverage on top of the honest-stats test fix that already landed
in main:

- test_stats now also asserts original_tokens_est is strictly greater
  than summary_tokens_est, which catches a class of regressions where
  the token estimator flattens to a constant.
- test_count_tokens gains edge cases for the empty string and the
  single-word input. Both exercise the max(1, ...) guard in
  Dialect.count_tokens, so a future refactor that drops the guard
  fails loudly instead of silently returning 0.
@mvalentsev mvalentsev force-pushed the fix/dialect-test-assertions branch from ec8c6df to 1a79095 on April 12, 2026 at 06:49
@bensig
Collaborator

bensig commented Apr 12, 2026

closing — dialect test improvements landed via #569 (merged). thanks @mvalentsev!

@bensig bensig closed this Apr 12, 2026

4 participants