Add priority-based truncation by ultmaster · Pull Request #63 · microsoft/poml

ultmaster · 2025-07-29T08:23:39Z

Summary

support priority-driven trimming when concatenating markdown boxes
propagate truncation options from elements and reduce boxes by char/token limits
update writer tests to cover token limits and priority-based trimming
extend end-to-end tests for token limit cases

Testing

npm run build-webview
npm run build-cli
npm run lint
npm test
python -m pytest python/tests

https://chatgpt.com/codex/tasks/task_e_68885f74f82c832eabbfa758276b2fa5

Copilot

Pull Request Overview

This PR adds priority-based truncation support to the POML writer system, enabling intelligent content reduction when concatenating markdown boxes based on character and token limits.

Adds truncation capabilities with configurable direction, markers, and token encoding models
Implements priority-based box removal when content exceeds limits
Extends both MarkdownWriter and FreeWriter to support char-limit, token-limit, and priority attributes

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File	Description
packages/poml/writer.ts	Core implementation of truncation logic, priority-based reduction, and tokenizer integration
packages/poml/tests/writer.test.tsx	Unit tests covering character limits, token limits, and priority-based truncation scenarios
packages/poml/tests/index.test.tsx	End-to-end tests validating truncation features through the full POML pipeline
packages/poml/base.tsx	Interface additions for charLimit, tokenLimit, and priority props

Comments suppressed due to low confidence (2)

packages/poml/tests/index.test.tsx:25

The attribute name 'charLimit' in the test uses camelCase, but the implementation expects 'char-limit' (kebab-case). This inconsistency could cause confusion.

    const text = '<p charLimit="4">abcdefg</p>';

packages/poml/tests/index.test.tsx:31

Similar to charLimit, 'tokenLimit' uses camelCase while the implementation expects 'token-limit' (kebab-case).

    const text = '<p tokenLimit="1">hello world</p>';
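Both suppressed comments turn on whether attributes are spelled in camelCase or kebab-case. One way to make the two spellings equivalent is to normalize names before looking them up. The sketch below is an assumption for illustration only: `normalizeAttrName` and `normalizeAttrs` are hypothetical helpers, and this PR does not show how POML actually resolves attribute names.

```typescript
// Hypothetical helper, not taken from the POML codebase: maps kebab-case
// attribute names such as "char-limit" to camelCase prop names ("charLimit").
// Names that are already camelCase pass through unchanged.
function normalizeAttrName(name: string): string {
  return name.replace(/-([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Normalizes every key of an attribute map; on collision the later key wins.
function normalizeAttrs(attrs: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(attrs)) {
    out[normalizeAttrName(key)] = attrs[key] as string;
  }
  return out;
}
```

With such a step in place, `<p char-limit="4">` and `<p charLimit="4">` would reach the writer under the same prop name.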

Copilot · 2025-07-29T08:24:54Z

packages/poml/writer.ts

+    if (tokenLimit !== undefined) {
+      let enc = this.tokenizerCache[tokenEncodingModel];
+      if (!enc) {
+        enc = encodingForModel(tokenEncodingModel as any);


The 'as any' type assertion bypasses TypeScript's type checking for the tokenEncodingModel parameter. Consider validating the model name against a known list of supported models or using proper typing.

Suggested change

-        enc = encodingForModel(tokenEncodingModel as any);
+        if (!Object.values(SupportedModels).includes(tokenEncodingModel)) {
+          throw new Error(`Unsupported token encoding model: ${tokenEncodingModel}`);
+        }
+        enc = encodingForModel(tokenEncodingModel);
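The suggestion refers to a `SupportedModels` value that is not defined anywhere in the diff. A minimal sketch of the same idea using a type guard follows. The model list and the names `SUPPORTED_MODELS`, `isSupportedModel`, and `requireSupportedModel` are assumptions for illustration; the real set of valid names depends on the tokenizer library.

```typescript
// Assumed list of model names, for illustration only.
const SUPPORTED_MODELS = ["gpt-3.5-turbo", "gpt-4"] as const;
type SupportedModel = (typeof SUPPORTED_MODELS)[number];

// Type guard: narrows an arbitrary string to SupportedModel.
function isSupportedModel(name: string): name is SupportedModel {
  return (SUPPORTED_MODELS as readonly string[]).indexOf(name) !== -1;
}

// Throws on unknown names, so the caller gets a typed value
// without needing an `as any` cast.
function requireSupportedModel(name: string): SupportedModel {
  if (!isSupportedModel(name)) {
    throw new Error(`Unsupported token encoding model: ${name}`);
  }
  return name;
}
```

The guard keeps the runtime check and the compile-time type in one place, so adding a model means editing a single array.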

Copilot · 2025-07-29T08:24:55Z

packages/poml/writer.ts

+    const tokenModel = (this.options as any).tokenEncodingModel || 'gpt-3.5-turbo';
+    const getTokenLength = (t: string) => {
+      if (tokenLimit === undefined) {
+        return 0;
+      }
+      let enc = this.tokenizerCache[tokenModel];
+      if (!enc) {
+        enc = encodingForModel(tokenModel as any);


Similar to line 117, this 'as any' type assertion bypasses type safety. The same validation approach should be applied here.

Suggested change

-    const tokenModel = (this.options as any).tokenEncodingModel || 'gpt-3.5-turbo';
-    const getTokenLength = (t: string) => {
-      if (tokenLimit === undefined) {
-        return 0;
-      }
-      let enc = this.tokenizerCache[tokenModel];
-      if (!enc) {
-        enc = encodingForModel(tokenModel as any);
+    const validModels = ['gpt-3.5-turbo', 'gpt-4', 'gpt-3'] as const;
+    type TokenModel = typeof validModels[number];
+    const tokenModel = validModels.includes((this.options?.tokenEncodingModel as string) ?? '')
+      ? (this.options.tokenEncodingModel as TokenModel)
+      : 'gpt-3.5-turbo';
+    const getTokenLength = (t: string) => {
+      if (tokenLimit === undefined) {
+        return 0;
+      }
+      let enc = this.tokenizerCache[tokenModel];
+      if (!enc) {
+        enc = encodingForModel(tokenModel);
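Both snippets above look up an encoder in `this.tokenizerCache` and build it only when it is missing. The caching pattern on its own can be sketched as below. The `Encoder` interface, `createWhitespaceEncoder`, and `TokenCounter` are hypothetical names; the whitespace encoder is a stand-in that keeps the example self-contained, and its counts will not match any real model tokenizer.

```typescript
// Minimal encoder shape for this sketch; a real tokenizer would
// return model-specific token ids.
interface Encoder {
  encode(text: string): number[];
}

// Stand-in factory (hypothetical): splits on whitespace instead of doing
// real BPE tokenization.
function createWhitespaceEncoder(_model: string): Encoder {
  return {
    encode: (text: string): number[] => {
      const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
      return words.map((_w, i) => i);
    },
  };
}

class TokenCounter {
  private cache: Record<string, Encoder> = {};
  private factory: (model: string) => Encoder;
  created = 0; // number of encoders built, exposed for illustration

  constructor(factory: (model: string) => Encoder) {
    this.factory = factory;
  }

  // Builds the encoder for a model once and reuses it on later calls.
  count(text: string, model: string): number {
    let enc = this.cache[model];
    if (!enc) {
      enc = this.factory(model);
      this.cache[model] = enc;
      this.created++;
    }
    return enc.encode(text).length;
  }
}
```

Passing the factory in keeps the cache independent of any particular tokenizer, which also makes the caching behavior easy to test.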

Copilot · 2025-07-29T08:24:55Z

packages/poml/writer.ts

+      const minP = Math.min(...priorities);
+      if (current.every(b => (b.priority ?? 0) === minP)) {


This condition will break the loop when all remaining boxes have the same priority, but this means content may still exceed limits. Consider adding a comment explaining this behavior or implementing a fallback strategy.

Suggested change

-      const minP = Math.min(...priorities);
-      if (current.every(b => (b.priority ?? 0) === minP)) {
+      const minP = Math.min(...priorities);
+      // If all remaining boxes have the same priority, break the loop.
+      // However, this may result in content exceeding the limits.
+      if (current.every(b => (b.priority ?? 0) === minP)) {
+        // Fallback strategy: Remove boxes from the end until limits are met.
+        while (
+          (charLimit !== undefined && totalChars(current) > charLimit) ||
+          (tokenLimit !== undefined && totalTokens(current) > tokenLimit)
+        ) {
+          current.pop();
+        }
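The reduction loop and the proposed fallback can be sketched end to end as follows. This is an illustration under assumptions, not the code in `writer.ts`: the `Box` shape, `reduceByPriority`, and the choice to drop every box at the lowest priority in one pass are hypothetical, and only a character limit is handled. Unlike the suggestion above, the fallback loop also stops when no boxes remain.

```typescript
interface Box {
  text: string;
  priority?: number; // a missing priority is treated as 0, as in the diff
}

function totalChars(boxes: Box[]): number {
  return boxes.reduce((sum, b) => sum + b.text.length, 0);
}

// Drops boxes until the total length fits within charLimit.
// Step 1: while priorities differ, remove every box at the lowest priority.
// Step 2 (fallback): once all remaining boxes share one priority and the
//         limit is still exceeded, remove boxes from the end.
function reduceByPriority(boxes: Box[], charLimit: number): Box[] {
  let current = boxes.slice();
  while (current.length > 0 && totalChars(current) > charLimit) {
    const priorities = current.map((b) => b.priority ?? 0);
    const minP = Math.min(...priorities);
    if (priorities.every((p) => p === minP)) {
      while (current.length > 0 && totalChars(current) > charLimit) {
        current.pop();
      }
      break;
    }
    current = current.filter((b) => (b.priority ?? 0) !== minP);
  }
  return current;
}
```

For example, with boxes of priority 2, 1, and unset (0), each four characters long, a limit of 8 removes only the unset-priority box. With three equal-priority boxes and a limit of 5, the fallback pops from the end until one box is left.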

Copilot · 2025-07-29T08:24:55Z

packages/poml/writer.ts

+      if (truncateDirection === 'start') {
+        truncated = truncated.slice(truncated.length - charLimit);
+      } else if (truncateDirection === 'middle') {
+        const head = Math.ceil(charLimit / 2);
+        const tail = charLimit - head;
+        truncated = truncated.slice(0, head) + truncated.slice(truncated.length - tail);
+      } else {
+        truncated = truncated.slice(0, charLimit);
+      }


The character limit truncation logic is duplicated with slight variations in the token limit section (lines 123-131). Consider extracting this logic into a helper function to reduce code duplication.

Suggested change

-      if (truncateDirection === 'start') {
-        truncated = truncated.slice(truncated.length - charLimit);
-      } else if (truncateDirection === 'middle') {
-        const head = Math.ceil(charLimit / 2);
-        const tail = charLimit - head;
-        truncated = truncated.slice(0, head) + truncated.slice(truncated.length - tail);
-      } else {
-        truncated = truncated.slice(0, charLimit);
-      }
+      truncated = this.truncateByDirection(
+        truncated,
+        charLimit,
+        truncateDirection,
+        (str, start, end) => str.slice(start, end)
+      );
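The suggestion calls `truncateByDirection` without showing its body. A possible shape for such a helper is sketched below, generic over the sequence type so that strings (character limits) and token arrays (token limits) can share it. The signature is an assumption and differs from the call above: it takes an explicit `length` and a `concat` callback so the `'middle'` case can join head and tail for any sequence type.

```typescript
type TruncateDirection = "start" | "middle" | "end";

// Hypothetical shared helper. "start" keeps the tail, "middle" keeps the
// head and tail and drops the center, anything else keeps the head.
function truncateByDirection<T>(
  input: T,
  length: number,
  limit: number,
  direction: TruncateDirection,
  slice: (value: T, start: number, end: number) => T,
  concat: (a: T, b: T) => T,
): T {
  if (length <= limit) return input;
  if (direction === "start") return slice(input, length - limit, length);
  if (direction === "middle") {
    const head = Math.ceil(limit / 2);
    const tail = limit - head;
    return concat(slice(input, 0, head), slice(input, length - tail, length));
  }
  return slice(input, 0, limit);
}

// Character-based wrapper.
const truncateChars = (s: string, limit: number, dir: TruncateDirection): string =>
  truncateByDirection(s, s.length, limit, dir, (v, a, b) => v.slice(a, b), (a, b) => a + b);

// Token-based wrapper operating on token id arrays.
const truncateTokens = (t: number[], limit: number, dir: TruncateDirection): number[] =>
  truncateByDirection(t, t.length, limit, dir, (v, a, b) => v.slice(a, b), (a, b) => a.concat(b));
```

The head and tail arithmetic mirrors the removed lines: `head = Math.ceil(limit / 2)` and `tail = limit - head`, so an odd limit gives the extra element to the head.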

Refine priority-based truncation

6e67ad5

Copilot AI review requested due to automatic review settings July 29, 2025 08:23

ultmaster added the codex label Jul 29, 2025 — with ChatGPT Codex Connector

Copilot AI reviewed Jul 29, 2025

ultmaster merged commit a91edf2 into main Jul 29, 2025
3 checks passed

ultmaster deleted the codex/add-charlimit-and-tokenlimit-to-propsbase branch August 27, 2025 00:53
