
Commit 0740fb8

feat(feishu): add markdown tables, positional insert, color_text, and table ops (#29411)
* feat(feishu): add markdown tables, insert, color_text, table ops, and image fixes

  Extends feishu_doc on top of #20304 with capabilities that are not yet covered.

  Markdown → native table rendering:
  - write/append now use the Descendant API instead of the Children API, enabling GFM markdown tables (block_type 31/32) to render as native Feishu tables automatically
  - Adaptive column widths calculated from cell content (CJK characters get 2x weight)
  - Batch insertion for large documents (>1000 blocks, docx-batch-insert.ts)

  New actions:
  - insert: positional markdown insertion after a given block_id
  - color_text: apply color/bold to a text block via [red]...[/red] markup
  - insert_table_row / insert_table_column: add rows or columns to a table
  - delete_table_rows / delete_table_columns: remove rows or columns
  - merge_table_cells: merge a rectangular cell range

  Image upload fixes (affect write, append, and upload_image):
  - upload_image now accepts a data URI or plain base64 in addition to url/file_path, covering DALL-E b64_json, canvas screenshots, etc.
  - Fix: pass a Buffer directly to drive.media.uploadAll instead of Readable.from(), which caused a Content-Length mismatch for large images
  - Fix: the same Readable bug in upload_file
  - Fix: pass drive_route_token via the extra field for correct multi-datacenter routing (per the API docs, required when parent_node is a document block ID)

* fix(feishu): add documentBlockDescendant mock to docx.test.ts

  write/append now use the Descendant API (documentBlockDescendant.create) instead of the Children API. The existing test mock was missing this SDK method, so processImages was never reached and fetchRemoteMedia was never called. Added blockDescendantCreateMock returning an image block so the "skips image upload when markdown image URL is blocked" test flows through processImages as expected.

* fix(feishu): address bot review feedback

  - resolveUploadInput: remove the length < 1024 guard on file-path detection. The prefix patterns (isAbsolute / ~ / ./ / ../) already distinguish file paths from base64 strings at any length; the old guard caused file paths ≥1024 characters to fall through to the base64 branch incorrectly.
  - parseColorMarkup: add a comment clarifying that mismatched closing tags (e.g. [red]text[/green]) are intentional: the opening tag's style is applied, and the closing tag is consumed regardless of its name.

* fix(feishu): address second-round codex bot review feedback

  - P1: Reject single oversized subtrees in batch insert (docx-batch-insert.ts). A first-level block whose descendant count exceeds BATCH_SIZE (1000) cannot be split atomically (e.g. a very large table). Previously such a block was silently added to the current batch and sent as an oversized request, violating the API limit. Now throws a descriptive error so callers know to reduce the content size.
  - P2: Preserve unmatched brackets in the color markup parser (docx-color-text.ts). Text like "Revenue [Q1] up" contains a bracket with no matching [/...] closer; the original regex dropped the "[" character in this case, silently corrupting the text. Fixed by appending |\[ to the plain-text alternative so any "[" that does not open a complete tag is captured as literal text.

* fix(feishu): address third-round codex bot review feedback

  - P2: Throw ENOENT for non-existent absolute image paths (docx.ts). Previously a non-existent absolute path like /tmp/missing.png fell through to Buffer.from(..., 'base64') and uploaded garbage bytes. Now throws a descriptive ENOENT error and hints at the data URI format for callers intending to pass JPEG binary data (which starts with /9j/).
  - P2: Fail clearly when the insert anchor block is not found (docx.ts). insertDoc previously set insertIndex to -1 (append) when after_block_id was absent from the parent's child list, silently inserting at the wrong position. Two fixes: (1) paginate through all children (documentBlockChildren.get returns up to 200 per page) before searching for the anchor; (2) throw a descriptive error if after_block_id is still not found after full pagination, instead of silently falling back to append.

* fix(feishu): address fourth-round codex bot review feedback

  - Enforce mutual exclusivity across all three upload sources (url, file_path, image): throw immediately when more than one is provided, instead of silently preferring the image branch and ignoring the others.
  - Validate plain base64 payloads before decoding: reject strings containing characters outside the standard base64 alphabet ([A-Za-z0-9+/=]) so malformed inputs fail fast with a clear error rather than decoding to garbage bytes and producing an opaque Feishu API failure downstream. Also throw if the decoded buffer is empty.

* fix(feishu): address fifth-round codex bot review feedback

  - parseColorMarkup: restrict the opening-tag regex to known colour/style names (bg:*, bold, red, orange, yellow, green, blue, purple, grey/gray) so that ordinary bracket tokens like [Q1] can no longer consume a subsequent real closing tag ([/red]) and corrupt the surrounding styled spans. Unknown tags now fall through to the plain-text alternatives and are emitted literally.
  - resolveUploadInput: estimate the decoded byte count from the base64 input length (ceil(len * 3 / 4)) BEFORE allocating the full Buffer, preventing oversized payloads from spiking memory before the maxBytes limit is enforced. Applies to both the data-URI branch and the plain-base64 branch.

* fix(feishu): address sixth-round codex bot review feedback

  - docx-table-ops: apply MIN/MAX_COLUMN_WIDTH clamping in the empty-table branch so tables with 15+ columns don't produce sub-50 widths that Feishu rejects as invalid column_width values.
  - docx.ts (data URI branch): validate the ;base64 marker before decoding so plain/URL-encoded data URIs are rejected with a clear error; also validate the payload against the base64 alphabet (the same guard already applied in the plain-base64 branch) so malformed inputs fail fast rather than producing opaque downstream Feishu errors.

* Feishu: align docx descendant insertion tests and changelog

---------

Co-authored-by: Tak Hoffman <[email protected]>
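The base64 validation and pre-allocation size estimate described above can be sketched as follows. This is an illustrative sketch, not the extension's actual code: `estimateDecodedBytes`, `decodeWithLimit`, and the 10 MB default limit are invented for the example.

```typescript
// Illustrative sketch of the base64 guards described in the commit message.
// Names and the default limit are hypothetical, not the extension's real identifiers.
const DEFAULT_MAX_BYTES = 10 * 1024 * 1024; // assumed limit for illustration

// Every 4 base64 characters decode to at most 3 bytes; round up for the final group.
// This upper bound lets us reject oversized payloads BEFORE allocating the Buffer.
function estimateDecodedBytes(base64: string): number {
  return Math.ceil((base64.length * 3) / 4);
}

function decodeWithLimit(base64: string, maxBytes: number = DEFAULT_MAX_BYTES): Buffer {
  // Reject characters outside the standard base64 alphabet so malformed
  // inputs fail fast instead of decoding to garbage bytes.
  if (!/^[A-Za-z0-9+/=]+$/.test(base64)) {
    throw new Error("Input contains characters outside the base64 alphabet");
  }
  if (estimateDecodedBytes(base64) > maxBytes) {
    throw new Error(`Decoded payload would exceed ${maxBytes} bytes`);
  }
  const buf = Buffer.from(base64, "base64");
  if (buf.length === 0) {
    throw new Error("Decoded payload is empty");
  }
  return buf;
}
```

The estimate slightly overcounts when padding is present (e.g. "aGVsbG8=" estimates 6 bytes but decodes to 5), which is fine for a memory guard: it only needs to be an upper bound.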
1 parent 4ad49de commit 0740fb8
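The color-markup rules described in the commit message (only known tag names open a tag, an unmatched "[" is preserved as literal text, and a mismatched closing tag is consumed with the opening tag's style winning) can be sketched as a single tokenizing regex. All names here are illustrative; the real parseColorMarkup in docx-color-text.ts may be structured differently.

```typescript
// Illustrative sketch of the tag-matching rules from the review rounds above.
// Only known colour/style names can open a tag, so "[Q1]" stays literal text.
const TAG_NAMES = "bg:[a-z]+|bold|red|orange|yellow|green|blue|purple|grey|gray";

// Either a complete [known-tag]...[/tag] pair, or plain text, or a lone "[" that
// does not open a complete tag (the trailing |\[ alternative preserves unmatched brackets).
const TOKEN = new RegExp(
  `\\[(${TAG_NAMES})\\]([^\\[]*)\\[/(?:${TAG_NAMES})\\]|([^\\[]+|\\[)`,
  "g",
);

type Span = { text: string; style?: string };

function parseColorMarkup(input: string): Span[] {
  const spans: Span[] = [];
  let m: RegExpExecArray | null;
  while ((m = TOKEN.exec(input)) !== null) {
    if (m[1] !== undefined) {
      // Opening tag's style wins; the closing tag name is consumed but ignored.
      spans.push({ text: m[2], style: m[1] });
    } else {
      // Literal text, including any "[" with no complete tag following it.
      spans.push({ text: m[3] });
    }
  }
  return spans;
}
```

Under these rules "Revenue [green]+15%[/green] YoY" yields a styled middle span, while "Revenue [Q1] up" round-trips unchanged because "[Q1]" never matches a known opening tag.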

File tree

7 files changed: +1082 -58 lines


CHANGELOG.md

Lines changed: 1 addition & 0 deletions

@@ -78,6 +78,7 @@ Docs: https://docs.openclaw.ai
 - FS/Sandbox workspace boundaries: add a dedicated `outside-workspace` safe-open error code for root-escape checks, and propagate specific outside-workspace messages across edit/browser/media consumers instead of generic not-found/invalid-path fallbacks. (#29715) Thanks @YuzuruS.
 - Config/Doctor group allowlist diagnostics: align `groupPolicy: "allowlist"` warnings with per-channel runtime semantics by excluding Google Chat sender-list checks and by warning when no-fallback channels (for example iMessage) omit `groupAllowFrom`, with regression coverage. (#28477) Thanks @tonydehnke.
 - Onboarding/Custom providers: use Azure OpenAI-specific verification auth/payload shape (`api-key`, deployment-path chat completions payload) when probing Azure endpoints so valid Azure custom-provider setup no longer fails preflight. (#29421) Thanks @kunalk16.
+- Feishu/Docx editing tools: add `feishu_doc` positional insert, table row/column operations, table-cell merge, and color-text updates; switch markdown write/append/insert to Descendant API insertion with large-document batching; and harden image uploads for data URI/base64/local-path inputs with strict validation and routing-safe upload metadata. (#29411) Thanks @Elarwei001.

 ## 2026.2.26

extensions/feishu/src/doc-schema.ts

Lines changed: 68 additions & 0 deletions

@@ -17,6 +17,14 @@ export const FeishuDocSchema = Type.Union([
     doc_token: Type.String({ description: "Document token" }),
     content: Type.String({ description: "Markdown content to append to end of document" }),
   }),
+  Type.Object({
+    action: Type.Literal("insert"),
+    doc_token: Type.String({ description: "Document token" }),
+    content: Type.String({ description: "Markdown content to insert" }),
+    after_block_id: Type.String({
+      description: "Insert content after this block ID. Use list_blocks to find block IDs.",
+    }),
+  }),
   Type.Object({
     action: Type.Literal("create"),
     title: Type.String({ description: "Document title" }),
@@ -50,6 +58,7 @@ export const FeishuDocSchema = Type.Union([
     doc_token: Type.String({ description: "Document token" }),
     block_id: Type.String({ description: "Block ID" }),
   }),
+  // Table creation (explicit structure)
   Type.Object({
     action: Type.Literal("create_table"),
     doc_token: Type.String({ description: "Document token" }),
@@ -91,11 +100,60 @@ export const FeishuDocSchema = Type.Union([
       minItems: 1,
     }),
   }),
+  // Table row/column manipulation
+  Type.Object({
+    action: Type.Literal("insert_table_row"),
+    doc_token: Type.String({ description: "Document token" }),
+    block_id: Type.String({ description: "Table block ID" }),
+    row_index: Type.Optional(
+      Type.Number({ description: "Row index to insert at (-1 for end, default: -1)" }),
+    ),
+  }),
+  Type.Object({
+    action: Type.Literal("insert_table_column"),
+    doc_token: Type.String({ description: "Document token" }),
+    block_id: Type.String({ description: "Table block ID" }),
+    column_index: Type.Optional(
+      Type.Number({ description: "Column index to insert at (-1 for end, default: -1)" }),
+    ),
+  }),
+  Type.Object({
+    action: Type.Literal("delete_table_rows"),
+    doc_token: Type.String({ description: "Document token" }),
+    block_id: Type.String({ description: "Table block ID" }),
+    row_start: Type.Number({ description: "Start row index (0-based)" }),
+    row_count: Type.Optional(Type.Number({ description: "Number of rows to delete (default: 1)" })),
+  }),
+  Type.Object({
+    action: Type.Literal("delete_table_columns"),
+    doc_token: Type.String({ description: "Document token" }),
+    block_id: Type.String({ description: "Table block ID" }),
+    column_start: Type.Number({ description: "Start column index (0-based)" }),
+    column_count: Type.Optional(
+      Type.Number({ description: "Number of columns to delete (default: 1)" }),
+    ),
+  }),
+  Type.Object({
+    action: Type.Literal("merge_table_cells"),
+    doc_token: Type.String({ description: "Document token" }),
+    block_id: Type.String({ description: "Table block ID" }),
+    row_start: Type.Number({ description: "Start row index" }),
+    row_end: Type.Number({ description: "End row index (exclusive)" }),
+    column_start: Type.Number({ description: "Start column index" }),
+    column_end: Type.Number({ description: "End column index (exclusive)" }),
+  }),
+  // Image / file upload
   Type.Object({
     action: Type.Literal("upload_image"),
     doc_token: Type.String({ description: "Document token" }),
     url: Type.Optional(Type.String({ description: "Remote image URL (http/https)" })),
     file_path: Type.Optional(Type.String({ description: "Local image file path" })),
+    image: Type.Optional(
+      Type.String({
+        description:
+          "Image as data URI (data:image/png;base64,...) or plain base64 string. Use instead of url/file_path for DALL-E outputs, canvas screenshots, etc.",
+      }),
+    ),
     parent_block_id: Type.Optional(
       Type.String({ description: "Parent block ID (default: document root)" }),
     ),
@@ -117,6 +175,16 @@ export const FeishuDocSchema = Type.Union([
     ),
     filename: Type.Optional(Type.String({ description: "Optional filename override" })),
   }),
+  // Text color / style
+  Type.Object({
+    action: Type.Literal("color_text"),
+    doc_token: Type.String({ description: "Document token" }),
+    block_id: Type.String({ description: "Text block ID to update" }),
+    content: Type.String({
+      description:
+        'Text with color markup. Tags: [red], [green], [blue], [orange], [yellow], [purple], [grey], [bold], [bg:yellow]. Example: "Revenue [green]+15%[/green] YoY"',
+    }),
+  }),
 ]);

 export type FeishuDocParams = Static<typeof FeishuDocSchema>;
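The table-range fields above use inclusive start and exclusive end indices. A plain-TypeScript mirror of the merge_table_cells variant illustrates the convention; the type and helper here are invented for the example and are not part of the extension.

```typescript
// Hypothetical mirror of the merge_table_cells schema variant, for illustration only.
type MergeTableCellsParams = {
  action: "merge_table_cells";
  doc_token: string;
  block_id: string;     // Table block ID
  row_start: number;    // inclusive
  row_end: number;      // exclusive
  column_start: number; // inclusive
  column_end: number;   // exclusive
};

// Size of the rectangular range a merge covers, under the exclusive-end convention.
function mergedRangeSize(p: MergeTableCellsParams): { rows: number; columns: number } {
  if (p.row_end <= p.row_start || p.column_end <= p.column_start) {
    throw new Error("row_end/column_end are exclusive and must be greater than their starts");
  }
  return { rows: p.row_end - p.row_start, columns: p.column_end - p.column_start };
}
```

For example, merging the first two rows of the first three columns uses row_start: 0, row_end: 2, column_start: 0, column_end: 3, covering a 2x3 range.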
extensions/feishu/src/docx-batch-insert.ts (new file)

Lines changed: 190 additions & 0 deletions

/**
 * Batch insertion for large Feishu documents (>1000 blocks).
 *
 * The Feishu Descendant API has a limit of 1000 blocks per request.
 * This module handles splitting large documents into batches while
 * preserving parent-child relationships between blocks.
 */

import type * as Lark from "@larksuiteoapi/node-sdk";
import { cleanBlocksForDescendant } from "./docx-table-ops.js";

export const BATCH_SIZE = 1000; // Feishu API limit per request

type Logger = { info?: (msg: string) => void };

/**
 * Collect all descendant blocks for a given set of first-level block IDs.
 * Recursively traverses the block tree to gather all children.
 */
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- SDK block types
function collectDescendants(blocks: any[], firstLevelIds: string[]): any[] {
  const blockMap = new Map<string, any>();
  for (const block of blocks) {
    blockMap.set(block.block_id, block);
  }

  const result: any[] = [];
  const visited = new Set<string>();

  function collect(blockId: string) {
    if (visited.has(blockId)) return;
    visited.add(blockId);

    const block = blockMap.get(blockId);
    if (!block) return;

    result.push(block);

    // Recursively collect children
    const children = block.children;
    if (Array.isArray(children)) {
      for (const childId of children) {
        collect(childId);
      }
    } else if (typeof children === "string") {
      collect(children);
    }
  }

  for (const id of firstLevelIds) {
    collect(id);
  }

  return result;
}

/**
 * Insert a single batch of blocks using the Descendant API.
 *
 * @param parentBlockId - Parent block to insert into (defaults to docToken)
 * @param index - Position within parent's children (-1 = end)
 */
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- SDK block types
async function insertBatch(
  client: Lark.Client,
  docToken: string,
  blocks: any[],
  firstLevelBlockIds: string[],
  parentBlockId: string = docToken,
  index: number = -1,
): Promise<any[]> {
  const descendants = cleanBlocksForDescendant(blocks);

  if (descendants.length === 0) {
    return [];
  }

  const res = await client.docx.documentBlockDescendant.create({
    path: { document_id: docToken, block_id: parentBlockId },
    data: {
      children_id: firstLevelBlockIds,
      descendants,
      index,
    },
  });

  if (res.code !== 0) {
    throw new Error(`${res.msg} (code: ${res.code})`);
  }

  return res.data?.children ?? [];
}

/**
 * Insert blocks in batches for large documents (>1000 blocks).
 *
 * Batches are split to ensure BOTH children_id AND descendants
 * arrays stay under the 1000 block API limit.
 *
 * @param client - Feishu API client
 * @param docToken - Document ID
 * @param blocks - All blocks from Convert API
 * @param firstLevelBlockIds - IDs of top-level blocks to insert
 * @param logger - Optional logger for progress updates
 * @param parentBlockId - Parent block to insert into (defaults to docToken = document root)
 * @param startIndex - Starting position within parent (-1 = end). For multi-batch inserts,
 *   each batch advances this by the number of first-level IDs inserted so far.
 * @returns Inserted children blocks and any skipped block IDs
 */
// eslint-disable-next-line @typescript-eslint/no-explicit-any -- SDK block types
export async function insertBlocksInBatches(
  client: Lark.Client,
  docToken: string,
  blocks: any[],
  firstLevelBlockIds: string[],
  logger?: Logger,
  parentBlockId: string = docToken,
  startIndex: number = -1,
): Promise<{ children: any[]; skipped: string[] }> {
  const allChildren: any[] = [];

  // Build batches ensuring each batch has ≤1000 total descendants
  const batches: { firstLevelIds: string[]; blocks: any[] }[] = [];
  let currentBatch: { firstLevelIds: string[]; blocks: any[] } = { firstLevelIds: [], blocks: [] };
  const usedBlockIds = new Set<string>();

  for (const firstLevelId of firstLevelBlockIds) {
    const descendants = collectDescendants(blocks, [firstLevelId]);
    const newBlocks = descendants.filter((b) => !usedBlockIds.has(b.block_id));

    // A single block whose subtree exceeds the API limit cannot be split
    // (a table or other compound block must be inserted atomically).
    if (newBlocks.length > BATCH_SIZE) {
      throw new Error(
        `Block "${firstLevelId}" has ${newBlocks.length} descendants, which exceeds the ` +
          `Feishu API limit of ${BATCH_SIZE} blocks per request. ` +
          `Please split the content into smaller sections.`,
      );
    }

    // If adding this first-level block would exceed the limit, start a new batch
    if (
      currentBatch.blocks.length + newBlocks.length > BATCH_SIZE &&
      currentBatch.blocks.length > 0
    ) {
      batches.push(currentBatch);
      currentBatch = { firstLevelIds: [], blocks: [] };
    }

    // Add to current batch
    currentBatch.firstLevelIds.push(firstLevelId);
    for (const block of newBlocks) {
      currentBatch.blocks.push(block);
      usedBlockIds.add(block.block_id);
    }
  }

  // Don't forget the last batch
  if (currentBatch.blocks.length > 0) {
    batches.push(currentBatch);
  }

  // Insert each batch, advancing the index for position-aware inserts.
  // When startIndex == -1 (append to end), each batch appends after the previous.
  // When startIndex >= 0, each batch starts at startIndex + count of first-level IDs already inserted.
  let currentIndex = startIndex;
  for (let i = 0; i < batches.length; i++) {
    const batch = batches[i];
    logger?.info?.(
      `feishu_doc: Inserting batch ${i + 1}/${batches.length} (${batch.blocks.length} blocks)...`,
    );

    const children = await insertBatch(
      client,
      docToken,
      batch.blocks,
      batch.firstLevelIds,
      parentBlockId,
      currentIndex,
    );
    allChildren.push(...children);

    // Advance the index only for explicit positions; -1 always means "after last inserted"
    if (currentIndex !== -1) {
      currentIndex += batch.firstLevelIds.length;
    }
  }

  return { children: allChildren, skipped: [] };
}
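The batching rule above (greedy packing of whole first-level subtrees, with a hard error for any single subtree over the limit) can be demonstrated standalone. This sketch operates on precomputed subtree sizes rather than real blocks; the function name and shape are invented for illustration.

```typescript
// Standalone sketch of the greedy batch-splitting rule used above.
// Takes the descendant count of each first-level subtree and returns batches
// of subtree indices, each batch totalling at most `limit` blocks.
const LIMIT = 1000; // mirrors BATCH_SIZE, the Feishu per-request block limit

function splitIntoBatches(subtreeSizes: number[], limit: number = LIMIT): number[][] {
  const batches: number[][] = [];
  let current: number[] = [];
  let currentSize = 0;

  for (let i = 0; i < subtreeSizes.length; i++) {
    const size = subtreeSizes[i];

    // A single subtree over the limit cannot be split atomically: fail loudly,
    // mirroring the oversized-subtree error in insertBlocksInBatches.
    if (size > limit) {
      throw new Error(`Subtree ${i} has ${size} blocks, exceeding the ${limit}-block limit`);
    }

    // Close the current batch when this subtree would overflow it.
    if (currentSize + size > limit && current.length > 0) {
      batches.push(current);
      current = [];
      currentSize = 0;
    }

    current.push(i);
    currentSize += size;
  }

  if (current.length > 0) batches.push(current);
  return batches;
}
```

With subtree sizes [600, 600, 300] and a limit of 1000, the first subtree fills batch one alone (600 + 600 would overflow), and the second and third share batch two (900 total).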
