Summary
Add a distill pipeline command and a /v1/pipeline endpoint that run the full context optimization in one call.
Motivation
Users want a single command that runs the complete optimization sequence: dedup → compress → summarize → cache. A single entry point simplifies integration and applies every reduction stage in one pass.
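The stage order above can be sketched as a simple function chain. This is only an illustration of the composition, not Distill's implementation: `dedup`, `compress`, `summarize`, and `run_pipeline` are hypothetical names, and each stage body is a deliberately trivial placeholder (exact-match dedup, whitespace collapsing, a word budget, an in-memory dict as the cache).

```python
from typing import Callable, Dict, List, Optional, Tuple

Chunk = str
Stage = Callable[[List[Chunk]], List[Chunk]]


def dedup(chunks: List[Chunk]) -> List[Chunk]:
    """Placeholder: drop exact duplicates, keeping the first occurrence."""
    seen = set()
    kept = []
    for chunk in chunks:
        if chunk not in seen:
            seen.add(chunk)
            kept.append(chunk)
    return kept


def compress(chunks: List[Chunk]) -> List[Chunk]:
    """Placeholder: collapse runs of whitespace inside each chunk."""
    return [" ".join(chunk.split()) for chunk in chunks]


def summarize(chunks: List[Chunk], max_words: int = 8) -> List[Chunk]:
    """Placeholder: keep leading chunks until a word budget is spent."""
    kept, used = [], 0
    for chunk in chunks:
        words = len(chunk.split())
        if used + words > max_words:
            break
        kept.append(chunk)
        used += words
    return kept


def run_pipeline(
    chunks: List[Chunk],
    stages: List[Stage],
    cache: Optional[Dict[Tuple[Chunk, ...], List[Chunk]]] = None,
) -> List[Chunk]:
    """Run the stages in order; reuse a cached result for identical input."""
    key = tuple(chunks)
    if cache is not None and key in cache:
        return cache[key]
    result = list(chunks)
    for stage in stages:
        result = stage(result)
    if cache is not None:
        cache[key] = result
    return result


cache: Dict[Tuple[Chunk, ...], List[Chunk]] = {}
optimized = run_pipeline(["a  b", "a  b", "c"], [dedup, compress, summarize], cache)
```

Keeping each stage as a plain function from chunks to chunks means a stage can be switched off by leaving it out of the list, which mirrors the per-stage `enabled` flags in the API request.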
Components
distill pipeline (CLI command)
POST /v1/pipeline (API endpoint)
CLI Usage
distill pipeline --input chunks.jsonl \
  --dedup-threshold 0.15 \
  --compress-ratio 0.5 \
  --summarize-max-tokens 4000 \
  --output optimized.jsonl
API Request
POST /v1/pipeline
{
  "chunks": [...],
  "options": {
    "dedup": { "threshold": 0.15, "enabled": true },
    "compress": { "target_reduction": 0.5, "enabled": true },
    "summarize": { "max_tokens": 4000, "enabled": true }
  }
}
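As a rough client-side illustration, the request body above could be assembled like this. The option names and default values are taken from the example request; `build_pipeline_request` is a hypothetical helper, and the `{"text": ...}` chunk shape is an assumption, since the issue elides the chunk schema.

```python
import json
from typing import Any, Dict, List


def build_pipeline_request(
    chunks: List[Any],
    dedup_threshold: float = 0.15,
    target_reduction: float = 0.5,
    max_tokens: int = 4000,
) -> Dict[str, Any]:
    """Build a body shaped like the proposed POST /v1/pipeline request."""
    return {
        "chunks": list(chunks),  # passed through untouched; schema not specified
        "options": {
            "dedup": {"threshold": dedup_threshold, "enabled": True},
            "compress": {"target_reduction": target_reduction, "enabled": True},
            "summarize": {"max_tokens": max_tokens, "enabled": True},
        },
    }


# Hypothetical chunk shape, used only to have something to serialize.
request = build_pipeline_request([{"text": "hello world"}])
body = json.dumps(request)
```

The sketch stops at serialization because the issue does not specify a host, authentication, or content-type requirements for the endpoint.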
API Response
{
  "chunks": [...],
  "stats": {
    "original_tokens": 50000,
    "final_tokens": 4200,
    "total_reduction": 0.916,
    "stages": {
      "dedup": { "reduction": 0.35, "duration_ms": 12 },
      "compress": { "reduction": 0.52, "duration_ms": 45 },
      "summarize": { "reduction": 0.28, "duration_ms": 30 }
    }
  }
}
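The sample stats are consistent with `total_reduction` being computed as 1 - final_tokens / original_tokens, which the following sketch checks. That formula is an inference from the example values, not something the issue states, and `total_reduction` here is a hypothetical helper. The issue does not define what each per-stage `reduction` is measured against, so the sketch only sums the stage durations and does not interpret those ratios.

```python
def total_reduction(original_tokens: int, final_tokens: int) -> float:
    """Fraction of tokens removed overall (assumed definition)."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 1 - final_tokens / original_tokens


# Values copied from the example response above.
stats = {
    "original_tokens": 50000,
    "final_tokens": 4200,
    "total_reduction": 0.916,
    "stages": {
        "dedup": {"reduction": 0.35, "duration_ms": 12},
        "compress": {"reduction": 0.52, "duration_ms": 45},
        "summarize": {"reduction": 0.28, "duration_ms": 30},
    },
}

computed = round(total_reduction(stats["original_tokens"], stats["final_tokens"]), 3)
total_ms = sum(stage["duration_ms"] for stage in stats["stages"].values())
```

A check like this could back an acceptance test that the reported `total_reduction` always agrees with the token counts in the same response.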
Acceptance Criteria