Comparing changes

## Summary

- Diversify personality preset assignments across 6 of 7 builtin templates (dev_shop already fully diverse) so each org has a distinct "feel"
- Fix role mismatches: Scrum Master no longer shares `process_optimizer` with COO; UX Researcher differentiated from UX Designer
- All 4 previously unused presets (`team_diplomat`, `empathetic_mentor`, `rapid_prototyper`, `communication_bridge`) are now assigned -- all 20 presets utilized
- Vary by seniority within templates: `eager_learner` reduced from 5 templates to 2
- QA Engineer presets diversified (agency QA now `process_optimizer`)
- Update design spec example to match actual template values

### 22 preset changes across 6 YAML files + 1 doc fix

| Template | Changes | Theme |
|----------|---------|-------|
| Solo Founder | 1 | Scrappy bootstrapper |
| Tech Startup | 2 | Move fast, experiment |
| Dev Shop | 0 | Already diverse (8 unique) |
| Product Team | 3 | User-centered, collaborative |
| Agency | 7 | Creative, client-focused delivery |
| Research Lab | 2 | Academic, deep, methodical |
| Full Company | 7 | Enterprise, structured |

## Test plan

- [x] All 250 template unit tests pass (`uv run python -m pytest tests/unit/templates/ -n auto`)
- [x] Full unit suite passes (10178 passed)
- [x] Lint clean (`uv run ruff check src/ tests/`)
- [x] All 20 personality presets now assigned to at least one template agent
- [x] No tests assert on specific template-to-preset mappings (verified)

## Review coverage

Pre-reviewed by 2 agents (docs-consistency, issue-resolution-verifier), 4 findings addressed.

Closes #718

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>

## Summary

- Differentiate `autonomy.level`, `communication`, and `workflow` across 7 builtin YAML templates so each archetype has a meaningful operational profile
- Add parametrized regression test covering all 7 templates' operational configs
- Update design spec docs: Company Types table gains Autonomy/Communication/Workflow columns, cross-references added to Operations and Communication pages, stale "Recommended for Full Company" label on Hybrid pattern removed

| Template | Autonomy | Communication | Workflow |
|----------|----------|---------------|----------|
| solo_founder | `full` | `event_driven` | `kanban` |
| startup | `semi` | `hybrid` | `agile_kanban` |
| dev_shop | `semi` | `hybrid` | `agile_kanban` |
| product_team | `semi` | `meeting_based` | `agile_kanban` |
| agency | `supervised` | `hierarchical` | `kanban` |
| research_lab | `full` | `event_driven` | `kanban` |
| full_company | `supervised` | `hierarchical` | `agile_kanban` |

## Test plan

- [x] Parametrized test `TestBuiltinOperationalConfigs` asserts all 7 templates' autonomy, communication, and workflow values
- [x] All 10,345 existing tests pass (93.68% coverage)
- [x] Pre-reviewed by 5 agents (docs-consistency, code-reviewer, test-quality, conventions-enforcer, issue-resolution-verifier) -- 5 docs findings addressed

Closes #717

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>

## Summary

- Add `auto_cleanup` boolean config flag (`synthorg config set auto_cleanup true`) that automatically removes old Docker images after `synthorg update`, keeping only the current and previous version
- Add hint after `synthorg cleanup` suggesting the `auto_cleanup` config flag
- Auto-apply compose changes when only the version comment and image references differ (no more unnecessary diff prompt during normal updates)
- Skip redundant "Checking for updates..." message after CLI self-update re-exec

## Test plan

- [ ] `go -C cli test ./...` passes (all new tests included)
- [ ] `go -C cli vet ./...` clean
- [ ] `go -C cli tool golangci-lint run` reports 0 issues
- [ ] `synthorg config set auto_cleanup true` persists correctly
- [ ] `synthorg config set auto_cleanup false` persists correctly
- [ ] `synthorg config set auto_cleanup yes` rejected with error
- [ ] `synthorg config show` displays "Auto cleanup" field
- [ ] `synthorg cleanup` shows hint about auto_cleanup when flag is disabled
- [ ] `synthorg cleanup` does NOT show hint when auto_cleanup is already enabled
- [ ] `synthorg update` auto-cleans old images when auto_cleanup is enabled
- [ ] `synthorg update` shows old images hint (existing behavior) when auto_cleanup is disabled
- [ ] Compose diff prompt no longer shown when only version comment + image digests change

## Review coverage

Pre-reviewed by 3 agents (go-reviewer, go-conventions-enforcer, docs-consistency), 5 findings addressed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>

…747)

## Summary

- Add `location /assets/` with `Cache-Control: public, max-age=31536000, immutable` for Vite's content-hashed JS/CSS output -- uses prefix match instead of regex to avoid caching unhashed `public/` files like `favicon.svg`
- Add `location = /index.html` with `Cache-Control: no-cache` so browsers always revalidate the entry point and pick up new deployments (304s keep it fast)
- Extract 6 security headers into `web/security-headers.conf` include snippet to avoid duplication across location blocks (nginx does not inherit `add_header` from server block into location blocks that define their own)
- Add `charset utf-8` to server block
- Add `proxy_http_version 1.1` to API proxy block (consistency with WebSocket block, enables keepalive to backend)

## Test plan

- [x] Docker image builds successfully (`docker compose -f docker/compose.yml build web`)
- [x] `curl -I /assets/index-*.js` returns `Cache-Control: public, max-age=31536000, immutable` + all 6 security headers
- [x] `curl -I /assets/index-*.css` returns same cache + security headers
- [x] `curl -I /` returns `Cache-Control: no-cache` + all 6 security headers
- [x] `curl -I /nonexistent-route` (SPA fallback) returns `Cache-Control: no-cache` + all 6 security headers
- [x] `curl -I /favicon.svg` (unhashed) returns security headers but NO long cache
- [ ] Hadolint passes on Dockerfile changes (CI)

## Review coverage

Pre-reviewed by 3 agents (docs-consistency, infra-reviewer, issue-resolution-verifier). 0 findings -- all checks pass, all acceptance criteria resolved.

Closes #686

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>

## Summary

- Fix double-prefixed Ollama model names in `_parse_ollama_models()` -- the driver already prepends the provider name, so the `ollama/` prefix in probing caused `ollama/ollama/model-name` to be sent to LiteLLM
- Add `faker` and `faker.factory` to third-party logger taming to suppress 150+ lines of locale resolution DEBUG noise in Docker logs
- Move `format_exc_info` from shared `_BASE_PROCESSORS` to per-sink processors (JSON sinks only) to eliminate structlog UserWarning conflict with `ConsoleRenderer`
- Remove log level selector from `synthorg init` wizard -- default is `info` per architecture spec, runtime changes available via web UI and `synthorg config set log_level`
- Extract helpers to fix function length violations: `_build_formatter`, `_ensure_log_dir`, `_handle_sink_failure`, `printInitSuccess`
- Update `docs/design/operations.md` to reflect faker taming and JSON-only `format_exc_info`
- Add CLAUDE.md carve-out for observability bootstrap logging (stdlib `logging` + `print` to stderr)
- Add tests for `format_exc_info` processor chain placement and Go `config set log_level` round-trip

## Test plan

- [x] All 10,349 Python tests pass (93.68% coverage)
- [x] All Go CLI tests pass
- [x] mypy strict mode clean
- [x] ruff lint + format clean
- [x] golangci-lint + go vet clean
- [x] Pre-commit and pre-push hooks pass
- [ ] After deploy: re-run `/analyse-logs` to confirm no Ollama 404, no faker noise, no structlog warning, console at INFO

Pre-reviewed by 8 agents (docs-consistency, code-reviewer, python-reviewer, pr-test-analyzer, logging-audit, conventions-enforcer, go-reviewer, go-conventions-enforcer), 13 findings addressed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
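The Ollama double-prefix fix in the commit above can be illustrated with a minimal sketch. It is not the project's `_parse_ollama_models()`: the payload shape (`{"models": [{"name": ...}]}`) and the helper names `parse_model_names` and `to_litellm_id` are assumptions made for illustration. The point it shows is that probing returns bare model names and the provider prefix is added in exactly one place.

```python
from typing import Any, Dict, List


def parse_model_names(payload: Dict[str, Any]) -> List[str]:
    """Hypothetical probing step: return bare model names, no provider prefix.

    The payload shape is an assumption for this sketch.
    """
    return [entry["name"] for entry in payload.get("models", [])]


def to_litellm_id(provider: str, model: str) -> str:
    """Hypothetical driver step: the only place the provider prefix is added."""
    return f"{provider}/{model}"


payload = {"models": [{"name": "model-name"}]}
ids = [to_litellm_id("ollama", name) for name in parse_model_names(payload)]
```

Had the probing step also prepended `ollama/`, the driver step would have produced `ollama/ollama/model-name`, which is the symptom the commit describes.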

## Summary

- **Fix**: Dev release notes showed the stable version (e.g., `:0.4.7`) in container pull commands instead of the dev version (e.g., `:0.4.8-dev.4`)
- **Root cause**: The `VERSION` env var in docker.yml's "Append container images to release" step used `app_version` from `pyproject.toml`, which always contains the stable version
- **Fix**: Derive the image tag from the git tag (`github.ref_name`) with `v` prefix stripped. For stable releases the result is identical; for dev releases it now correctly shows the dev version

Note: The actual Docker images were always tagged correctly -- only the release notes pull commands were wrong.

## Test plan

- [ ] Verify stable release `v0.4.7` still produces `:0.4.7` in pull commands (tag `v0.4.7` -> `${TAG#v}` = `0.4.7`)
- [ ] Verify dev release `v0.4.8-dev.4` now produces `:0.4.8-dev.4` in pull commands (tag `v0.4.8-dev.4` -> `${TAG#v}` = `0.4.8-dev.4`)
- [ ] Next dev release on main confirms correct image tags in release notes

## Review coverage

Quick mode (CI-only change, no agents needed).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
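For reference, the `${TAG#v}` expansion used in the workflow strips a single leading `v` from the git tag. A minimal Python equivalent, assuming Python 3.9+ for `str.removeprefix`, might look like this; the function name `image_tag_from_git_tag` is hypothetical and does not exist in the repository.

```python
def image_tag_from_git_tag(git_tag: str) -> str:
    """Strip one leading 'v', mirroring the shell expansion ${TAG#v}."""
    return git_tag.removeprefix("v")


stable = image_tag_from_git_tag("v0.4.7")      # stable release tag
dev = image_tag_from_git_tag("v0.4.8-dev.4")   # dev release tag
```

The two calls correspond to the two cases in the test plan: the stable tag maps to `0.4.7` and the dev tag to `0.4.8-dev.4`.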

… templates (#745)

## Summary

- Add optional `subordinate_id`/`supervisor_id` fields to `ReportingLine` model for disambiguating agents that share the same role name (e.g. multiple "Backend Developer" agents with different merge_ids)
- Update self-report and unique-subordinate validators to use explicit branching (both IDs present -> compare IDs; neither -> compare names; asymmetric -> always different agents)
- Populate `reporting_lines`, `workflow_handoffs`, and `escalation_paths` across 6 builtin templates:
  - **startup**: 3 reporting lines, 1 handoff, 2 escalations
  - **dev_shop**: 5 reporting lines, 2 handoffs, 2 escalations
  - **research_lab**: 4 reporting lines, 3 handoffs, 2 escalations
  - **product_team**: 5 reporting lines, 5 handoffs, 3 escalations
  - **agency**: 7 reporting lines, 6 handoffs (client delivery loop), 3 escalations
  - **full_company**: 15 reporting lines (with Jinja2 loops for dynamic devs), 8 handoffs, 5 escalations
- Document `reporting_lines` with `subordinate_id`/`supervisor_id` in `docs/design/organization.md`
- Budget allocations reviewed -- all existing values are reasonable for their template type (no changes needed)

### Notes

- `solo_founder.yaml` skipped (only 2 agents, no meaningful hierarchy)
- `full_company.yaml` product department has `head_role: "Product Manager"` but contains CPO (c_suite) -- pre-existing issue, not addressed here
- `HierarchyResolver` in `communication/delegation/hierarchy.py` does not yet consume the new ID fields -- tracked separately as it affects runtime hierarchy resolution in a different module

## Test plan

- [x] 14 new unit tests for ReportingLine id fields (acceptance, self-report with ids, asymmetric cases, blank-id rejection, case-insensitive duplicate ids)
- [x] All 10,359 tests pass (93.68% coverage)
- [x] All builtin templates load and render successfully (validated by existing `test_all_builtins_load_successfully` and `test_render_all_builtins_produce_valid_root_config`)
- [x] mypy strict: no issues
- [x] ruff lint + format: clean
- [x] Pre-reviewed by 7 agents, 10 findings addressed

Closes #719

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
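The three-way branching described for the self-report validator (both IDs present, neither present, asymmetric) can be sketched as follows. This is a stand-in built on a plain dataclass, not the project's model: the role-name field names `subordinate` and `supervisor`, the helper `is_self_report`, and the strip-and-casefold normalization are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReportingLine:
    """Stand-in model; field names for the role names are assumed."""

    subordinate: str
    supervisor: str
    subordinate_id: Optional[str] = None
    supervisor_id: Optional[str] = None


def _norm(value: str) -> str:
    # Assumed normalization: ignore surrounding whitespace and case.
    return value.strip().casefold()


def is_self_report(line: ReportingLine) -> bool:
    """Return True when both ends of the line refer to the same agent."""
    sub_id, sup_id = line.subordinate_id, line.supervisor_id
    if sub_id is not None and sup_id is not None:
        # Both IDs present: compare IDs only.
        return _norm(sub_id) == _norm(sup_id)
    if sub_id is None and sup_id is None:
        # Neither ID present: fall back to comparing role names.
        return _norm(line.subordinate) == _norm(line.supervisor)
    # Asymmetric: one side has an ID, so treat them as different agents.
    return False
```

Under these assumptions, two "Backend Developer" agents with different IDs can report to one another without tripping the self-report check, while a line with no IDs and identical role names is still rejected.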

## Summary

- Switch xdist from the default `load` scheduler to `worksteal` via `addopts` in `pyproject.toml`
- Workers that finish fast tests early now steal work from slower workers, improving load balance when test durations vary widely (0.01s to 10s range)
- Benchmarked locally: ~170s to ~125s for 10,195 unit tests (~26% reduction, 32 workers)

## What was investigated and rejected

| Option | Result | Why rejected |
|--------|--------|--------------|
| `--import-mode=importlib` | Slightly worse | No collection speedup for this codebase |
| `-p no:cacheprovider` | ~1s improvement | Negligible, not worth config noise |
| Module-scoped API fixtures | Net negative | Requires `loadscope`/`loadfile` distribution which is less efficient than `worksteal` at balancing 10K+ tests across 32 workers |

## Test plan

- [x] Full unit test suite passes (10,190 passed, 5 skipped -- same as baseline)
- [x] No new test failures introduced
- [x] Pre-commit hooks pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>

## Summary

- Add `subordinate_key`/`supervisor_key` `@computed_field` properties to `ReportingLine` that return `subordinate_id`/`supervisor_id` when set, falling back to the role name
- Update `HierarchyResolver.__init__` to use these properties as dict keys in the explicit reporting lines loop, so agents sharing a role name but with different IDs are properly distinguished
- Add 11 tests: 4 for computed properties, 7 for hierarchy resolution with IDs (disambiguation, chain-walking, cycle detection, override, backward compatibility)

## Test plan

- [x] `uv run python -m pytest tests/unit/core/test_company_reporting.py tests/unit/communication/delegation/test_hierarchy.py -n auto -v` -- all 91 tests pass
- [x] `uv run mypy src/ tests/` -- clean
- [x] `uv run ruff check src/ tests/` -- clean
- [x] Full test suite: 10,382 passed, 93.68% coverage
- [x] Pre-reviewed by 9 agents (code-reviewer, python-reviewer, test-analyzer, type-design, logging-audit, resilience-audit, conventions-enforcer, docs-consistency, issue-verifier), 4 findings addressed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Closes #746

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
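A rough sketch of the key fallback and why it disambiguates agents sharing a role name. It uses a plain dataclass with ordinary properties in place of the project's model and `@computed_field`; the role-name field names `subordinate`/`supervisor` and the helper `build_supervisor_map` are hypothetical, and the real `HierarchyResolver` does considerably more (chain-walking, cycle detection, overrides).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ReportingLine:
    """Stand-in model; field names for the role names are assumed."""

    subordinate: str
    supervisor: str
    subordinate_id: Optional[str] = None
    supervisor_id: Optional[str] = None

    @property
    def subordinate_key(self) -> str:
        # Prefer the explicit ID; fall back to the role name.
        return self.subordinate_id if self.subordinate_id is not None else self.subordinate

    @property
    def supervisor_key(self) -> str:
        return self.supervisor_id if self.supervisor_id is not None else self.supervisor


def build_supervisor_map(lines: Iterable[ReportingLine]) -> Dict[str, str]:
    """Hypothetical helper: map each subordinate key to its supervisor key."""
    return {line.subordinate_key: line.supervisor_key for line in lines}


lines = [
    ReportingLine("Backend Developer", "Tech Lead", subordinate_id="be-1"),
    ReportingLine("Backend Developer", "CTO", subordinate_id="be-2"),
    ReportingLine("Tech Lead", "CTO"),
]
supervisors = build_supervisor_map(lines)
```

Keyed by role name alone, the two "Backend Developer" entries would collide and the second would overwrite the first; keyed by `subordinate_key`, both survive, and the ID-less "Tech Lead" line keeps working as before.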

…, and display names (#752)

## Summary

- Add `SkillPattern` enum (`tool_wrapper`, `generator`, `reviewer`, `inversion`, `pipeline`) based on Google Cloud's five-pattern taxonomy for agent collaboration
- Classify all 7 builtin templates by which skill patterns their agents use
- Rewrite template descriptions from vague one-liners to informative 2-3 sentence summaries
- Expand tags with capability-specific values (`ci-cd`, `code-review`, `user-research`, `data-pipeline`, etc.)
- Update display names for clarity: Solo Founder -> Solo Builder, Dev Shop -> Engineering Squad, Product Team -> Product Studio, Full Company -> Enterprise Org
- Surface `tags` and `skill_patterns` through `TemplateInfo`, REST API (`TemplateInfoResponse`), and TypeScript interface
- Add `SkillPattern` uniqueness validator on `TemplateMetadata`
- Add TypeScript `SkillPattern` union type for type-safe frontend usage
- Update design spec with Skill Pattern Taxonomy section and fix pre-existing display name discrepancies in `communication.md` and `operations.md`

## Test plan

- [x] `TestBuiltinSkillPatterns`: 4 tests verifying all builtins have non-empty valid patterns, matrix covers all builtins, and expected patterns match
- [x] `TestTemplateMetadata`: `skill_patterns` defaults to `()`, round-trips with enum values, duplicate patterns rejected
- [x] `TestLoadTemplateFile`: invalid skill pattern in YAML raises `TemplateValidationError`
- [x] `TestSetupTemplates`: API response includes `tags` and `skill_patterns` fields
- [x] All 1187 source files pass mypy strict
- [x] All unit tests pass (336 template + setup tests, full suite green)
- [x] Pre-reviewed by 7 agents, 8 findings addressed

Closes #698

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
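A minimal sketch of the enum plus a uniqueness check, using only the five values named in the summary. The member names and the helper `validate_skill_patterns` are assumptions; the project attaches its validator to `TemplateMetadata` and raises `TemplateValidationError` for invalid YAML values, whereas this stand-in raises a plain `ValueError` in both cases.

```python
from enum import Enum
from typing import Iterable, Tuple


class SkillPattern(str, Enum):
    """The five pattern values listed in the summary; member names assumed."""

    TOOL_WRAPPER = "tool_wrapper"
    GENERATOR = "generator"
    REVIEWER = "reviewer"
    INVERSION = "inversion"
    PIPELINE = "pipeline"


def validate_skill_patterns(raw: Iterable[str]) -> Tuple[SkillPattern, ...]:
    """Hypothetical validator: coerce to enum members and reject duplicates."""
    # SkillPattern(value) raises ValueError for an unknown pattern name.
    patterns = tuple(SkillPattern(value) for value in raw)
    if len(set(patterns)) != len(patterns):
        raise ValueError("skill_patterns must not contain duplicates")
    return patterns


patterns = validate_skill_patterns(["generator", "reviewer"])
```

An empty input yields `()`, matching the default described in the test plan.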

🤖 I have created a release *beep* *boop*

---

## [0.4.8](v0.4.7...v0.4.8) (2026-03-22)

### Features

* add auto_cleanup config and improve update UX ([#741](#741)) ([289638f](289638f))
* add reporting lines, escalation paths, and workflow handoffs to templates ([#745](#745)) ([c374cc9](c374cc9))
* differentiate template operational configs ([#742](#742)) ([9b48345](9b48345))
* diversify personality preset assignments across templates ([#743](#743)) ([15487a5](15487a5))
* improve template metadata -- skill taxonomy, descriptions, tags, and display names ([#752](#752)) ([f333f24](f333f24))

### Bug Fixes

* resolve log analysis findings (Ollama prefix, logging, init) ([#748](#748)) ([8f871a4](8f871a4))
* use git tag for dev release container image tags ([#749](#749)) ([f30d071](f30d071))
* use subordinate_id/supervisor_id in HierarchyResolver ([#751](#751)) ([118235b](118235b))

### Performance

* add long-lived cache headers for content-hashed static assets ([#747](#747)) ([4d350b5](4d350b5))
* use worksteal distribution for pytest-xdist ([#750](#750)) ([b7dd7de](b7dd7de))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).


Commits on Mar 22, 2026
