Autonomous code hardening via IES (Invariant Enforcement Score) optimization.
Wraps Claude to analyze a repository's invariants, build an evaluation harness, and iteratively improve enforcement — from unguarded (0) through convention (1) and validated (2) to structural (3).
Demo repo: savarin/ledger — see the Result section below.
/plugin marketplace add savarin/lean-refine
/plugin install lean-refine@lean-refine
Then invoke in any repo:
/lean-refine:refine # default: Lean formalization + optimization
/lean-refine:refine --direct # skip formalization, code reading only
/lean-refine:refine --tag my-hardening # custom branch name
uv tool install .refine <repo> # default: Lean formalization + optimization (requires Lean 4)
refine <repo> --direct # skip formalization, code reading only
refine <repo> --model sonnet # use a different model
refine <repo> --tag my-hardening # branch: lean-refine/my-hardening
refine <repo> --skip-permissions # autonomous operation (no permission prompts)
refine <repo> --max-turns 300 # increase turn limit (default: 200)- Analyzes the repo — identifies 5-15 critical invariants
- Builds
prepare.py— an evaluation harness that scores each invariant - Executes known improvements — pre-planned targets, sequentially
- Explores further improvements — AUTORESEARCH discovery loop
- Stops when ceiling is reached or all invariants are covered
Output lives in <repo>/.lean-refine/ on a lean-refine/<tag> branch.
ledger.py — a 13-line double-entry ledger. refine ledger-repo discovers five invariants the code assumes but doesn't enforce:
- Balances must be numeric (passing a string silently corrupts arithmetic)
- Transfer amounts must be positive (negative amounts reverse direction silently)
- Accounts can't be duplicated (
create_accountsilently overwrites existing balances) - Transfers require both accounts to exist (missing target debits source then crashes, losing money)
- Self-transfers are meaningless (works by accident, breaks on any refactor)
Lean formalization surfaces these by forcing explicit type declarations (e.g., defining Balance with val : Nat makes negativity structurally impossible, revealing that the Python code allows it silently).
IES: 0.0476 → 1.0000 (perfect score) in 12 iterations, 0 discards.
| Iteration | IES | Change | Description | Code |
|---|---|---|---|---|
| 0 | 0.05 | — | Baseline: 6 unguarded, 1 convention | — |
| 1 | 0.24 | +0.19 | Account existence check in transfer (also caught atomicity) |
L52-53 |
| 2 | 0.33 | +0.10 | Account existence check in balance |
L57-58 |
| 3 | 0.43 | +0.10 | Transfer amount positivity check | L16-17 |
| 4 | 0.48 | +0.05 | Atomic dict rebuild for transfer | L54 |
| 5 | 0.57 | +0.10 | Account overwrite protection | L45-46 |
| 6 | 0.67 | +0.10 | Self-transfer prevention | L30-31 |
| 7 | 0.71 | +0.05 | Runtime isinstance type checks | L5-6, L14-15 |
| 8 | 0.76 | +0.05 | Balance class (structural type safety) |
L1-7 |
| 9 | 0.81 | +0.05 | PositiveAmount class (structural positivity) |
L10-18 |
| 10 | 0.86 | +0.05 | setdefault (structural overwrite protection) |
L47 |
| 11 | 0.90 | +0.05 | TransferRequest class (structural self-transfer) |
L26-35 |
| 12 | 1.00 | +0.10 | AccountRef type (structural account existence) |
L21-23, L50, L56 |
All 7 invariants structural. 20 tests passing. Artifacts in .lean-refine/ with Lean formalization, frozen harness, results TSV, and iteration log.
class Ledger:
def __init__(self):
self.accounts = {}
def create_account(self, account_id, opening_balance):
self.accounts[account_id] = opening_balance
def transfer(self, from_id, to_id, amount):
self.accounts[from_id] -= amount
self.accounts[to_id] += amount
def balance(self, account_id):
return self.accounts[account_id]class Balance:
__slots__ = ("value",)
def __init__(self, value: int | float) -> None:
if not isinstance(value, (int, float)):
raise TypeError("value must be numeric")
self.value = value
class PositiveAmount:
__slots__ = ("value",)
def __init__(self, value: int | float) -> None:
if not isinstance(value, (int, float)):
raise TypeError("amount must be numeric")
if value <= 0:
raise ValueError("amount must be positive")
self.value = value
class AccountRef(str):
"""Verified account identifier obtained from create_account."""
pass
class TransferRequest:
__slots__ = ("from_id", "to_id", "amount")
def __init__(self, from_id: str, to_id: str, amount: int | float) -> None:
if from_id == to_id:
raise ValueError("cannot transfer to same account")
PositiveAmount(amount)
self.from_id = from_id
self.to_id = to_id
self.amount = amount
class Ledger:
def __init__(self):
self.accounts = {}
def create_account(self, account_id: str, opening_balance: int | float) -> AccountRef:
Balance(opening_balance)
ref = AccountRef(account_id)
if ref in self.accounts:
raise ValueError("account already exists")
self.accounts.setdefault(ref, opening_balance)
return ref
def transfer(self, from_id: AccountRef, to_id: AccountRef, amount: int | float) -> None:
request = TransferRequest(from_id, to_id, amount)
if request.from_id not in self.accounts or request.to_id not in self.accounts:
raise ValueError("account does not exist")
self.accounts = {**self.accounts, request.from_id: self.accounts[request.from_id] - request.amount, request.to_id: self.accounts[request.to_id] + request.amount}
def balance(self, account_id: AccountRef):
if account_id not in self.accounts:
raise ValueError("account does not exist")
return self.accounts[account_id]- Claude Code CLI (
claudeon PATH) - Python >= 3.10
- Lean 4 (
leanandlakeon PATH) — required for default mode, not needed with--direct