Using LLMs for model modification and refinement in constraint programming
- Parser agent (NL alignment):
python3 src/mod-ref-benchmark/parser_agent.py --problem src/mod-ref-benchmark/problems/problem1
Saves structured NL-to-code mapping JSON into the problem's `base/` folder. Add `--provider openai` (and set `OPENAI_API_KEY`) to use OpenAI.
- Planner agent (change planning):
python3 src/mod-ref-benchmark/planner_agent.py --problem src/mod-ref-benchmark/problems/problem1 --cr CR1 --parser-json src/mod-ref-benchmark/problems/problem1/base/problem1_parser_<timestamp>.json
Uses CRdesc.json, the parser mapping, the base NL, and the reference model to emit a structured edit plan, saved in the CR folder. Add `--provider openai` (and set `OPENAI_API_KEY`) to use OpenAI.
- Modifier agent (apply plan):
python3 src/mod-ref-benchmark/modifier_agent.py --problem src/mod-ref-benchmark/problems/problem1 --cr CR1 --planner-json src/mod-ref-benchmark/problems/problem1/CR1/problem1_CR1_planner_<timestamp>.json
Applies planner steps to rewrite `generated_model.py` inside the CR folder using the reference model plus CR context (no unit-test or validator loop yet). Add `--provider openai` (and set `OPENAI_API_KEY`) to use OpenAI.
- Executor agent (sanity run):
python3 src/mod-ref-benchmark/executor_agent.py --problem src/mod-ref-benchmark/problems/problem1 --cr CR1
Runs the chosen model (default `generated_model.py`) in the CR folder and logs stdout JSON or execution errors.
- Validator agent (LLM review):
python3 src/mod-ref-benchmark/validator_agent.py --problem src/mod-ref-benchmark/problems/problem1 --cr CR1
LLM-only review comparing the generated model against the reference model and the CR; emits structured feedback (pass/needs_changes) for iterative loops with the modifier. Add `--provider openai` (and set `OPENAI_API_KEY`) to use OpenAI.
- LangGraph workflow (orchestrates all agents):
python3 src/mod-ref-benchmark/langgraph_workflow/workflow.py --problem-path src/mod-ref-benchmark/problems/problem1 --cr CR1
Chains Parser → Planner → Modifier → Executor → Validator (loops on executor/validator issues), then runs the CR unit test on validator pass. Writes a single workflow log JSON plus a separate unit-test result file in the CR folder. Defaults to `--provider openai` (requires `OPENAI_API_KEY`).
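
The control flow of the workflow can be sketched in plain Python. This is a rough illustration only, not the code in `workflow.py`, and it does not use LangGraph. The `run_workflow` function, the `Stage` signature, the state keys, the `max_attempts` cap, and the choice to send executor failures back to the modifier are all assumptions made for the example; the stub agents stand in for the real LLM-backed ones.

```python
from typing import Callable, Dict

# Hypothetical stage signature: read the shared state, return a dict of updates.
Stage = Callable[[dict], dict]


def run_workflow(stages: Dict[str, Stage], max_attempts: int = 3) -> dict:
    """Run Parser -> Planner once, then retry Modifier -> Executor -> Validator."""
    state: dict = {"log": [], "feedback": None, "verdict": None}

    # One-off stages: NL-to-code mapping, then the structured edit plan.
    for name in ("parser", "planner"):
        state.update(stages[name](state))
        state["log"].append(name)

    for attempt in range(1, max_attempts + 1):  # max_attempts is an assumed cap
        state["attempt"] = attempt

        state.update(stages["modifier"](state))
        state["log"].append("modifier")

        run = stages["executor"](state)
        state["log"].append("executor")
        state["execution_error"] = run.get("execution_error")
        if state["execution_error"]:
            # Assumption: executor failures are fed back to the modifier.
            state["feedback"] = state["execution_error"]
            continue

        review = stages["validator"](state)
        state["log"].append("validator")
        state["verdict"] = review.get("verdict")
        state["feedback"] = review.get("feedback")
        if state["verdict"] == "pass":
            # The CR unit test runs only after the validator passes.
            state.update(stages["unit_test"](state))
            state["log"].append("unit_test")
            break

    return state


def make_stub_stages() -> Dict[str, Stage]:
    """Stand-in agents: attempt 1 crashes, attempt 2 needs changes, attempt 3 passes."""
    def executor(state: dict) -> dict:
        return {"execution_error": "NameError" if state["attempt"] == 1 else None}

    def validator(state: dict) -> dict:
        if state["attempt"] < 3:
            return {"verdict": "needs_changes", "feedback": "missing constraint"}
        return {"verdict": "pass", "feedback": None}

    return {
        "parser": lambda s: {"mapping": {}},
        "planner": lambda s: {"plan": []},
        "modifier": lambda s: {"model": f"generated_model.py (attempt {s['attempt']})"},
        "executor": executor,
        "validator": validator,
        "unit_test": lambda s: {"unit_test_passed": True},
    }


if __name__ == "__main__":
    result = run_workflow(make_stub_stages())
    print(result["verdict"], result["log"])
```

With the stubs above, the parser and planner run once, the modifier runs three times, and the unit test runs only after the final validator pass. Passing the stages in as a dictionary of callables keeps the loop testable without any LLM provider configured.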