
fix: LFM2 multiple tool calls#316

Merged
HenryNdubuaku merged 1 commit into cactus-compute:main from mhayes853:fix-lfm2-multiple-tool-calls
Feb 4, 2026

Conversation

@mhayes853
Contributor

The parsing logic was incorrect when LFM2 emitted multiple tool calls.

e.g.

[{"name":"send_message","arguments":{"recipient":"Blob","message":"Checking the weather in San Francisco.","get_weather(location":"San Francisco"}}]

The get_weather call ends up embedded inside the arguments of the send_message call.

This PR corrects the parsing logic inside the FFI utils and adds a new test covering multiple tool-call invocations.
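For illustration, the expected shape of the model output is a JSON array of tool-call objects, each with a `name` and an `arguments` object. A minimal sketch of parsing that shape into separate calls (this is a hypothetical Python sketch for clarity, not the actual FFI utils code changed in this PR):

```python
import json


def parse_tool_calls(raw: str) -> list[tuple[str, dict]]:
    """Parse a model emission like
    '[{"name": ..., "arguments": {...}}, ...]'
    into a list of (name, arguments) pairs.

    Hypothetical sketch: function name and return shape are
    assumptions for illustration, not the PR's real API.
    """
    calls = json.loads(raw)
    if isinstance(calls, dict):
        # Tolerate a single bare call object instead of an array.
        calls = [calls]
    parsed = []
    for call in calls:
        # Each element is one independent tool call; its arguments
        # must not bleed into the neighboring call's object.
        parsed.append((call["name"], dict(call.get("arguments", {}))))
    return parsed


raw = (
    '[{"name":"send_message",'
    '"arguments":{"recipient":"Blob","message":"Checking the weather."}},'
    '{"name":"get_weather","arguments":{"location":"San Francisco"}}]'
)
for name, args in parse_tool_calls(raw):
    print(name, args)
```

The key point the fixed parser must uphold is that each array element yields its own call with its own argument object, rather than the second call's keys being folded into the first call's arguments as in the broken example above.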

@HenryNdubuaku
Collaborator

@mhayes853 Thanks for this! Does the tool call pass on your end?

╔══════════════════════════════════════════╗
║          MULTIPLE TOOLS TEST             ║
╚══════════════════════════════════════════╝
├─ User prompt: Send a message to Blob and get the weather for San Francisco.
Response: I'm sorry, I don't have access to a tool to send messages or retrieve weather data directly. However, if you'd like the current weather for San Francisco, I can help with that using the available tool. Let me know how I can assist!

[Results]
├─ Function call: YES
├─ Correct tool: NO
{
  "success": true,
  "error": null,
  "cloud_handoff": false,
  "response": "I'm sorry, I don't have access to a tool to send messages or retrieve weather data directly. However, if you'd like the current weather for San Francisco, I can help with that using the available tool. Let me know how I can assist!",
  "function_calls": [],
  "confidence": 0.9716,
  "time_to_first_token_ms": 829.07,
  "total_time_ms": 1798.55,
  "prefill_tps": 331.70,
  "decode_tps": 52.61,
  "ram_usage_mb": 51.33,
  "prefill_tokens": 275,
  "decode_tokens": 52,
  "total_tokens": 327
}
└─ Status: FAILED ✗
✗ FAIL │ tool_multiple_tool_call_invocations

@mhayes853
Contributor Author

Seems to be model-dependent (I've had a 100% success rate with larger models, >=1.2B params, running on an M1 Pro).

If you want, I can loosen the assertion in the test for now so it doesn't expect both tool calls outright. I mainly needed it to verify that the parsing logic was correct when the model invoked both tools.

@HenryNdubuaku
Collaborator

OK, makes sense; it seems the 1.2B version is capable of that.

@HenryNdubuaku HenryNdubuaku merged commit 5bfec07 into cactus-compute:main Feb 4, 2026
1 of 2 checks passed
ncylich pushed a commit that referenced this pull request Feb 24, 2026