When I attached an image to chat and asked for a description, it worked as expected. However, on subsequent messages, it would pause and not answer. I found the following in the Hermes Agent errors.log:
2026-04-03 22:30:57,471 ERROR root: API call failed after 3 retries. HTTP 400: JSON parse error: Cannot construct instance of com.wd.paas.api.domain.v4.chat.ChatAttachments (although at least one Creator exists): no String-argument constructor/factory method to deserialize from String value ('yoga-pose.jpg'); nested exception is com.fasterxml.jackson.databind.exc.MismatchedI | provider=zai model=glm-5.1 msgs=27 tokens=***
When I asked Hermes why I was getting this error, it replied:
There's the problem. The _sanitize_api_messagesmethod only filters by role allowlist and handles orphaned tool calls. It does NOT strip non-standard fields likeattachments, _ts, timestamp`, etc. from messages before sending them to the provider API.
So here's the full picture:
- You send a message with an image attachment (yoga-pose.jpg)
- The WebUI stores the user message with
attachments: ['yoga-pose.jpg'] in the session history
- On the follow-up message,
run_conversation replays the full conversation history including that user message with the attachments field
_sanitize_api_messages only strips invalid roles and fixes orphaned tool calls — it passes the attachments field through to the API call
- The ZAI/GLM API sees a message with
attachments: "yoga-pose.jpg" and tries to deserialize it as a ChatAttachments object, but gets a plain string instead of the expected object structure → JSON parse error
Root cause: The attachments field (a list of filename strings) that the WebUI adds to user messages for display purposes leaks into the API payload. The GLM provider API tries to interpret it as its native attachment format and fails because it gets ["yoga-pose.jpg"] (strings) instead of proper ChatAttachments objects.
The fix would be in _sanitize_api_messages — add a step to strip non-standard fields from messages, leaving only the fields the API expects (role, content, tool_calls, tool_call_id, name). Something like:
# Strip display-only fields that shouldn't reach the API
_ALLOWED_MSG_KEYS = {'role', 'content', 'tool_calls', 'tool_call_id', 'name', 'refusal'}
for msg in messages:
extra = set(msg.keys()) - _ALLOWED_MSG_KEYS
for k in extra:
msg.pop(k, None)
This would clean out attachments, _ts, timestamp, and any other UI-only metadata before the messages hit the provider API.`
When I attached an image to chat and asked for a description, it worked as expected. However, on subsequent messages, it would pause and not answer. I found the following in the Hermes Agent errors.log:
2026-04-03 22:30:57,471 ERROR root: API call failed after 3 retries. HTTP 400: JSON parse error: Cannot construct instance ofcom.wd.paas.api.domain.v4.chat.ChatAttachments(although at least one Creator exists): no String-argument constructor/factory method to deserialize from String value ('yoga-pose.jpg'); nested exception is com.fasterxml.jackson.databind.exc.MismatchedI | provider=zai model=glm-5.1 msgs=27 tokens=***When I asked Hermes why I was getting this error, it replied:
There's the problem. The_sanitize_api_messagesmethod only filters by role allowlist and handles orphaned tool calls. It does NOT strip non-standard fields likeattachments,_ts,timestamp`, etc. from messages before sending them to the provider API.So here's the full picture:
attachments: ['yoga-pose.jpg']in the session historyrun_conversationreplays the full conversation history including that user message with theattachmentsfield_sanitize_api_messagesonly strips invalid roles and fixes orphaned tool calls — it passes theattachmentsfield through to the API callattachments: "yoga-pose.jpg"and tries to deserialize it as aChatAttachmentsobject, but gets a plain string instead of the expected object structure → JSON parse errorRoot cause: The
attachmentsfield (a list of filename strings) that the WebUI adds to user messages for display purposes leaks into the API payload. The GLM provider API tries to interpret it as its native attachment format and fails because it gets["yoga-pose.jpg"](strings) instead of properChatAttachmentsobjects.The fix would be in
_sanitize_api_messages— add a step to strip non-standard fields from messages, leaving only the fields the API expects (role,content,tool_calls,tool_call_id,name). Something like:This would clean out
attachments,_ts,timestamp, and any other UI-only metadata before the messages hit the provider API.`