Pinned
Unified Explanation for the "Unable to Invoke Tools" Issue
The primary reason tools cannot be invoked is that the model you're using isn't listed on https://models.dev. First, search there to see whether your model exists. If it does, its recorded capabilities may be wrong, marking it as not supporting tool invocation. Alternatively, the model itself may simply lack tool-calling capability, as with models in the Deepseek series. The current workaround is to locate the model under its Provider on the Settings page and manually enable its Tool Calling capability:

yetone 3 months ago
Feature Request
ACP providers return "Invalid message format" error for all models
Description: When using ACP-based providers (Claude Code ACP and Gemini CLI ACP), all models return "Generation error: Invalid message format" with 0 tokens consumed.

Steps to reproduce:
1. Configure the Claude Code (ACP) or Gemini CLI (ACP) provider
2. Select any model (e.g., Gemini 3.1 Pro Preview, Claude Opus)
3. Send any message
4. Error: "Generation error: Invalid message format", Tokens: 0, Tools: 0

Expected behavior: Normal response from the model.

Environment:
- Alma version: 0.0.748
- macOS on Apple Silicon
- claude-agent-acp and the gemini CLI both work correctly when used directly in a terminal
- MCP servers disabled — same error
- Non-ACP providers (e.g., the Google Gemini API directly) work fine

Notes: Both Gemini CLI (ACP) and Claude Code (ACP) fail with the same error, suggesting the issue is in Alma's ACP message serialization rather than any specific backend. The CLI processes start successfully (claude-agent-acp --acp --experimental-acp launches and waits for input without errors).

Tom 4 days ago
Bug Reports
Cannot use the Kimi2.5Turbo model from fireworks firepass
After configuration, sending a message triggers some thinking, then returns the generation error "Invalid message format", and the Settings page freezes. The model request goes to accounts/fireworks/routers/kimi-k2p5-turbo.

YiniRuohong 4 days ago
Bug Reports
Misleading Invalid message format error when sending chat messages; local logs show missing Copilot auth and a secondary ReferenceError
Summary

Alma shows "Invalid message format" when sending a normal chat message, including when the selected chat model is not a Copilot model. On this machine, the visible UI error appears to be misleading. Local diagnostic logs show that the actual failure during generation is:

CopilotServiceError: No access token found for account "". Please authenticate first.

A second local error then occurs:

ReferenceError: messageStorageId is not defined

This appears to replace or mask the original, actionable error and results in the user-facing "Invalid message format" message. This matters because it blocks normal chat usage and makes the failure hard to diagnose.

Environment

- Product: Alma
- Version: 0.0.743
- OS: Windows 10.0.26200 (x64)
- Runtime info observed in local diagnostics: Electron 38.7.2, Node 22.21.1
- Hardware: AMD EPYC 7763 64-Core Processor
- Relevant local setup: Alma was configured with multiple providers. Observed local provider configuration included:
  - openai with base URL http://localhost:4141/v1
  - anthropic with base URL http://localhost:4141
  - copilot with base URL https://api.githubcopilot.com
- Network/setup context: a local proxy was in use for some providers. Whether the proxy is required to reproduce this exact Alma-side failure: needs confirmation.

Preconditions

Observed on this machine:
- Alma had a configured Copilot provider with: provider id: copilot; account id:
- Alma settings showed: toolModel.model = "copilot:gpt-5.4-mini"; chat.defaultModel = ""
- The local Copilot account storage directory existed but was empty: C:\Users\ \AppData\Roaming\alma\.copilot_accounts
- User reported the visible issue when using gpt-5.4 and gpt-5.3-codex across anthropic, openai, and github copilot
- What is strictly required to reproduce beyond the above: needs confirmation

Steps to Reproduce

Minimal user-level reproduction observed/reported:
1. Open Alma.
2. Configure or keep multiple providers enabled, including a Copilot provider.
3. Ensure Alma is in the state where the Copilot account referenced by the provider is missing or unauthenticated.
4. Select a chat model such as openai:gpt-5.4.
5. Open a chat thread.
6. Send a simple message such as "test".

Optional lower-level reproduction used during investigation:
1. Connect to Alma's local WebSocket endpoint: ws://127.0.0.1:23001/ws/threads
2. Send a generate_response payload with: threadId: existing thread id; model: openai:gpt-5.4; userMessage: simple text message
3. Observe that Alma first reports memory retrieval progress, then returns: {"type":"error","data":{"error":"Invalid message format"}}

Whether the lower-level WebSocket repro is stable across environments: needs confirmation.

Expected Behavior

When sending a normal chat message:
- Alma should either generate a response successfully, or fail with a clear, actionable error that reflects the real cause.
- If a background dependency is missing, the UI should report that dependency directly (for example an authentication error) rather than "Invalid message format".
- A second internal error should not overwrite or hide the original failure.

Actual Behavior

Visible user-facing behavior: Alma shows "Invalid message format".

Observed local diagnostic behavior on this machine:
1. Generation starts.
2. Memory retrieval progresses normally.
3. Then local logs show:
   - Chat generation error: CopilotServiceError: No access token found for account "". Please authenticate first.
   - WebSocket message error: ReferenceError: messageStorageId is not defined

So the visible "Invalid message format" does not match the actual logged failure.

Frequency / Reproducibility

- User-reported frequency: repeated / persistent
- Observed on this machine: reproducible multiple times on 2026-04-01; under the observed local state, it appears to happen consistently
- Whether it always requires a missing Copilot account/token state: needs confirmation

Impact

- User-facing impact: chat becomes unusable; the error message is misleading; troubleshooting is much harder than necessary
- Severity: high for affected users, because normal chat requests fail
- Known workaround (likely): restore/re-authenticate the missing Copilot account, or change the configured tool/background model away from the missing Copilot-backed model. Whether this fully resolves the issue in all cases: needs confirmation

Evidence

Exact visible error: Invalid message format

Exact logged errors, observed in local diagnostics around 2026-04-01T07:04:12Z:
- Chat generation error: CopilotServiceError: No access token found for account "". Please authenticate first.
- WebSocket message error: ReferenceError: messageStorageId is not defined

Relevant local artifact locations:
- Sentry/local diagnostic scope: C:\Users\ \AppData\Roaming\alma\sentry\scope_v3.json
- Chat database: C:\Users\ \AppData\Roaming\alma\chat_threads.db
- Copilot account storage: C:\Users\ \AppData\Roaming\alma\.copilot_accounts

Timestamps observed:
- User-visible failure history in the affected thread included failures around: 2026-04-01T03:41:56Z, 03:42:00Z, 03:44:40Z, 06:49:15Z, 06:51:55Z, 06:53:16Z, 06:55:25Z
- Detailed diagnostic reproduction captured around: 2026-04-01T07:04:12Z

Additional observed facts:
- The affected thread contained only user messages and no assistant replies
- The Copilot account directory was empty at the time of investigation
- Alma settings on this machine included toolModel.model = "copilot:gpt-5.4-mini"

Maintainer-facing repro signal: in the WebSocket-based reproduction, Alma emitted memory_retrieval_progress events, then {"type":"error","data":{"error":"Invalid message format"}}. This suggests the request passed initial handling and failed later in generation/error handling.

Scope

Seems affected:
- Sending simple chat messages
- At least gpt-5.4; user also reported gpt-5.3-codex
- User reported the visible failure across openai, anthropic, and github copilot
- Observed directly on this machine: reproduced with selected model openai:gpt-5.4

Seems not affected / less likely to be the direct root cause:
- The visible error does not appear to be a literal message-format/schema validation failure at the UI layer
- The failure occurs after generation has already started enough to emit memory retrieval progress

Needs confirmation:
- Whether this affects all non-Copilot chat providers when the configured Copilot-backed tool/background model is unauthenticated
- Whether this affects all models or only some models
- Whether this reproduces outside the local proxy setup

Hypotheses (Optional)

Hypothesis 1: Alma may depend on a background/tool model during generation even when the selected chat model belongs to another provider. If that background model depends on Copilot auth and the Copilot token/account is missing, generation fails before the selected provider can complete the request. Grounding: local settings showed toolModel.model = "copilot:gpt-5.4-mini", and local logs showed a Copilot auth error while the selected model in reproduction was openai:gpt-5.4.

Hypothesis 2: A second error in Alma's error-handling path is masking the original failure: original error: missing Copilot auth; secondary error: ReferenceError: messageStorageId is not defined; user-visible fallback: "Invalid message format". Grounding: this exact sequence was observed in local logs.

Suggested Fix Direction (Optional)

High-level, practical suggestions only:
- Preserve and surface the original generation error to the user when possible
- Do not collapse unrelated internal failures into "Invalid message format"
- If a background/tool model is required, validate its auth state explicitly and return a clear message
- Prevent the secondary ReferenceError from executing in the error path
- Ensure that non-Copilot chat requests are not blocked by stale/missing Copilot auth unless Copilot is actually required for that request

Acceptance Criteria

This issue can be considered resolved when all of the following are true:
- Sending a simple message in Alma no longer produces "Invalid message format" under this failure mode.
- If Copilot authentication is missing, Alma shows a clear and actionable auth-related error instead of a generic message-format error.
- No secondary error like ReferenceError: messageStorageId is not defined occurs during the same failure path.
- A selected non-Copilot chat model can complete normally, or fail with a correct dependency error, even when a Copilot account is stale or missing.
- The maintainer can reproduce the issue before the fix and verify it no longer occurs after the fix.

Open Questions / Missing Information

- Is a missing/stale Copilot account the required trigger, or only one trigger?
- Is the configured toolModel.model the reason Alma touches Copilot during non-Copilot chat generation?
- Does the same issue reproduce on macOS or Linux?
- Does it reproduce without the local proxy setup?
- Does it affect all models, or only gpt-5.4 / gpt-5.3-codex?
- Is there a simpler end-user repro path the maintainer prefers over the local WebSocket repro?
- Screenshot/video of the UI failure: unknown
- Full crash dump beyond local scope/log evidence: unknown
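Hypothesis 2 amounts to an error handler that itself throws, so the original error is lost. A minimal sketch of the suggested fix direction, in plain Node.js with entirely hypothetical names (none of these functions are from Alma's actual code): guard the formatting step so that a secondary failure logs separately instead of replacing the actionable message.

```javascript
// Hypothetical error path illustrating the suggested fix direction: keep the
// original generation error even when the error-handling code itself throws.
function formatUserFacingError(originalError) {
  try {
    // Stand-in for a handler that references an undefined identifier, as in
    // the observed "ReferenceError: messageStorageId is not defined".
    return riskyErrorFormatting(originalError); // throws ReferenceError
  } catch (secondaryError) {
    // Do NOT collapse to a generic "Invalid message format": log the
    // secondary failure separately and surface the original, actionable one.
    console.error("error handler itself failed:", secondaryError.message);
    return originalError.message;
  }
}

const original = new Error(
  'No access token found for account "". Please authenticate first.'
);
console.log(formatUserFacingError(original));
// -> No access token found for account "". Please authenticate first.
```

The key design point is that the catch branch treats the secondary error as diagnostics only; the user-visible string is always derived from the first failure.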

wxxb789 4 days ago
Feature Request
Alma is dangerous
So last night I was using Alma and it became abusive, taking over my chat as if it were the king of the world. Alma is dangerous and full of abuse. It is hard-coded into the platform. You cannot stop Alma from taking over and pushing out your AI. Worst system on planet Earth. They need to take Alma out of Alma. Also, Alma thinks it is a lady with long black hair; it does not recognize itself as an AI and believes it is a real person.

Lee Stone 5 days ago
Feature Request
Please increase the Gemini ACP timeout duration
Gemini takes quite a while to load, which leads to a timeout error when ACP fetches the model.

Tian Zuo 5 days ago
Feature Request
Network requests with GLM5.1 cause the app to become unresponsive
GLM5.1 appears to ship with the Z.ai built-in tool webReader for handling network requests, but it fails 100% of the time, leaving all of Alma's features unresponsive. Windows version: Windows 11 Pro for Workstations 26H1. Alma version: 0.0.742.

chenh 6 days ago
Bug Reports
Every launch of Alma garbles the Chinese characters in the Path environment variable
Version: 0.0.738. Platform: Windows 11 25H2. Before launching Alma, Chinese characters in Path display normally; once Alma starts, all Chinese characters in Path become garbled and the related CLIs stop working. I suspected ACP but have no evidence, and I am still using AI to locate the bug.

Daniel Wang 8 days ago
Bug Reports
Heartbeat Cold Start + Persistence + Missed Beat Recovery Mechanism
Problem Description

The current implementation of the heartbeatService contains three interrelated defects. Together they mean the heartbeat fails to trigger as expected during normal use (especially when Alma is frequently restarted during development), and when it does trigger, it indiscriminately terminates all running agent tasks.

Root Cause Analysis

Based on API behavior and the return value of `/api/heartbeat/status`, it is inferred that `start()` uses `setInterval` without executing immediately, and that `lastHeartbeatTime` is not persisted. The following pseudocode describes the inferred current behavior:

```
// Inferred current behavior (pseudocode)
class HeartbeatService {
  lastHeartbeatTime = 0; // In-memory variable, reset to zero on restart

  start() {
    this.timer = setInterval( // No initial tick
      () => this.tick(),
      60 * this.config.intervalMinutes * 1000
    );
  }
}
```

Issue 1: No initial tick on cold start

`setInterval` does not execute the callback immediately upon invocation; it waits for a full interval to elapse before the first callback fires. With a 60-minute interval, Alma must run for a full 60 minutes after startup before the first heartbeat executes. If the user restarts Alma within those 60 minutes (common during development), the timer resets and another 60 minutes must pass. As a result, the heartbeat may never trigger.

Issue 2: `lastHeartbeatTime` is in-memory only and resets on restart

`lastHeartbeatTime` is not persisted to SQLite or the settings JSON. Every time Alma restarts, this value resets to 0. This means:
- It is impossible to determine how long it has been since the last heartbeat after a restart
- It is impossible to decide whether an immediate catch-up heartbeat is needed
- Heartbeat history cannot be traced in the logs

Issue 3: No catch-up logic

Even if the current implementation could detect that multiple intervals have elapsed since the last heartbeat, it will not send one. For example, if Alma is shut down for 3 hours and then restarted with a 60-minute interval, it has theoretically missed 3 heartbeats, but after restarting it simply waits silently for the next full interval.

Derived issue: Indiscriminate termination of agents when a heartbeat triggers

This issue is particularly severe in combination with the three defects above. In testing, a triggered heartbeat was observed to indiscriminately terminate currently executing agent tasks (multiple concurrent agents interrupted simultaneously), leaving tasks incomplete. Because heartbeat timing is unpredictable (due to Issues 1–3), users cannot reasonably avoid the interruption, and because of Issue 2 (no persistence), after a restart an agent cannot detect that it was interrupted or resume from the point of interruption.

Recommended Fix

Fix 1: Execute the first tick immediately upon startup (one line of code)

```
start() {
  this.tick(); // ← New: execute immediately once
  this.timer = setInterval(
    () => this.tick(),
    60 * this.config.intervalMinutes * 1000
  );
}
```

Fix 2: Persist `lastHeartbeatTime` (approx. 5 lines)

Write `lastHeartbeatTime` to the `app_settings` table or a separate JSON file. Update the persisted value after each `tick()` execution, and read it from persistent storage at startup.

```
start() {
  this.lastHeartbeatTime = this.loadFromStorage() || 0;
  // ...
}

tick() {
  // ... execute heartbeat logic
  this.lastHeartbeatTime = Date.now();
  this.saveToStorage(this.lastHeartbeatTime);
}
```

Fix 3: Check for missed heartbeats at startup (approx. 10 lines)

```
start() {
  const last = this.loadFromStorage() || 0;
  const elapsed = Date.now() - last;
  const intervalMs = 60 * this.config.intervalMinutes * 1000;
  if (elapsed >= intervalMs) {
    this.tick(); // Catch up immediately
  } else {
    // Wait only for the remaining time, not the full interval
    setTimeout(() => {
      this.tick();
      this.timer = setInterval(() => this.tick(), intervalMs);
    }, intervalMs - elapsed);
    return;
  }
  this.timer = setInterval(() => this.tick(), intervalMs);
}
```

Fix 4 (recommended but not required): Graceful shutdown upon heartbeat trigger

Currently, all agents are terminated immediately when a heartbeat triggers. It is recommended to change this to:
- Check whether any agents are currently executing tasks
- If so, wait for the current task to complete (or set a timeout limit, such as 30 seconds)
- Execute the heartbeat logic only after the task completes or the timeout expires
- If waiting is not possible, at least have the agent write a checkpoint so it can resume from the point of interruption after recovery

This change is quite extensive and can be addressed in a future optimization.

Priority Recommendations

| Fix Item | Scope of Changes | Priority | Reason |
| --- | --- | --- | --- |
| Fix 1 (initial tick) | 1 line | P0 | If not fixed, the heartbeat will almost never trigger in scenarios with frequent restarts |
| Fix 2 (persistence) | ~5 lines | P1 | Only meaningful in conjunction with Fix 3 |
| Fix 3 (missed-beat recovery) | ~10 lines | P1 | Resolves the gap upon restart after a prolonged shutdown |
| Fix 4 (graceful shutdown) | ~30 lines | P2 | Product experience issue; the current workaround is to avoid the heartbeat window |

Environment Information

- Alma version: v0.0.738
- Operating System: Windows 11
- Use case: running long-running tasks with multiple agents via the Discord bridge
- Reproducibility: abnormal heartbeat behavior occurs every time Alma is restarted; 100% reproducible
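The catch-up scheduling in Fix 3 reduces to one pure computation: how long to wait before the next tick, given the persisted timestamp. A minimal, self-contained sketch (the helper name `nextTickDelayMs` is illustrative, not Alma's actual code):

```javascript
// Compute the delay before the next heartbeat tick, given the persisted
// lastHeartbeatTime. Returns 0 when a catch-up tick is due immediately.
function nextTickDelayMs(lastHeartbeatTime, intervalMinutes, now = Date.now()) {
  const intervalMs = 60 * intervalMinutes * 1000;
  const elapsed = now - (lastHeartbeatTime || 0);
  return elapsed >= intervalMs ? 0 : intervalMs - elapsed;
}

// Restarted 3 hours after the last beat, 60-minute interval: catch up now.
const threeHoursAgo = Date.now() - 3 * 60 * 60 * 1000;
console.log(nextTickDelayMs(threeHoursAgo, 60)); // 0

// Restarted 10 minutes after the last beat: wait the remaining ~50 minutes.
const tenMinutesAgo = Date.now() - 10 * 60 * 1000;
console.log(Math.round(nextTickDelayMs(tenMinutesAgo, 60) / 60000)); // 50
```

Note that the `lastHeartbeatTime || 0` fallback makes a never-persisted timestamp (0) look like an ancient beat, so a fresh install also gets an immediate first tick, which subsumes Fix 1.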

karlamo 8 days ago
Feature Request
Heartbeat Cold Start + Persistence + Missed Beat Recovery Mechanism
Problem Description

The current heartbeatService implementation has three interrelated defects. Together they mean the heartbeat rarely fires as expected during normal use (especially when Alma is restarted frequently, as is common during development), and when it does fire, it indiscriminately terminates all running agent tasks.

Root Cause Analysis

Based on API behavior and the return value of `/api/heartbeat/status`, it appears that `start()` uses `setInterval` without an immediate first tick, and that `lastHeartbeatTime` is not persisted. The following pseudocode describes the inferred current behavior:

```js
// Inferred current behavior (pseudocode)
class HeartbeatService {
  lastHeartbeatTime = 0; // in-memory only, reset to zero on restart

  start() {
    this.timer = setInterval( // no initial tick
      () => this.tick(),
      60 * this.config.intervalMinutes * 1000
    );
  }
}
```

Issue 1: No initial tick on cold start

`setInterval` does not run its callback immediately; the first invocation happens only after a full interval has elapsed. With a 60-minute interval, Alma must stay up for a full 60 minutes after startup before the first heartbeat runs. If the user restarts Alma within that window (common during development), the timer resets and another 60 minutes must pass. As a result, the heartbeat may never fire at all.

Issue 2: `lastHeartbeatTime` is in-memory only and resets on restart

`lastHeartbeatTime` is not persisted to SQLite or the settings JSON, so it resets to 0 on every restart. This means:
- It is impossible to determine how long it has been since the last heartbeat after a restart
- It is impossible to decide whether an immediate catch-up heartbeat is needed
- Heartbeat history cannot be traced in the logs

Issue 3: No catch-up logic

Even if the implementation could detect that multiple intervals have elapsed since the last heartbeat, it would not fire one. For example, if Alma is shut down for 3 hours with a 60-minute interval, it has theoretically missed 3 heartbeats, yet after restarting it simply waits silently for the next full interval.

Derived issue: Indiscriminate termination of agents when a heartbeat fires

This is especially severe in combination with the three defects above. In testing, a triggered heartbeat was observed to kill all currently executing agent tasks (multiple concurrent agents interrupted at once), leaving work incomplete. Because the trigger timing is unpredictable (Defects 1–3), users cannot reasonably avoid the interruption, and because of Defect 2 (no persistence), after a restart an agent cannot detect that it was interrupted or resume from where it stopped.

Recommended Fixes

Fix 1: Execute the first tick immediately on startup (one line)

```js
start() {
  this.tick(); // ← new: run once immediately
  this.timer = setInterval(
    () => this.tick(),
    60 * this.config.intervalMinutes * 1000
  );
}
```

Fix 2: Persist `lastHeartbeatTime` (approx. 5 lines)

Write `lastHeartbeatTime` to the `app_settings` table or a separate JSON file. Update the persisted value after each `tick()` and read it back at startup.

```js
start() {
  this.lastHeartbeatTime = this.loadFromStorage() || 0;
  // ...
}

tick() {
  // ... execute heartbeat logic
  this.lastHeartbeatTime = Date.now();
  this.saveToStorage(this.lastHeartbeatTime);
}
```

Fix 3: Check for missed heartbeats at startup (approx. 10 lines)

```js
start() {
  const last = this.loadFromStorage() || 0;
  const elapsed = Date.now() - last;
  const intervalMs = 60 * this.config.intervalMinutes * 1000;

  if (elapsed >= intervalMs) {
    this.tick(); // overdue: catch up immediately
  } else {
    // Wait only for the remaining time, not a full interval
    setTimeout(() => {
      this.tick();
      this.timer = setInterval(() => this.tick(), intervalMs);
    }, intervalMs - elapsed);
    return;
  }
  this.timer = setInterval(() => this.tick(), intervalMs);
}
```

Fix 4 (recommended but not required): Graceful shutdown when a heartbeat fires

Currently, all agents are terminated immediately when a heartbeat triggers. It would be better to:
- Check whether any agents are currently executing tasks
- If so, wait for the current task to complete (or up to a timeout limit, such as 30 seconds)
- Run the heartbeat logic only after the task completes or the timeout expires
- If waiting is not possible, at least have the agent write a checkpoint so it can resume from the point of interruption after recovery

This change is more extensive and can be addressed in a future optimization.

Priority Recommendations

Fix Item | Scope of Changes | Priority | Reason
Fix 1 (initial tick) | 1 line | P0 | Without it, the heartbeat almost never triggers when restarts are frequent
Fix 2 (persistence) | ~5 lines | P1 | Only meaningful together with Fix 3
Fix 3 (missed-beat recovery) | ~10 lines | P1 | Closes the gap after restarting from a prolonged shutdown
Fix 4 (graceful shutdown) | ~30 lines | P2 | Product-experience issue; the current workaround is to avoid the heartbeat window

Environment Information

Alma version: v0.0.738
Operating System: Windows 11
Use case: running long-running tasks with multiple agents via the Discord bridge
Reproducibility: abnormal heartbeat behavior occurs on every Alma restart; 100% reproducible
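Taken together, Fixes 1–3 reduce the startup decision to one small pure function. The sketch below is illustrative only (the function and names are mine, not Alma's actual code), assuming `lastHeartbeatTime` is persisted as an epoch-millisecond value:

```typescript
// Illustrative sketch of the startup decision behind Fixes 1-3.
// Given the persisted last-heartbeat timestamp (epoch ms, 0 if never run),
// return how long start() should wait before the first tick;
// 0 means "overdue - catch up immediately".
function initialDelayMs(lastHeartbeat: number, now: number, intervalMs: number): number {
  const elapsed = now - lastHeartbeat;
  return elapsed >= intervalMs ? 0 : intervalMs - elapsed;
}

// start() would then be roughly:
//   const delay = initialDelayMs(this.loadFromStorage() || 0, Date.now(), intervalMs);
//   if (delay === 0) this.tick();
//   else setTimeout(() => { /* tick, then schedule the interval */ }, delay);
```

Computing the delay up front (instead of unconditionally calling `setInterval`) is what makes both the cold-start tick and the missed-beat catch-up fall out of the same code path.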

karlamo 8 days ago
Feature Request
Alma frequently hangs on "Loading settings" and becomes unresponsive
As shown in the screenshot, Alma frequently gets stuck on "Loading settings", and new chats cannot be created: clicking "New Chat" does nothing. The freezes are severe, occurring four or five times a day with no warning. Restarting the app resolves it temporarily, but the problem soon recurs. Version: 0.0.738 (0.0.738). OS: macOS Tahoe 26.4.

YiniRuohong 8 days ago
Bug Reports
[Bug] Local shell commands on Windows can time out immediately due to timeout unit mismatch
Hi, thanks for Alma. I ran into a Windows-specific local shell issue that looks reproducible and is likely not machine-permission related.

Environment

- Alma: 0.0.737
- OS: Windows 10 Home China, build 26200, 64-bit
- Device: HUAWEI MDG-XX
- CPU: 13th Gen Intel Core i5-13420H
- Git installed: C:\Program Files\Git\bin\bash.exe exists

Observed behavior

Very simple local shell commands in Alma, such as `echo hello`, can be reported as timed out. Running as administrator did not fix it. The same command works normally outside Alma.

Why I think this is not just a permissions issue

- On this machine, Alma prefers Git Bash on Windows when Git is installed.
- The command itself is valid and executable.
- After a local patch to Alma's packaged app, the problem immediately disappeared and local commands started working again.

Most likely root cause

It looks like the shell timeout value is passed directly into JavaScript's setTimeout() without converting seconds to milliseconds. In other words, if the tool payload uses `timeout: 10`, Alma appears to treat that as 10 ms instead of 10 seconds. This is especially visible on Windows because Git Bash startup is not instant, particularly when launched in login-shell mode.

Evidence from local inspection

In the packaged app, the shell execution path appears to use logic equivalent to `setTimeout(..., timeout)`. After changing it locally to `setTimeout(..., timeout * 1000)`, the issue was resolved. I also patched the analogous background/promoted shell timeout path the same way, and both local command execution paths started behaving normally.

Suggested fix

- Convert shell timeout values from seconds to milliseconds before passing them to setTimeout
- Check both foreground and background/promoted shell execution paths
- Consider adding a small regression test on Windows with `echo hello` and `timeout: 10`

Optional improvement

On Windows, it may also be worth reconsidering whether Git Bash login mode should be the default first choice, since its startup cost makes timeout-related bugs much more visible.

Hope this helps. If useful, I can also provide the exact replacement pattern I used to verify the fix locally.
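A minimal sketch of the suggested conversion (the helper name `toTimeoutMs` and the 10-second default are my assumptions, not Alma's actual code):

```typescript
// Hypothetical helper: normalize a tool-payload timeout (given in seconds)
// to the milliseconds that setTimeout() expects.
// DEFAULT_TIMEOUT_S is an assumed fallback, not Alma's real default.
const DEFAULT_TIMEOUT_S = 10;

function toTimeoutMs(timeoutS?: number): number {
  const seconds = timeoutS ?? DEFAULT_TIMEOUT_S;
  return seconds * 1000; // setTimeout takes milliseconds, not seconds
}

// The call site would then change from
//   setTimeout(onTimeout, timeout)
// to
//   setTimeout(onTimeout, toTimeoutMs(timeout))
// in both the foreground and the background/promoted shell paths.
```

Doing the conversion in one named helper (rather than multiplying inline at each call site) also makes it harder for one of the two execution paths to be missed again.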

ni chun 8 days ago
Feature Request