fix(QuantizedSliding): 收紧识别取值并支持 Filtered 拼接#1620
Merged
Conversation
Contributor
There was a problem hiding this comment.
Hey - 我发现了两个问题,并留下了一些整体性的反馈:
- 在
joinFilteredOCRText中,建议将当前手写的 O(n²) 双重循环排序替换为sort.Slice,这样可以让意图更清晰,并且在Filtered包含大量片段时具有更好的可扩展性。 - 英文开发者文档(
docs/en_us/developers/quantized-sliding.md)中有一条项目符号被截断,结尾是 “- The maximum value is less than or equal to”,请补全或移除这行,以避免读者困惑。
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `joinFilteredOCRText`, consider replacing the manual O(n²) nested-loop sort with `sort.Slice` for clearer intent and better scalability when `Filtered` contains many fragments.
- The English developer doc (`docs/en_us/developers/quantized-sliding.md`) has a truncated bullet point ending with '`- The maximum value is less than or equal to `'—please complete or remove this line to avoid confusing readers.
## Individual Comments
### Comment 1
<location path="agent/go-service/quantizedsliding/ocr.go" line_range="26-27" />
<code_context>
- if ocr, ok := recognitionDetail.Results.Best.AsOCR(); ok {
- return []int{ocr.Box.X(), ocr.Box.Y(), ocr.Box.Width(), ocr.Box.Height()}, true
- }
+ if len(detail.Box) >= 4 {
+ return []int{detail.Box[0], detail.Box[1], detail.Box[2], detail.Box[3]}, true
}
</code_context>
<issue_to_address>
**issue (bug_risk):** Potentially using the wrong hit box when a named child detail is found but has only a Box
In `readHitBox`, after preferring `candidate := findRecognitionDetailByName(detail, nodeQuantizedSlidingSwipeButton)`, you fall back directly to `detail.Box` when `readTemplateMatchBestBox(candidate)` fails. This skips `candidate.Box` entirely when `candidate != detail`, even if `candidate` has a valid box and `detail` does not, so you can end up returning the wrong (or empty) box. To preserve priority for the swipe button node, the fallback should first check `candidate.Box`, then fall back to `detail.Box` only if `candidate != detail` and `detail.Box` is valid.
</issue_to_address>
### Comment 2
<location path="docs/en_us/developers/quantized-sliding.md" line_range="336" />
<code_context>
-- Too many fine-tuning attempts still do not converge.
-
-The current implementation clamps a single fine-tuning branch to the range `0 ~ 30`, and `QuantizedSlidingCheckQuantity` has `max_hit = 4`. If those limits are exhausted and the target value is still not reached, the flow fails and enters `QuantizedSlidingFail`.
+- The maximum value is less than or equal to `
## Why fine-tuning buttons are still needed
</code_context>
<issue_to_address>
**issue:** This failure-condition bullet point appears truncated and incomplete, likely dropping important explanatory text.
The bullet `- The maximum value is less than or equal to `` is syntactically incomplete (unclosed backtick, missing condition) and no longer describes this failure mode. The previous explanatory text about fine-tuning limits and `QuantizedSlidingFail` also seems to have been removed. Please restore the full condition (including closing the inline code span) and reintroduce the explanation so this failure case remains clear to readers.
</issue_to_address>Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Original comment in English
Hey - I've found 2 issues, and left some high level feedback:
- In
joinFilteredOCRText, consider replacing the manual O(n²) nested-loop sort withsort.Slicefor clearer intent and better scalability whenFilteredcontains many fragments. - The English developer doc (
docs/en_us/developers/quantized-sliding.md) has a truncated bullet point ending with '- The maximum value is less than or equal to'—please complete or remove this line to avoid confusing readers.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `joinFilteredOCRText`, consider replacing the manual O(n²) nested-loop sort with `sort.Slice` for clearer intent and better scalability when `Filtered` contains many fragments.
- The English developer doc (`docs/en_us/developers/quantized-sliding.md`) has a truncated bullet point ending with '`- The maximum value is less than or equal to `'—please complete or remove this line to avoid confusing readers.
## Individual Comments
### Comment 1
<location path="agent/go-service/quantizedsliding/ocr.go" line_range="26-27" />
<code_context>
- if ocr, ok := recognitionDetail.Results.Best.AsOCR(); ok {
- return []int{ocr.Box.X(), ocr.Box.Y(), ocr.Box.Width(), ocr.Box.Height()}, true
- }
+ if len(detail.Box) >= 4 {
+ return []int{detail.Box[0], detail.Box[1], detail.Box[2], detail.Box[3]}, true
}
</code_context>
<issue_to_address>
**issue (bug_risk):** Potentially using the wrong hit box when a named child detail is found but has only a Box
In `readHitBox`, after preferring `candidate := findRecognitionDetailByName(detail, nodeQuantizedSlidingSwipeButton)`, you fall back directly to `detail.Box` when `readTemplateMatchBestBox(candidate)` fails. This skips `candidate.Box` entirely when `candidate != detail`, even if `candidate` has a valid box and `detail` does not, so you can end up returning the wrong (or empty) box. To preserve priority for the swipe button node, the fallback should first check `candidate.Box`, then fall back to `detail.Box` only if `candidate != detail` and `detail.Box` is valid.
</issue_to_address>
### Comment 2
<location path="docs/en_us/developers/quantized-sliding.md" line_range="336" />
<code_context>
-- Too many fine-tuning attempts still do not converge.
-
-The current implementation clamps a single fine-tuning branch to the range `0 ~ 30`, and `QuantizedSlidingCheckQuantity` has `max_hit = 4`. If those limits are exhausted and the target value is still not reached, the flow fails and enters `QuantizedSlidingFail`.
+- The maximum value is less than or equal to `
## Why fine-tuning buttons are still needed
</code_context>
<issue_to_address>
**issue:** This failure-condition bullet point appears truncated and incomplete, likely dropping important explanatory text.
The bullet `- The maximum value is less than or equal to `` is syntactically incomplete (unclosed backtick, missing condition) and no longer describes this failure mode. The previous explanatory text about fine-tuning limits and `QuantizedSlidingFail` also seems to have been removed. Please restore the full condition (including closing the inline code span) and reintroduce the explanation so this failure case remains clear to readers.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Contributor
There was a problem hiding this comment.
Pull request overview
该 PR 旨在收紧 QuantizedSliding 在 Go 侧对识别结果的取值边界(优先 typed 结果),并新增按 Results.Filtered 片段拼接数字的可选模式,以提升数量 OCR 的稳定性,同时同步更新 Pipeline 与中英文开发文档以匹配新行为。
Changes:
- Go 侧新增
ConcatAllFilteredDigits参数,并将命名节点/动作名抽为常量,统一注册、分发与 override 引用。 - OCR 读取逻辑改为 typed-first:滑块 box 优先读
SwipeButton.Results.Best.TemplateMatch;数量默认读Best OCR,可选改为拼接Filtered OCR。 - 调整共享 Pipeline 的滑动起点 offset,并在
AutoStockpile接入中显式启用ConcatAllFilteredDigits。
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/zh_cn/developers/quantized-sliding.md | 更新中文文档:新增参数说明与 typed-first/filtered/fallback 边界说明 |
| docs/en_us/developers/quantized-sliding.md | 更新英文文档:新增参数与边界说明(但当前存在列表内容被截断问题) |
| assets/resource/pipeline/QuantizedSliding/Main.json | 调整 Swipe begin_offset(共享流程滑动起点偏移) |
| assets/resource/pipeline/AutoStockpile/Task.json | AutoStockpile 显式开启 ConcatAllFilteredDigits |
| agent/go-service/quantizedsliding/types.go | 参数/动作结构体新增 ConcatAllFilteredDigits;节点列表迁移至 nodes.go |
| agent/go-service/quantizedsliding/register.go | 动作注册名改为引用常量 |
| agent/go-service/quantizedsliding/params.go | 解析/归一化/日志新增 ConcatAllFilteredDigits 字段 |
| agent/go-service/quantizedsliding/overrides.go | override 构造改用节点常量,数量过滤节点 key 也常量化 |
| agent/go-service/quantizedsliding/ocr.go | 新增 typed-first 的 hit-box/quantity 读取与 filtered 拼接逻辑 |
| agent/go-service/quantizedsliding/nodes.go | 新增:集中定义动作名、内部节点名、override key 常量与 action 节点列表 |
| agent/go-service/quantizedsliding/handlers.go | 分发与 override 引用改用节点常量;数量读取改用新 OCR 读取函数 |
- 抽离节点常量并统一 action 名称引用 - 滑块定位只读取 SwipeButton 的 Best 结果 - 数量 OCR 仅从 GetQuantity 的 Filtered 结果拼接
- 默认只读取 Best OCR 文本 - 支持按 y 再 x 拼接 Filtered 结果 - AutoStockpile 显式启用拼接模式
- 补充 ConcatAllFilteredDigits 的行为与示例 - 明确 Best、Filtered 与 DetailJson 的读取边界 - 同步中英文开发文档结构说明
3d34ede to
0cca312
Compare
Contributor
There was a problem hiding this comment.
Hey - 我发现了 4 个问题
给 AI 代理的提示
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="agent/go-service/quantizedsliding/ocr.go" line_range="141-150" />
<code_context>
+func joinFilteredOCRText(results *maa.RecognitionResults) string {
</code_context>
<issue_to_address>
**issue (bug_risk):** 取消片段去重可能会导致在拼接后的数量文本中出现重复的 OCR 片段。
之前的流程(`collectOCRFragments` + `uniqueOCRFragments`)会在拼接前按位置/尺寸移除重复片段。现在的 `joinFilteredOCRText` 会按顺序拼接所有已过滤的片段,不再去重,而且包含的位置信息更少。如果识别器可能在相同坐标处输出重叠或重复片段,就可能产生重复数字(例如,将 `"110"` 识别成 `"1110"`)。如果这种情况在你的流水线中有可能出现,请在新的 API 下增加一个轻量级的去重步骤(例如按 `(x, y, text)` 建键),以保留先前的行为。
</issue_to_address>
### Comment 2
<location path="docs/en_us/developers/quantized-sliding.md" line_range="104" />
<code_context>
+| `ConcatAllFilteredDigits` | `bool` | No | Quantity parse strategy switch. `false` (default): read only `Results.Best` OCR text. `true`: read all `Results.Filtered` OCR fragments, sort by y then x, concatenate, then parse. |
</code_context>
<issue_to_address>
**nitpick (typo):** 建议将 “Quantity parse strategy switch” 改为 “Quantity parsing strategy switch”,语句会更自然一些。
这是一个轻微的语法改进,以便与文档中其他地方对 parsing 的描述方式保持一致。
```suggestion
| `ConcatAllFilteredDigits` | `bool` | No | Quantity parsing strategy switch. `false` (default): read only `Results.Best` OCR text. `true`: read all `Results.Filtered` OCR fragments, sort by y then x, concatenate, then parse. |
```
</issue_to_address>
### Comment 3
<location path="docs/en_us/developers/quantized-sliding.md" line_range="216" />
<code_context>
If either of these prerequisites is not met, more accurate proportional calculations will not help.
+The current Go-side recognition read rules are intentionally narrow:
+
+- The slider hit box is read from `QuantizedSlidingSwipeButton` and prefers `Results.Best.AsTemplateMatch()`.
</code_context>
<issue_to_address>
**suggestion (typo):** “recognition read rules” 听起来有点别扭,可以考虑改成 “recognition reading rules”。
例如,你可以写成:“The current Go-side recognition reading rules are intentionally narrow:”,读起来会更自然一些。
```suggestion
The current Go-side recognition reading rules are intentionally narrow:
```
</issue_to_address>
### Comment 4
<location path="docs/en_us/developers/quantized-sliding.md" line_range="28" />
<code_context>
| `handlers.go` | `Run()` dispatch, per-stage handlers, and state reset |
| `overrides.go` | Pipeline override construction |
-| `ocr.go` | OCR text extraction and recognition box parsing |
+| `ocr.go` | Typed-first recognition helpers for hit-box and quantity reads |
| `normalize.go` | Button parameter normalization and basic calculation helpers |
| `register.go` | Registers the `QuantizedSliding` action into go-service |
</code_context>
<issue_to_address>
**nitpick (typo):** “hit box” 的连字符用法在 “hit-box” 和 “hit box” 之间不一致。
这里这一条使用的是 “hit-box”,而后面的一条使用的是 “hit box”。请在两处统一为同一种拼写(例如统一使用 “hit box”)。
```suggestion
| `ocr.go` | Typed-first recognition helpers for hit box and quantity reads |
```
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据这些反馈改进后续的评审。
Original comment in English
Hey - I've found 4 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="agent/go-service/quantizedsliding/ocr.go" line_range="141-150" />
<code_context>
+func joinFilteredOCRText(results *maa.RecognitionResults) string {
</code_context>
<issue_to_address>
**issue (bug_risk):** Dropping fragment de-duplication may cause repeated OCR fragments in the concatenated quantity text.
The previous flow (`collectOCRFragments` + `uniqueOCRFragments`) removed duplicate fragments (by position/size) before joining. `joinFilteredOCRText` now concatenates all filtered fragments in order, without de-duplication and with less positional data. If the recognizer can emit overlapping or duplicate fragments at the same coordinates, this can produce repeated digits (e.g., `"1110"` vs `"110"`). If that case occurs in your pipelines, please add a lightweight de-dup step (e.g., keyed by `(x, y, text)`) to retain the prior behavior under the new API.
</issue_to_address>
### Comment 2
<location path="docs/en_us/developers/quantized-sliding.md" line_range="104" />
<code_context>
+| `ConcatAllFilteredDigits` | `bool` | No | Quantity parse strategy switch. `false` (default): read only `Results.Best` OCR text. `true`: read all `Results.Filtered` OCR fragments, sort by y then x, concatenate, then parse. |
</code_context>
<issue_to_address>
**nitpick (typo):** Consider changing “Quantity parse strategy switch” to “Quantity parsing strategy switch” for more natural phrasing.
Minor grammar improvement for consistency with how parsing is described elsewhere in the docs.
```suggestion
| `ConcatAllFilteredDigits` | `bool` | No | Quantity parsing strategy switch. `false` (default): read only `Results.Best` OCR text. `true`: read all `Results.Filtered` OCR fragments, sort by y then x, concatenate, then parse. |
```
</issue_to_address>
### Comment 3
<location path="docs/en_us/developers/quantized-sliding.md" line_range="216" />
<code_context>
If either of these prerequisites is not met, more accurate proportional calculations will not help.
+The current Go-side recognition read rules are intentionally narrow:
+
+- The slider hit box is read from `QuantizedSlidingSwipeButton` and prefers `Results.Best.AsTemplateMatch()`.
</code_context>
<issue_to_address>
**suggestion (typo):** “recognition read rules” sounds a bit awkward; consider “recognition reading rules”.
For example, you could write: “The current Go-side recognition reading rules are intentionally narrow:” which reads a bit more naturally.
```suggestion
The current Go-side recognition reading rules are intentionally narrow:
```
</issue_to_address>
### Comment 4
<location path="docs/en_us/developers/quantized-sliding.md" line_range="28" />
<code_context>
| `handlers.go` | `Run()` dispatch, per-stage handlers, and state reset |
| `overrides.go` | Pipeline override construction |
-| `ocr.go` | OCR text extraction and recognition box parsing |
+| `ocr.go` | Typed-first recognition helpers for hit-box and quantity reads |
| `normalize.go` | Button parameter normalization and basic calculation helpers |
| `register.go` | Registers the `QuantizedSliding` action into go-service |
</code_context>
<issue_to_address>
**nitpick (typo):** Hyphenation of “hit box” is inconsistent between “hit-box” and “hit box”.
Here this entry uses “hit-box” while a later bullet uses “hit box”. Please standardize on one spelling (e.g., “hit box”) across both.
```suggestion
| `ocr.go` | Typed-first recognition helpers for hit box and quantity reads |
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Contributor
There was a problem hiding this comment.
Hey - 我发现了 1 个问题,并留下了一些总体反馈:
- 新的
readHitBox回退链有点难以理解,而且当前对detail.Box进行了两次检查;建议将优先级(SwipeButton Best → SwipeButton.Box → root.Box)简化为一组顺序清晰、不含重复条件的判断,这样更容易理解其行为。 readQuantityText始终优先使用已类型化的路径,但在所有失败情况下会静默返回空字符串;如果数量文本意外缺失(例如Results为 nil 或 Best 不是 OCR),建议在最终回退点添加调试日志,这样可以更容易诊断QuantizedSlidingGetQuantity的配置错误。
给 AI Agent 的提示
Please address the comments from this code review:
## Overall Comments
- The new `readHitBox` fallback chain is a bit hard to follow and currently checks `detail.Box` twice; consider simplifying the precedence (SwipeButton Best → SwipeButton.Box → root.Box) into a clearly ordered set of guards without duplicated conditions so the behavior is easier to reason about.
- `readQuantityText` always prefers typed paths but silently returns an empty string on all failures; if quantity text is unexpectedly missing (e.g., `Results` is nil or Best is non-OCR), consider adding debug logging at the final fallback point so misconfigurations of `QuantizedSlidingGetQuantity` are easier to diagnose.
## Individual Comments
### Comment 1
<location path="agent/go-service/quantizedsliding/ocr.go" line_range="35-37" />
<code_context>
- if ocr, ok := result.AsOCR(); ok {
- return []int{ocr.Box.X(), ocr.Box.Y(), ocr.Box.Width(), ocr.Box.Height()}, true
- }
+ if len(detail.Box) >= 4 {
+ return []int{detail.Box[0], detail.Box[1], detail.Box[2], detail.Box[3]}, true
}
</code_context>
<issue_to_address>
**suggestion:** The final `len(detail.Box) >= 4` check in `readHitBox` is redundant and can never change behavior.
Given the earlier branches:
- When `candidate == detail` and `len(detail.Box) >= 4`, the `if len(candidate.Box) >= 4` branch already returns.
- When `candidate != detail` and `len(detail.Box) >= 4`, the `if candidate != detail && len(detail.Box) >= 4` branch already returns.
So the final `if len(detail.Box) >= 4` is unreachable. It can be removed to simplify the control flow and clarify the intended precedence between `candidate` and `detail` boxes.
```suggestion
```
</issue_to_address>帮我变得更有用!请对每条评论点 👍 或 👎,我会根据这些反馈改进后续的评审。
Original comment in English
Hey - I've found 1 issue, and left some high level feedback:
- The new
readHitBoxfallback chain is a bit hard to follow and currently checksdetail.Boxtwice; consider simplifying the precedence (SwipeButton Best → SwipeButton.Box → root.Box) into a clearly ordered set of guards without duplicated conditions so the behavior is easier to reason about. readQuantityTextalways prefers typed paths but silently returns an empty string on all failures; if quantity text is unexpectedly missing (e.g.,Resultsis nil or Best is non-OCR), consider adding debug logging at the final fallback point so misconfigurations ofQuantizedSlidingGetQuantityare easier to diagnose.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `readHitBox` fallback chain is a bit hard to follow and currently checks `detail.Box` twice; consider simplifying the precedence (SwipeButton Best → SwipeButton.Box → root.Box) into a clearly ordered set of guards without duplicated conditions so the behavior is easier to reason about.
- `readQuantityText` always prefers typed paths but silently returns an empty string on all failures; if quantity text is unexpectedly missing (e.g., `Results` is nil or Best is non-OCR), consider adding debug logging at the final fallback point so misconfigurations of `QuantizedSlidingGetQuantity` are easier to diagnose.
## Individual Comments
### Comment 1
<location path="agent/go-service/quantizedsliding/ocr.go" line_range="35-37" />
<code_context>
- if ocr, ok := result.AsOCR(); ok {
- return []int{ocr.Box.X(), ocr.Box.Y(), ocr.Box.Width(), ocr.Box.Height()}, true
- }
+ if len(detail.Box) >= 4 {
+ return []int{detail.Box[0], detail.Box[1], detail.Box[2], detail.Box[3]}, true
}
</code_context>
<issue_to_address>
**suggestion:** The final `len(detail.Box) >= 4` check in `readHitBox` is redundant and can never change behavior.
Given the earlier branches:
- When `candidate == detail` and `len(detail.Box) >= 4`, the `if len(candidate.Box) >= 4` branch already returns.
- When `candidate != detail` and `len(detail.Box) >= 4`, the `if candidate != detail && len(detail.Box) >= 4` branch already returns.
So the final `if len(detail.Box) >= 4` is unreachable. It can be removed to simplify the control flow and clarify the intended precedence between `candidate` and `detail` boxes.
```suggestion
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
summary:
close #1613
Summary by Sourcery
收紧 QuantizedSliding 的命中框(hit-box)和读数逻辑,使其优先使用带类型的 Best 结果;引入可配置的策略用于拼接已过滤的 OCR 数字;并在 QuantizedSliding 流水线中统一共享节点命名、重写规则和文档。
New Features:
ConcatAllFilteredDigits参数,用于控制数量 OCR 是仅使用 Best 结果,还是按位置顺序拼接所有 Filtered OCR 片段。Enhancements:
DetailJson。Documentation:
Original summary in English
Summary by Sourcery
Tighten QuantizedSliding hit-box and quantity reading to prefer typed Best results, introduce a configurable strategy for concatenating filtered OCR digits, and align shared node naming, overrides, and docs with the QuantizedSliding pipeline.
New Features:
Enhancements:
Documentation: