Skip to content

fix(reply): normalize Windows media paths for dedupe#44228

Open
aniruddhaadak80 wants to merge 2 commits intoopenclaw:mainfrom
aniruddhaadak80:fix/windows-reply-media-dedupe
Open

fix(reply): normalize Windows media paths for dedupe#44228
aniruddhaadak80 wants to merge 2 commits intoopenclaw:mainfrom
aniruddhaadak80:fix/windows-reply-media-dedupe

Conversation

@aniruddhaadak80
Copy link
Copy Markdown

Summary

  • Problem: reply-media dedupe compared Windows drive-letter local paths and file:///C:/... URLs as different strings.
  • Why it matters: replies can resend media the messaging tool already delivered on Windows hosts.
  • What changed: normalized Windows drive-letter file URLs and local paths into the same dedupe form, and added a regression test that covers both slash styles.
  • What did NOT change (scope boundary): non-Windows media normalization and the rest of reply suppression logic are unchanged.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Windows reply dedupe now treats file:///C:/..., C:\..., and C:/... media references as the same local file when deciding whether to strip duplicates.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: Windows
  • Runtime/container: local Node/pnpm workspace
  • Model/provider: N/A
  • Integration/channel (if any): auto-reply / messaging tool dedupe path
  • Relevant config (redacted): N/A

Steps

  1. Send media via a messaging tool on Windows so the sent record contains a local drive-letter path.
  2. Produce a reply payload that references the same file through a file:///C:/... URL.
  3. Run reply-media dedupe.

Expected

  • The duplicate media reference should be stripped.

Actual

  • The new regression test passes and the payload media is removed as expected.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran pnpm exec vitest run src/auto-reply/reply/reply-payloads.test.ts.
  • Additional gate: ran pnpm build:plugin-sdk:dts.
  • What I did not verify: additional UNC-path semantics remain out of scope for this small fix.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

Failure Recovery (if this breaks)

  • Revert the reply-payload normalization change and test.
  • Known bad symptoms reviewers should watch for: unexpected dedupe behavior for unusual Windows path formats outside drive-letter paths.

Copilot AI review requested due to automatic review settings March 12, 2026 16:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Windows-specific reply media dedupe mismatches by normalizing drive-letter paths so that local paths (C:\... / C:/...) and file:///C:/... URLs compare equivalently during dedupe filtering.

Changes:

  • Normalize Windows absolute drive-letter local paths by canonicalizing separators before dedupe comparison.
  • Normalize file:///C:/... URLs into the same canonical Windows local-path form during dedupe comparison.
  • Add a regression test covering file:///C:/... vs both C:\... and C:/....

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/auto-reply/reply/reply-payloads.ts Adds Windows drive-letter normalization in filterMessagingToolMediaDuplicates so equivalent local path / file URL forms dedupe correctly.
src/auto-reply/reply/reply-payloads.test.ts Adds regression coverage for Windows drive-letter file:///C:/... URLs deduping against local path variants.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR fixes a Windows-specific media deduplication bug where file:///C:/… URLs and local drive-letter paths (C:\…, C:/…) were compared as different strings, causing replies to resend already-delivered media. The fix adds a WINDOWS_ABSOLUTE_PATH_RE helper and extends normalizeMediaForDedupe to canonicalize all three forms to a backslash-separated path before Set lookup.

Key observations:

  • The core normalization logic is correct: file:///C:/tmp/photo%20one.jpgC:\tmp\photo one.jpg, C:/tmp/photo one.jpgC:\tmp\photo one.jpg, C:\tmp\photo one.jpg → unchanged. All three forms land in the same Set bucket.
  • Drive letter case is not normalized. file:///c:/tmp/file.jpg (lowercase c) normalizes to c:\tmp\file.jpg while a stored record of C:\tmp\file.jpg (uppercase C) normalizes to C:\tmp\file.jpg — these do not match. Since Windows paths are case-insensitive, this is a real deduplication gap that the current test suite does not cover (all test variants use uppercase C:).
  • Scope is appropriately narrow; UNC paths and non-Windows normalization are unchanged.

Confidence Score: 3/5

  • The fix is correct for same-case drive letters but will silently miss deduplication when drive letter casing differs between the stored record and the reply payload.
  • The normalization logic works correctly for the exact cases tested (uppercase C: on both sides, slash-style variations, URL percent-encoding). However, Windows file paths are case-insensitive and drive letters can legitimately appear as lowercase in one record and uppercase in another. The omission of drive-letter case folding means the fix is incomplete: a file:///c:/… payload (lowercase) will not dedupe against a stored C:\… record (uppercase), leaving the original resend bug alive in that scenario. Confidence is further reduced because no test exercises the case-mismatch path.
  • src/auto-reply/reply/reply-payloads.ts lines 117–124 (normalization logic) and the corresponding test in src/auto-reply/reply/reply-payloads.test.ts (missing lowercase drive-letter coverage)
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/reply-payloads.ts
Line: 117-124

Comment:
**Drive letter case not normalized**

The normalization converts all variants to backslash form (e.g. `C:\path`) but does not lowercase the drive letter. This means `file:///c:/tmp/photo.jpg` (lowercase `c`) normalizes to `c:\tmp\photo.jpg`, which will **not** match a `sentMediaUrls` entry of `C:\tmp\photo.jpg` (uppercase `C`), even though on Windows these refer to the same file.

Windows paths are case-insensitive, so a sent record stored with one casing and a reply payload referencing the other casing will silently fail to dedupe.

Consider folding the drive letter to a consistent case before returning:

```typescript
// non-file:// branch
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed)
  ? trimmed[0].toLowerCase() + trimmed.slice(1).replaceAll("/", "\\")
  : trimmed;
```

```typescript
// file: branch
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
  const win = decodedPath.slice(1).replaceAll("/", "\\");
  return win[0].toLowerCase() + win.slice(1);
}
```

(The corresponding test in `reply-payloads.test.ts` covers only uppercase `C:` on both sides, so the case-mismatch scenario is currently untested.)

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 4e0ca4c

Comment on lines +117 to +124
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed) ? trimmed.replaceAll("/", "\\") : trimmed;
}
try {
const parsed = new URL(trimmed);
if (parsed.protocol === "file:") {
return decodeURIComponent(parsed.pathname || "");
const decodedPath = decodeURIComponent(parsed.pathname || "");
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
return decodedPath.slice(1).replaceAll("/", "\\");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive letter case not normalized

The normalization converts all variants to backslash form (e.g. C:\path) but does not lowercase the drive letter. This means file:///c:/tmp/photo.jpg (lowercase c) normalizes to c:\tmp\photo.jpg, which will not match a sentMediaUrls entry of C:\tmp\photo.jpg (uppercase C), even though on Windows these refer to the same file.

Windows paths are case-insensitive, so a sent record stored with one casing and a reply payload referencing the other casing will silently fail to dedupe.

Consider folding the drive letter to a consistent case before returning:

// non-file:// branch
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed)
  ? trimmed[0].toLowerCase() + trimmed.slice(1).replaceAll("/", "\\")
  : trimmed;
// file: branch
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
  const win = decodedPath.slice(1).replaceAll("/", "\\");
  return win[0].toLowerCase() + win.slice(1);
}

(The corresponding test in reply-payloads.test.ts covers only uppercase C: on both sides, so the case-mismatch scenario is currently untested.)

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/reply-payloads.ts
Line: 117-124

Comment:
**Drive letter case not normalized**

The normalization converts all variants to backslash form (e.g. `C:\path`) but does not lowercase the drive letter. This means `file:///c:/tmp/photo.jpg` (lowercase `c`) normalizes to `c:\tmp\photo.jpg`, which will **not** match a `sentMediaUrls` entry of `C:\tmp\photo.jpg` (uppercase `C`), even though on Windows these refer to the same file.

Windows paths are case-insensitive, so a sent record stored with one casing and a reply payload referencing the other casing will silently fail to dedupe.

Consider folding the drive letter to a consistent case before returning:

```typescript
// non-file:// branch
return WINDOWS_ABSOLUTE_PATH_RE.test(trimmed)
  ? trimmed[0].toLowerCase() + trimmed.slice(1).replaceAll("/", "\\")
  : trimmed;
```

```typescript
// file: branch
if (/^\/[a-zA-Z]:\//.test(decodedPath)) {
  const win = decodedPath.slice(1).replaceAll("/", "\\");
  return win[0].toLowerCase() + win.slice(1);
}
```

(The corresponding test in `reply-payloads.test.ts` covers only uppercase `C:` on both sides, so the case-mismatch scenario is currently untested.)

How can I resolve this? If you propose a fix, please make it concise.

@katoue

This comment was marked as spam.

@johnsonshi
Copy link
Copy Markdown
Contributor

The Windows media-dedupe fix itself looks good, and the regression test is solid.

  • In src/auto-reply/reply/reply-payloads.ts, the drive-letter normalization looks correct for the reported Windows case (file:///C:/... vs C:\\... / C:/...).
  • The new test in src/auto-reply/reply/reply-payloads.test.ts covers that behavior well.

That said, this PR also includes unrelated changes in:

  • apps/macos/Sources/OpenClawProtocol/GatewayModels.swift
  • apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift
  • src/config/schema.labels.ts

Those changes don’t appear tied to the reply-media dedupe fix and aren’t reflected in the PR title/summary. The branch history also shows they came in as a separate commit (fix(config): add push labels and sync protocol models), which reinforces that they should probably be split out.

I’d suggest trimming this PR to just the reply-media dedupe fix plus the regression test, or moving the config/protocol-model changes into a separate PR so this one stays tightly scoped.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants