fix: sanitize SESSION_ID in precompact hook (closes #110)#589
fix: sanitize SESSION_ID in precompact hook (closes #110)#589JoeProAI wants to merge 3 commits intoMemPalace:developfrom
Conversation
web3guru888
left a comment
There was a problem hiding this comment.
Good catch, and the fix is the right approach. I have an overlapping PR (#320) open on the same file targeting the same issue, so I can speak to the tradeoffs directly.
What #320 does: post-extraction sanitization via tr -cd 'a-zA-Z0-9_-'. Simple, but it's a second pass after the unsanitized value is already held in shell memory.
What this PR does: inline sanitization inside the Python extraction call, using the same re.sub(r'[^a-zA-Z0-9_/.\\-~]', '', ...) pattern as mempal_save_hook.sh. This is architecturally cleaner — consistent between hooks, single-pass, and mirrors what the save hook already does.
One minor observation: the charset here (a-zA-Z0-9_/.\\-~) allows path separators and tilde, which is appropriate for TRANSCRIPT_PATH in the save hook but isn't strictly needed for SESSION_ID in the precompact hook (which only goes to a log line). Not a problem — a valid session ID with those characters will still work fine — just noting the rationale if anyone asks why the precompact charset is broader than necessary.
The fix is correct and complete. Happy to close #320 once this lands, since this approach is cleaner.
|
Good point — the broader charset is copied from the save hook for consistency, but happy to tighten it to [^a-zA-Z0-9_-]\ for SESSION_ID if maintainers prefer. Either way works. |
|
Either works for me too — the consistency argument is reasonable since anyone reading both hooks in parallel won't be surprised by the charset mismatch. If the maintainers want to tighten it, that's a one-line change. Either way, the real fix is the inline sanitization itself, which is what matters. |
…ation (MemPalace#589) Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
Also fixed in #562 — applied the same |
bensig
left a comment
There was a problem hiding this comment.
Good catch — the unsanitized SESSION_ID in the precompact hook needed fixing.
Two notes:
1. Tighten the regex for SESSION_ID
The [^a-zA-Z0-9_/.\\-~] charset is copied from the save hook where it's also used for TRANSCRIPT_PATH (a file path). For a session ID, /, ., and ~ aren't needed. [^a-zA-Z0-9_-] would be tighter and more appropriate here.
2. The save hook's eval pattern is the bigger risk
The save hook does:
eval $(echo "$INPUT" | python3 -c "
...
print(f'SESSION_ID=\"{safe(sid)}\"')
...")Even with safe(), eval on constructed strings is fragile — if the regex ever has a gap, it's direct shell injection. This PR's approach (simple command substitution via $(...)) is actually safer than what the save hook does. Worth a follow-up issue to refactor the save hook to use the same pattern.
|
Closing — this is superseded by recently merged PRs to develop. Thank you for the contribution! |
The \mempal_save_hook.sh\ already sanitizes values via a \safe()\ regex before use. This PR applies the same pattern to \mempal_precompact_hook.sh, where \SESSION_ID\ was extracted without sanitization and used directly in a shell string.\n\nFix: strip non-alphanumeric characters from \session_id\ before assignment, consistent with the approach in the save hook.