Skip to content

feat: markdown for html bodies#1213

Merged
LeaWhoCodes merged 3 commits intofloatpane:masterfrom
YeeP79:feature/skip-markdown-for-html-bodies
May 6, 2026
Merged

feat: markdown for html bodies#1213
LeaWhoCodes merged 3 commits intofloatpane:masterfrom
YeeP79:feature/skip-markdown-for-html-bodies

Conversation

@YeeP79
Copy link
Copy Markdown
Contributor

@YeeP79 YeeP79 commented May 1, 2026

What?

Threads MIME type detection from the fetch layer through to rendering. FetchEmailBody and the IMAP/JMAP/POP3 providers now return BodyMIMEType. The Backend interface, daemon RPC protocol, and local cache schema all carry the MIME type through. view.ProcessBody takes the MIME type and skips the markdownToHTML pre-pass when it's text/html. The TUI message fetcher and view call sites are updated for the new field.

Why?

Right now every email body gets run through md4c before we render it as HTML. The assumption was that md4c would pass raw HTML through untouched, but that falls apart on real HTML email — Datadog digests, marketing stuff, anything with heavy attributes or indentation trips md4c's html_block rules. When that happens, md4c either escapes the tags or treats them as code blocks, and the TUI ends up showing literal <table>, <tr>, <td> instead of the rendered content.
Detecting text/html and skipping the markdown pass fixes that, while text/plain senders like GitHub notifications and mailing lists keep their markdown formatting.

Trade-offs & compatibility

If a text/html sender writes **bold** inside their HTML, it won't render as bold anymore. That's spec-correct behavior, and it's worth losing to get tables back. On the compat side, MIME type defaults to an empty string in the wire format and cache, so old cache entries fall back to the legacy markdown path. No migration needed.

@YeeP79 YeeP79 requested a review from a team as a code owner May 1, 2026 23:49
Copy link
Copy Markdown
Member

@floatpanebot floatpanebot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @YeeP79! Please fix the following issues with your PR:

  • Title: Does not follow conventional commits (e.g., feat: added something, fix(core): resolved crash).
  • Body: Missing the ## What? or ## Why? headings required by the PR template.

@github-actions github-actions Bot added bug Something isn't working enhancement New feature or request labels May 1, 2026
@YeeP79 YeeP79 changed the title Feature/skip markdown for html bodies feat: markdown for html bodies May 1, 2026
Copy link
Copy Markdown
Member

@floatpanebot floatpanebot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @YeeP79! Please fix the following issues with your PR:

  • Body: Missing the ## What? or ## Why? headings required by the PR template.

@floatpanebot floatpanebot dismissed stale reviews from themself May 1, 2026 23:50

Formatting issues have been resolved. Thank you!

@andrinoff
Copy link
Copy Markdown
Member

Hello, I cannot replicate what it fixes, can you describe in detail and also provide an email (HTML body) that is formatted better with your change?

@YeeP79
Copy link
Copy Markdown
Contributor Author

YeeP79 commented May 2, 2026

I'll get some screenshots and create a test if possible that should fail on the old setup.

@andrinoff andrinoff removed the bug Something isn't working label May 2, 2026
@github-actions github-actions Bot added the bug Something isn't working label May 2, 2026
@andrinoff andrinoff removed the bug Something isn't working label May 2, 2026
@YeeP79 YeeP79 force-pushed the feature/skip-markdown-for-html-bodies branch from 5983e6d to 5518618 Compare May 2, 2026 21:13
@floatpanebot floatpanebot added area/daemon Daemon / RPC area/fetcher IMAP fetch / IDLE / search area/notify Notifications area/sender SMTP send path area/tui Terminal UI / view layer bug Something isn't working labels May 3, 2026
@andrinoff
Copy link
Copy Markdown
Member

hey, you haven't sent an HTML body that is rendered better (you can screenshot side-by-side)

@YeeP79
Copy link
Copy Markdown
Contributor Author

YeeP79 commented May 3, 2026

Sorry for the hold up, my daughter is graduating college this weekend. It has been crazy for us. Monday before work I will knock out any issues you are seeing in either PR.

@andrinoff
Copy link
Copy Markdown
Member

Sorry for the hold up, my daughter is graduating college this weekend. It has been crazy for us. Monday before work I will knock out any issues you are seeing in either PR.

No worries, man, take your time, congrats on the graduation!

@YeeP79
Copy link
Copy Markdown
Contributor Author

YeeP79 commented May 3, 2026

Side note on the screenshots and raw HTML: I was seeing the issues mostly in emails from my work so I need to be kind of selective on what I show. I'll make sure the unit test I create makes it crystal clear if you run it with and without the PR.

@andrinoff
Copy link
Copy Markdown
Member

Side note on the screenshots and raw HTML: I was seeing the issues mostly in emails from my work so I need to be kind of selective on what I show. I'll make sure the unit test I create makes it crystal clear if you run it with and without the PR.

i just want to see the change, because in my inbox in the last 50 emails, i do not have a single one that is loaded differently with your fix, you can redact the information from the screenshots

@andrinoff andrinoff force-pushed the feature/skip-markdown-for-html-bodies branch from 5518618 to 33488d1 Compare May 4, 2026 07:58
@floatpanebot floatpanebot added the area/config Configuration / settings label May 4, 2026
@andrinoff
Copy link
Copy Markdown
Member

updated the branch with rebase

The renderer pipeline ran every email body through markdownToHTML before
parsing as HTML, on the assumption that CommonMark's html_block rule
would pass raw HTML through unchanged. In practice md4c's html_block
shape requirements (no leading whitespace, narrow opening-tag patterns)
are routinely violated by real-world HTML emails — Datadog, marketing
tools, and any sender with attribute-heavy <table style="..."> markup.
md4c falls back to escaping or to code-block treatment, leaking literal
tags into the rendered output.

Threads a body MIME type through the whole stack:
  fetcher.Email                    new BodyMIMEType field
  fetcher.FetchEmailBodyFromMailbox + 4 wrappers   return the type
  backend.EmailReader.FetchEmailBody              interface change
  imap/jmap/pop3 provider implementations         return the type
  daemonclient.Service.FetchEmailBody             interface change
  daemonrpc.FetchEmailBodyResult                  new BodyMIMEType field
  tui.EmailBodyFetchedMsg, PreviewBodyFetchedMsg  carry the field
  view.ProcessBody, ProcessBodyWithInline         new mimeType param

When mimeType == "text/html", the renderer hands the body straight to
htmlconv. Empty mimeType keeps the legacy markdown-first path, so cached
emails fetched before this change render unchanged.

Trade-off: HTML bodies that contain literal markdown syntax (e.g. a
poorly-templated newsletter with **asterisks** inside <p> tags) no
longer get markdown rendering. This matches the spec — Content-Type:
text/html means the body is HTML — and the regression target is rare
versus the much more common Datadog-style HTML that this fixes.

Drive-by: PGP-decrypted plain-text bodies were tagging themselves as
htmlPartID = "decrypted"; now correctly tagged plainPartID. No
behavior change pre-this-PR (extractedBody was always returned), but
required for the new MIME type tracking to be accurate.

Doesn't fix floatpane#602 (htmlconv tag coverage), but resolves the related
"raw <table> tags visible in the viewer" symptom.
@YeeP79
Copy link
Copy Markdown
Contributor Author

YeeP79 commented May 6, 2026

@andrinoff

You'll see two tests in view/html_test.go that share the same input (datadogShapeHTML — a Datadog Daily Digest-shaped table with 4-space indented attribute-heavy HTML):

  • TestProcessBody_LegacyPathManglesIndentedHTML — calls ProcessBody with an empty MIME type (the path master takes for every email today) and asserts the rendered output literally contains <table cellpadding=...>. This pins the bug: md4c's html_block rule rejects indented HTML with attributes, so the markdown pre-pass leaks raw tags through to the viewer.
  • TestProcessBody_HTMLMIMETypeSkipsMarkdownPrepass — same input, but with BodyMIMETypeHTML. Asserts the literal <table does not appear in the output. This pins the fix.

Both pass on this branch. If you cherry-pick the legacy assertion onto master and remove the mimeType parameter, the fix-side test cannot exist on master because the parameter doesn't exist there — the contract change is the fix. Run the same command from master with just the legacy test (s/ProcessBody(..., "", ...)/ProcessBody(...)/) and it passes there too, confirming master takes the broken path unconditionally.

Without this, cached bodies always re-render via the legacy markdown→HTML
pre-pass even on builds that have the MIME-type-aware renderer, because
the cache layer drops the type when serializing CachedEmailBody.

Field is omitempty so existing cache rows continue to load (with empty
MIME type, which the renderer treats as 'unknown' and falls back to
legacy behavior). New rows written after this change are tagged
correctly and refresh into proper rendering on next open.
@YeeP79 YeeP79 force-pushed the feature/skip-markdown-for-html-bodies branch from 33488d1 to 221fd40 Compare May 6, 2026 01:27
@andrinoff
Copy link
Copy Markdown
Member

/build

Comment thread fetcher/fetcher.go Outdated
body, _ := io.ReadAll(p.Body)
extractedBody = string(body)
extractedBodyMIMEType = "text/plain"
htmlPartID = "decrypted"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this htmlPartID? this will result in all emails being treated as text/html

@floatpanebot
Copy link
Copy Markdown
Member

Build complete (221fd40)

Warning

This is an unreviewed PR build and has not been security audited. It may contain bugs, vulnerabilities, or malicious code. Do not use for daily use. Only use for testing purposes.

OS Arch Download
Linux amd64 matcha_preview_linux_amd64.tar.gz
Linux arm64 matcha_preview_linux_arm64.tar.gz
macOS amd64 matcha_preview_darwin_amd64.tar.gz
macOS arm64 matcha_preview_darwin_arm64.tar.gz
Windows amd64 matcha_preview_windows_amd64.zip
Windows arm64 matcha_preview_windows_arm64.zip

Copy link
Copy Markdown
Member

@andrinoff andrinoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown
Member

@LeaWhoCodes LeaWhoCodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@LeaWhoCodes LeaWhoCodes merged commit c11de45 into floatpane:master May 6, 2026
13 checks passed
@andrinoff andrinoff removed the bug Something isn't working label May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/config Configuration / settings area/daemon Daemon / RPC area/fetcher IMAP fetch / IDLE / search area/notify Notifications area/sender SMTP send path area/tui Terminal UI / view layer enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants