High-performance HTML to Markdown conversion powered by Rust. Ships as native bindings for Rust, Python, TypeScript/Node.js, Ruby, PHP, Go, Java, C#, Elixir, R, C (FFI), and WebAssembly with identical rendering across all runtimes.
Documentation | Live Demo | API Reference
- 150-280 MB/s throughput (10-80x faster than pure Python alternatives)
- 12 language bindings with consistent output across all runtimes
- Structured result — convert() returns ConversionResult with content, metadata, tables, images, and warnings
- Metadata extraction — title, headers, links, images, structured data (JSON-LD, Microdata, RDFa)
- Visitor pattern — custom callbacks for content filtering, URL rewriting, domain-specific dialects
- Table extraction — extract structured table data (cells, headers, rendered markdown) during conversion
- Secure by default — built-in HTML sanitization via ammonia
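As a rough illustration of the structured result listed above, the sketch below reduces a ConversionResult-like mapping to a few counts. It only assumes the five field names given in the feature list and dict-style access. The helper name `summarize_result`, the list-like shape of tables, images, and warnings, and the hand-built sample are all hypothetical, not the library's documented schema.

```python
from typing import Any, Mapping


def summarize_result(result: Mapping[str, Any]) -> dict:
    """Reduce a ConversionResult-like mapping to a few counts."""
    # Missing or None fields are treated as empty rather than raising.
    content = result.get("content") or ""
    return {
        "chars": len(content),
        "tables": len(result.get("tables") or []),
        "images": len(result.get("images") or []),
        "warnings": len(result.get("warnings") or []),
        "has_metadata": bool(result.get("metadata")),
    }


# Hand-built stand-in for what convert() might return. The shapes of
# tables, images, and warnings are assumptions made for this sketch.
sample = {
    "content": "# Hello\n\nWorld",
    "metadata": {"title": None},
    "tables": [],
    "images": [],
    "warnings": [],
}

summary = summarize_result(sample)
print(summary)
```

With the package installed, the same helper should accept the value returned by convert() directly, provided the Python binding exposes these fields by key.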
# Rust
cargo add html-to-markdown-rs
# Python
pip install html-to-markdown
# TypeScript / Node.js
npm install @kreuzberg/html-to-markdown-node
# Ruby
gem install html-to-markdown
# CLI
cargo install html-to-markdown-cli
# or
brew install kreuzberg-dev/tap/html-to-markdown
See the Installation Guide for all languages including PHP, Go, Java, C#, Elixir, R, and WASM.
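Once a binding is installed, the throughput figure quoted in the feature list can be checked against your own input. The sketch below is one simple way to estimate it. The helper `measure_throughput` and the placeholder `stand_in_convert` are hypothetical names, and the sketch assumes 1 MB = 1,000,000 bytes of HTML input, which may not match how the project's own benchmarks are defined.

```python
import time
from typing import Callable


def measure_throughput(
    convert_fn: Callable[[str], object], html: str, repeats: int = 20
) -> float:
    """Return megabytes of HTML input processed per second."""
    payload_bytes = len(html.encode("utf-8"))
    start = time.perf_counter()
    for _ in range(repeats):
        convert_fn(html)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (payload_bytes * repeats) / elapsed / 1_000_000


def stand_in_convert(html: str) -> dict:
    # Placeholder so the sketch runs without the package installed.
    # Swap in html_to_markdown.convert to measure the real library.
    return {"content": html.replace("<p>", "").replace("</p>", "\n\n")}


sample_html = "<p>Hello</p>" * 10_000
mb_per_s = measure_throughput(stand_in_convert, sample_html)
print(f"{mb_per_s:.1f} MB/s")
```

Results depend heavily on document size and structure, so a representative sample of your own HTML gives a more useful number than a synthetic string.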
convert() is the single entry point. It returns a structured ConversionResult:
# Python
from html_to_markdown import convert
result = convert("<h1>Hello</h1><p>World</p>")
print(result["content"]) # # Hello\n\nWorld
print(result["metadata"]) # title, links, headings, …
// TypeScript / Node.js
import { convert } from "@kreuzberg/html-to-markdown-node";
const result = convert("<h1>Hello</h1><p>World</p>");
console.log(result.content); // # Hello\n\nWorld
console.log(result.metadata); // title, links, headings, …
// Rust
use html_to_markdown_rs::convert;
let result = convert("<h1>Hello</h1><p>World</p>", None)?;
println!("{}", result.content.unwrap_or_default());
| Language | Package | Install |
|---|---|---|
| Rust | html-to-markdown-rs | cargo add html-to-markdown-rs |
| Python | html-to-markdown | pip install html-to-markdown |
| TypeScript / Node.js | @kreuzberg/html-to-markdown-node | npm install @kreuzberg/html-to-markdown-node |
| WebAssembly | @kreuzberg/html-to-markdown-wasm | npm install @kreuzberg/html-to-markdown-wasm |
| Ruby | html-to-markdown | gem install html-to-markdown |
| PHP | kreuzberg-dev/html-to-markdown | composer require kreuzberg-dev/html-to-markdown |
| Go | htmltomarkdown | go get github.com/kreuzberg-dev/html-to-markdown/packages/go/v3 |
| Java | dev.kreuzberg:html-to-markdown | Maven / Gradle |
| C# | KreuzbergDev.HtmlToMarkdown | dotnet add package KreuzbergDev.HtmlToMarkdown |
| Elixir | html_to_markdown | add :html_to_markdown to deps in mix.exs, then mix deps.get |
| R | htmltomarkdown | install.packages("htmltomarkdown") |
| C (FFI) | releases | Pre-built .so / .dll / .dylib |
html-to-markdown is developed by kreuzberg.dev and powers the HTML conversion pipeline in Kreuzberg, a document intelligence library for extracting text from PDFs, images, and office documents.
Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines.
MIT License — see LICENSE for details.