epub

package module
v0.2.5 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 7, 2025 License: MIT Imports: 30 Imported by: 0

README

Go EPUB Library

A Go library for reading and writing EPUB publications. This library follows the EPUB 3.3 specification: https://www.w3.org/TR/epub-33/


Features

  • Parse EPUB 3.3 container, metadata, manifest, spine, guide, and navigation structures.
  • Read content documents in:
    • XHTML
    • Markdown (auto-converted)
    • SVG
  • Extract images as raw bytes or as decoded image.Image values.
  • Access package-level metadata and renditions.
  • Generate new EPUB files programmatically.

Installation

go get github.com/yourusername/epub

Quick Start (Reading)

package main

import (
	"fmt"
	"log"
	"github.com/yourusername/epub"
)

func main() {
	r, err := epub.OpenReader("example.epub")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("Title:", r.Title())
	fmt.Println("Author:", r.Author())
	fmt.Println("Language:", r.Language())

	ids := r.ListContentDocumentIds()
	for _, id := range ids {
		html := r.ReadContentHTMLById(id)
		fmt.Printf("HTML Node for %s: %v\n", id, html)
	}
}

Quick Start (Writing)

package main

import (
	"log"
	"time"
	"github.com/yourusername/epub"
)

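func main() {
	w := epub.New("pub-id-001")
	w.Title("Example Book")
	w.Author("John Doe")
	w.Languages("en") // the Writer exposes Languages(...string)
	w.Date(time.Now())

	w.AddContent("chapter1.xhtml", []byte(`<html><body><h1>Hello World</h1></body></html>`))

	// Write returns an error; check it rather than discarding it.
	if err := w.Write("output.epub"); err != nil {
		log.Fatal(err)
	}
}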

Reader API Overview

type Reader

func NewReader(b []byte) (reader Reader, err error)
func OpenReader(name string) (reader Reader, err error)

func (r *Reader) Title() string
func (r *Reader) Author() string
func (r *Reader) Identifier() string
func (r *Reader) Language() string
func (r *Reader) Metadata() map[string]any

func (r *Reader) ListContentDocumentIds() []string
func (r *Reader) ListImageIds() []string

func (r *Reader) ReadContentHTMLById(id string) *html.Node
func (r *Reader) ReadContentMarkdownById(id string) string

func (r *Reader) ReadImageById(id string) *image.Image
func (r *Reader) ImageResources() map[string][]byte

func (r *Reader) Spine() []PublicationResource
func (r *Reader) Resources() []PublicationResource

func (r *Reader) TableOfContents() (TOC, error)

Writer API Overview

type Writer

func New(pubId string) *Writer

func (w *Writer) Title(...string)
func (w *Writer) Author(string)
func (w *Writer) Languages(...string)
func (w *Writer) Date(time.Time)
func (w *Writer) Description(string)
func (w *Writer) Publisher(string)

func (w *Writer) AddContent(filename string, content []byte) PublicationResource
func (w *Writer) AddImage(name string, content []byte) PublicationResource
func (w *Writer) AddSpineItem(res PublicationResource)

func (w *Writer) Write(filename string) error

License

MIT License.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Epub

type Epub struct {
	// contains filtered or unexported fields
}

Epub represents a full EPUB publication, including metadata, package information, resources, and navigation.

func (*Epub) DefaultPackage added in v0.2.0

func (epub *Epub) DefaultPackage() *pkg.Package

func (*Epub) SelectPackage added in v0.2.0

func (epub *Epub) SelectPackage(name string) *pkg.Package

func (*Epub) SelectedPackage added in v0.2.0

func (epub *Epub) SelectedPackage() *pkg.Package

type PublicationResource

type PublicationResource struct {
	ID         string
	Href       string
	MIMEType   string
	Content    []byte
	Filepath   string
	Properties pkg.ManifestProperty
}

PublicationResource represents a single resource inside the EPUB container. Resources may include XHTML documents, images, stylesheets, SVG files, and auxiliary data referenced by the publication manifest.
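
As an illustration of working with a list of resources, the following sketch filters by MIME type prefix. It does not import the library: the `resource` struct is a local stand-in that mirrors a subset of the documented fields, and `filterByMIMEPrefix` and the sample data are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// resource is a local stand-in mirroring a subset of the documented
// PublicationResource fields (an assumption so the sketch compiles alone).
type resource struct {
	ID       string
	Href     string
	MIMEType string
	Content  []byte
}

// filterByMIMEPrefix returns the resources whose MIME type starts with
// prefix, preserving the input order.
func filterByMIMEPrefix(resources []resource, prefix string) []resource {
	var out []resource
	for _, res := range resources {
		if strings.HasPrefix(res.MIMEType, prefix) {
			out = append(out, res)
		}
	}
	return out
}

func main() {
	// Made-up manifest entries for demonstration.
	resources := []resource{
		{ID: "ch1", Href: "text/chapter1.xhtml", MIMEType: "application/xhtml+xml"},
		{ID: "cover", Href: "images/cover.png", MIMEType: "image/png"},
		{ID: "css", Href: "styles/main.css", MIMEType: "text/css"},
	}
	for _, res := range filterByMIMEPrefix(resources, "image/") {
		fmt.Println(res.ID, res.Href)
	}
}
```

With the real library, the same loop would presumably run over the slice returned by Reader.Resources.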

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader provides an interface for reading and accessing EPUB publication data. It offers methods for retrieving metadata, navigation structures, content documents, resources, and images.

func NewReader

func NewReader(b []byte) (reader Reader, err error)

NewReader creates a new Reader instance from a raw EPUB byte slice. The byte slice must represent a valid EPUB container.
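
What "valid EPUB container" means at the lowest level can be sketched with the standard library alone. An EPUB container is a ZIP archive whose first entry is an uncompressed file named `mimetype` containing `application/epub+zip`. The helper names `looksLikeEPUB` and `minimalContainer` are hypothetical, and this check is only a cheap pre-flight test, not what NewReader itself does.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// looksLikeEPUB reports whether b is a ZIP archive whose first entry is a
// "mimetype" file containing exactly "application/epub+zip".
// It validates nothing else about the publication.
func looksLikeEPUB(b []byte) bool {
	zr, err := zip.NewReader(bytes.NewReader(b), int64(len(b)))
	if err != nil || len(zr.File) == 0 {
		return false
	}
	first := zr.File[0]
	if first.Name != "mimetype" {
		return false
	}
	rc, err := first.Open()
	if err != nil {
		return false
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	if err != nil {
		return false
	}
	return string(data) == "application/epub+zip"
}

// minimalContainer builds an in-memory ZIP holding only the mimetype entry.
// It is test data for the check above, not a complete EPUB.
func minimalContainer() ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	// The mimetype entry is stored without compression (zip.Store).
	w, err := zw.CreateHeader(&zip.FileHeader{Name: "mimetype", Method: zip.Store})
	if err != nil {
		return nil, err
	}
	if _, err := w.Write([]byte("application/epub+zip")); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	b, err := minimalContainer()
	if err != nil {
		panic(err)
	}
	fmt.Println(looksLikeEPUB(b))                    // true
	fmt.Println(looksLikeEPUB([]byte("not a zip"))) // false
}
```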

func OpenReader

func OpenReader(name string) (reader Reader, err error)

OpenReader opens an EPUB file from the provided file path and returns a Reader instance. The file must exist and be a valid EPUB container.

func (*Reader) Author

func (r *Reader) Author() (author string)

Author returns the author (creator) metadata of the publication.

func (*Reader) ContentDocumentMarkdown

func (r *Reader) ContentDocumentMarkdown() (documents map[string]string)

ContentDocumentMarkdown returns content documents converted into Markdown form. The returned map is keyed by EPUB manifest item ID.

func (*Reader) ContentDocumentSVG

func (r *Reader) ContentDocumentSVG() (documents map[string]*html.Node)

ContentDocumentSVG returns SVG content documents parsed into html.Node trees. The returned map is keyed by EPUB manifest item ID.

func (*Reader) ContentDocumentXHTML

func (r *Reader) ContentDocumentXHTML() (documents map[string]*html.Node)

ContentDocumentXHTML returns XHTML content documents parsed into html.Node trees. The returned map is keyed by EPUB manifest item ID.

func (*Reader) ContentDocumentXHTMLString

func (r *Reader) ContentDocumentXHTMLString() (documents map[string]string)

ContentDocumentXHTMLString returns XHTML content documents as raw strings. The returned map is keyed by EPUB manifest item ID.

func (*Reader) Cover

func (r *Reader) Cover() (cover *image.Image)

Cover returns the publication's cover image if present.

func (*Reader) CoverBytes added in v0.2.0

func (r *Reader) CoverBytes() (cover []byte, err error)

CoverBytes returns the raw byte representation of the cover image. An error is returned if the publication does not define a cover.

func (*Reader) CurrentSelectedPackage

func (r *Reader) CurrentSelectedPackage() *pkg.Package

CurrentSelectedPackage returns the currently active package rendition. EPUB publications may have multiple renditions.

func (*Reader) CurrentSelectedPackagePath

func (r *Reader) CurrentSelectedPackagePath() string

CurrentSelectedPackagePath returns the internal path to the currently selected package document.

func (*Reader) Description

func (r *Reader) Description() (description string)

Description returns the publication's description metadata if defined.

func (*Reader) Identifier added in v0.2.0

func (r *Reader) Identifier() (identifier string)

Identifier returns the primary identifier of the publication as declared in the package metadata (often equivalent to UID).

func (*Reader) ImageResources added in v0.2.0

func (r *Reader) ImageResources() (images map[string][]byte)

ImageResources returns all image resources in byte form, keyed by manifest ID. Useful when direct decoding to image.Image is not required.
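
When only the format and dimensions are needed, raw image bytes can be inspected without a full decode. The sketch below uses the standard library's image.DecodeConfig on a map with the same shape as the documented return value. The helpers `describe` and `samplePNG` and the sample map are hypothetical and exist only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	_ "image/jpeg" // registers the JPEG decoder
	"image/png"    // PNG encoder; importing it also registers the decoder
)

// describe reports the format name and pixel dimensions of an encoded
// image without decoding its pixel data.
func describe(raw []byte) (string, int, int, error) {
	cfg, format, err := image.DecodeConfig(bytes.NewReader(raw))
	if err != nil {
		return "", 0, 0, err
	}
	return format, cfg.Width, cfg.Height, nil
}

// samplePNG encodes a 4x2 single-colour image as stand-in test data.
func samplePNG() ([]byte, error) {
	img := image.NewRGBA(image.Rect(0, 0, 4, 2))
	for y := 0; y < 2; y++ {
		for x := 0; x < 4; x++ {
			img.Set(x, y, color.RGBA{R: 200, G: 30, B: 30, A: 255})
		}
	}
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	raw, err := samplePNG()
	if err != nil {
		panic(err)
	}
	// Same shape as the documented map: manifest ID -> raw bytes.
	images := map[string][]byte{"cover": raw}
	for id, b := range images {
		format, w, h, err := describe(b)
		if err != nil {
			fmt.Println(id, "error:", err)
			continue
		}
		fmt.Printf("%s: %s %dx%d\n", id, format, w, h) // cover: png 4x2
	}
}
```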

func (*Reader) Images

func (r *Reader) Images() (images map[string]image.Image)

Images returns all image resources in the publication, keyed by manifest ID.

func (*Reader) Language added in v0.2.0

func (r *Reader) Language() (language string)

Language returns the primary language of the publication, as declared in the package metadata (dc:language).

func (*Reader) ListContentDocumentIds added in v0.1.2

func (r *Reader) ListContentDocumentIds() (ids []string)

ListContentDocumentIds returns the IDs of all content documents (XHTML/SVG) registered in the publication manifest.

func (*Reader) ListImageIds added in v0.1.2

func (r *Reader) ListImageIds() (ids []string)

ListImageIds returns the IDs of all image-based resources (e.g., PNG, JPEG, SVG) in the publication manifest.

func (*Reader) Metadata

func (r *Reader) Metadata() (metadata map[string]any)

Metadata returns the complete metadata block of the publication. The returned map may include standard as well as extended metadata fields.

func (*Reader) NavigationCenterExtended

func (r *Reader) NavigationCenterExtended() *ncx.NCX

NavigationCenterExtended returns the NCX navigation document (if available). This is primarily used for EPUB 2.x backward compatibility.

func (*Reader) ReadContentHTMLByHref added in v0.1.3

func (r *Reader) ReadContentHTMLByHref(href string) (doc *html.Node)

ReadContentHTMLByHref returns the content document associated with the given manifest href. The returned document is parsed into an html.Node tree.

func (*Reader) ReadContentHTMLById

func (r *Reader) ReadContentHTMLById(id string) (doc *html.Node)

ReadContentHTMLById returns the XHTML/HTML content document associated with the given manifest ID, parsed into an html.Node tree.

func (*Reader) ReadContentMarkdownById added in v0.1.1

func (r *Reader) ReadContentMarkdownById(id string) (md string)

ReadContentMarkdownById returns a Markdown string representation of the content document associated with the given manifest ID.

func (*Reader) ReadImageByHref

func (r *Reader) ReadImageByHref(href string) (img *image.Image)

ReadImageByHref returns the image resource referenced by the given href, if present in the manifest.

func (*Reader) ReadImageById

func (r *Reader) ReadImageById(id string) (img *image.Image)

ReadImageById returns the image resource associated with the given manifest ID.

func (*Reader) References added in v0.2.1

func (r *Reader) References() (references map[pkg.GuideReferenceType]*html.Node)

References returns the structural guide references defined in the package, such as "cover", "title-page", and "toc". The returned map is keyed by reference type; each value is the corresponding HTML content.

func (*Reader) Refines

func (r *Reader) Refines() (refines map[string]map[string][]string)

Refines returns metadata refinement relationships. The returned map is keyed by subject identifier, mapping to properties and their assigned values.
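
The nested map shape (subject, then property, then values) can be flattened for printing or logging. The sketch below works on a literal with the documented type. The sample subjects and properties are made up, `flattenRefines` is a hypothetical helper, and whether real subject keys carry a leading "#" is not stated here.

```go
package main

import (
	"fmt"
	"sort"
)

// flattenRefines turns subject -> property -> values into one
// "subject property=value" line per value, sorted so the output is stable
// (Go map iteration order is randomized).
func flattenRefines(refines map[string]map[string][]string) []string {
	var lines []string
	for subject, props := range refines {
		for prop, values := range props {
			for _, v := range values {
				lines = append(lines, fmt.Sprintf("%s %s=%s", subject, prop, v))
			}
		}
	}
	sort.Strings(lines)
	return lines
}

func main() {
	// Made-up refinement data for demonstration.
	refines := map[string]map[string][]string{
		"creator01": {"role": {"aut"}, "file-as": {"Doe, John"}},
		"title01":   {"title-type": {"main"}},
	}
	for _, line := range flattenRefines(refines) {
		fmt.Println(line)
	}
}
```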

func (*Reader) Resources

func (r *Reader) Resources() (resources []PublicationResource)

Resources returns all publication resources declared in the manifest.

func (*Reader) SelectPackageRendition

func (r *Reader) SelectPackageRendition(rendition string)

SelectPackageRendition changes the active package rendition by its rendition identifier. Useful when multiple reading layouts are available.

func (*Reader) SelectResourceByHref

func (r *Reader) SelectResourceByHref(href string) (resource *PublicationResource)

SelectResourceByHref retrieves a resource referenced by its manifest href.

func (*Reader) SelectResourceById

func (r *Reader) SelectResourceById(id string) (resource *PublicationResource)

SelectResourceById retrieves a resource referenced by its manifest ID.

func (*Reader) Spine

func (r *Reader) Spine() (orderedResources []PublicationResource)

Spine returns the publication's spine: the resources in their default reading order.

func (*Reader) TableOfContents

func (r *Reader) TableOfContents() (toc TOC, err error)

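TableOfContents returns the publication's table of contents, built from whichever navigation source is present (the EPUB 3 NAV document or the EPUB 2 NCX). If both exist, the choice depends on the publication version and the library's priority rules.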

func (*Reader) Title

func (r *Reader) Title() (title string)

Title returns the publication's title metadata.

func (*Reader) UID

func (r *Reader) UID() (identifier string)

UID returns the unique identifier of the publication.

func (*Reader) Version

func (r *Reader) Version() (version string)

Version returns the EPUB specification version of the publication.

type TOC added in v0.1.3

type TOC struct {
	Title string `json:"title,omitempty"`
	Href  string `json:"href,omitempty"`
	Items []TOC  `json:"items,omitempty"`
	// contains filtered or unexported fields
}

TOC represents the publication's table of contents in normalized form. The structure abstracts differences between NAV (EPUB 3) and NCX (EPUB 2) so higher-level code can work with a unified interface.
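
Because each entry carries its own Items slice, the structure is a tree and can be walked recursively. The sketch below prints an indented outline. It uses a local `tocEntry` type mirroring the exported fields rather than the library's TOC, and `outline` and the sample entries are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// tocEntry mirrors the exported fields of the documented TOC type
// (a local stand-in so the sketch compiles without the library).
type tocEntry struct {
	Title string
	Href  string
	Items []tocEntry
}

// outline renders entries depth-first, indenting two spaces per level.
func outline(entries []tocEntry, depth int) []string {
	var lines []string
	for _, e := range entries {
		indent := strings.Repeat("  ", depth)
		lines = append(lines, fmt.Sprintf("%s%s (%s)", indent, e.Title, e.Href))
		lines = append(lines, outline(e.Items, depth+1)...)
	}
	return lines
}

func main() {
	// Made-up table of contents for demonstration.
	toc := []tocEntry{
		{Title: "Part I", Href: "part1.xhtml", Items: []tocEntry{
			{Title: "Chapter 1", Href: "chapter1.xhtml"},
			{Title: "Chapter 2", Href: "chapter2.xhtml"},
		}},
		{Title: "Appendix", Href: "appendix.xhtml"},
	}
	for _, line := range outline(toc, 0) {
		fmt.Println(line)
	}
}
```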

func (*TOC) JSON added in v0.1.3

func (t *TOC) JSON() (b []byte, err error)

JSON marshals the table of contents structure into JSON format. This is useful for external tools, logging, debugging, or serialization to other formats.
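
The struct tags shown on the TOC type suggest the general shape of the output. The sketch below marshals a local `tocNode` type that copies those tags, using encoding/json directly. The real JSON method may differ (for example in indentation or in how unexported fields affect the result), so treat this as an approximation of the shape only. `tocNode` and `encode` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tocNode copies the documented field names and JSON tags of TOC.
// Empty fields are dropped from the output because of omitempty.
type tocNode struct {
	Title string    `json:"title,omitempty"`
	Href  string    `json:"href,omitempty"`
	Items []tocNode `json:"items,omitempty"`
}

// encode marshals a node to a compact JSON string.
func encode(n tocNode) (string, error) {
	b, err := json.Marshal(n)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	root := tocNode{
		Title: "Book",
		Items: []tocNode{{Title: "Chapter 1", Href: "chapter1.xhtml"}},
	}
	s, err := encode(root)
	if err != nil {
		panic(err)
	}
	// The root has no href, so that key is omitted.
	fmt.Println(s)
	// {"title":"Book","items":[{"title":"Chapter 1","href":"chapter1.xhtml"}]}
}
```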

func (*TOC) ReadContentHTML added in v0.1.3

func (t *TOC) ReadContentHTML() (content *html.Node)

ReadContentHTML returns the content document associated with the currently selected table of contents entry, parsed into an html.Node tree. The result depends on the TOC's internal selection state.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer provides an interface for constructing, modifying, and writing EPUB publications to disk or memory. Writer usage documentation is evolving.

func New added in v0.2.0

func New(pubId string) *Writer

New creates a new Writer with the given publication identifier. The identifier is assigned to the package metadata (dc:identifier).

func (*Writer) AddContent added in v0.2.0

func (w *Writer) AddContent(filename string, content []byte) (res PublicationResource)

AddContent adds a content file (such as XHTML or SVG) to the publication using the provided filename and raw bytes. Returns the created resource.

func (*Writer) AddContentFile added in v0.2.0

func (w *Writer) AddContentFile(name string) (res PublicationResource, err error)

AddContentFile adds a content file to the publication by reading the file from disk. Returns the created resource and any file access error.

func (*Writer) AddGuide added in v0.2.0

func (w *Writer) AddGuide(kind pkg.GuideReferenceType, href string, title string)

AddGuide adds a guide reference entry (e.g., "cover", "toc", "title-page") to the package document.

func (*Writer) AddImage added in v0.2.0

func (w *Writer) AddImage(name string, content []byte) (res PublicationResource)

AddImage adds an image resource from raw bytes to the publication.

func (*Writer) AddImageFile added in v0.2.0

func (w *Writer) AddImageFile(name string) (res PublicationResource)

AddImageFile adds an image resource to the publication by reading from disk.

func (*Writer) AddSpineItem added in v0.2.0

func (w *Writer) AddSpineItem(res PublicationResource)

AddSpineItem appends the given resource to the spine reading order.

func (*Writer) Author added in v0.2.0

func (w *Writer) Author(creator string)

Author sets the primary creator/author in the package metadata.

func (*Writer) Contributor added in v0.2.0

func (w *Writer) Contributor(kind string, contributor string)

Contributor adds a contributor entry of the specified role or type.

func (*Writer) Cover added in v0.2.0

func (w *Writer) Cover(cover []byte) (err error)

Cover sets the publication cover from a raw image byte slice.
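
Before handing a byte slice to Cover, a caller may want to confirm that it actually holds an image. The sketch below sniffs the media type with the standard library's http.DetectContentType and accepts PNG and JPEG, the two formats the Writer has dedicated helpers for. It says nothing about how Cover itself detects the type, and `sniffCover` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniffCover returns the sniffed media type of raw and whether it is
// PNG or JPEG. Only the leading bytes are inspected; the image is not decoded.
func sniffCover(raw []byte) (string, bool) {
	mediaType := http.DetectContentType(raw)
	ok := mediaType == "image/png" || mediaType == "image/jpeg"
	return mediaType, ok
}

func main() {
	// The 8-byte PNG signature is enough for sniffing purposes.
	pngHeader := []byte{0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}
	mt, ok := sniffCover(pngHeader)
	fmt.Println(mt, ok) // image/png true

	mt, ok = sniffCover([]byte("hello"))
	fmt.Println(mt, ok) // text/plain; charset=utf-8 false
}
```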

func (*Writer) CoverFile added in v0.2.0

func (w *Writer) CoverFile(name string)

CoverFile sets the publication cover image by file path.

func (*Writer) CoverJPG added in v0.2.0

func (w *Writer) CoverJPG(cover image.Image) (err error)

CoverJPG sets the publication cover image from an image.Image encoded as JPEG.

func (*Writer) CoverPNG added in v0.2.0

func (w *Writer) CoverPNG(cover image.Image) (err error)

CoverPNG sets the publication cover image from an image.Image encoded as PNG.

func (*Writer) Creator added in v0.2.0

func (w *Writer) Creator(id string, creator string)

Creator adds a creator with a specific identifier attribute to the metadata.

func (*Writer) Date added in v0.2.0

func (w *Writer) Date(date time.Time)

Date sets the publication date metadata.

func (*Writer) Description added in v0.2.0

func (w *Writer) Description(description string)

Description sets a short description or summary for the publication.

func (*Writer) Direction added in v0.2.0

func (w *Writer) Direction(dir string)

Direction sets the writing direction (ltr or rtl) used by the spine.

func (*Writer) DublinCores added in v0.2.0

func (w *Writer) DublinCores(keyVal map[string]string)

DublinCores sets multiple Dublin Core metadata fields at once.

func (*Writer) Identifiers added in v0.2.0

func (w *Writer) Identifiers(identifier ...string)

Identifiers adds one or more identifiers to the package metadata.

func (*Writer) Languages added in v0.2.0

func (w *Writer) Languages(language ...string)

Languages adds one or more language codes to the publication metadata.

func (*Writer) LongDescription added in v0.2.0

func (w *Writer) LongDescription(description string)

LongDescription sets an extended descriptive summary.

func (*Writer) Meta added in v0.2.0

func (w *Writer) Meta(meta pkg.Meta)

Meta adds a meta element to the package metadata as-is.

func (*Writer) MetaContent added in v0.2.0

func (w *Writer) MetaContent(keyVal map[string]string)

MetaContent adds metadata key/value entries that do not require refinements.

func (*Writer) MetaProperty added in v0.2.0

func (w *Writer) MetaProperty(id string, property string, value string)

MetaProperty adds a property-based metadata refinement entry.

func (*Writer) Publisher added in v0.2.0

func (w *Writer) Publisher(publisher string)

Publisher sets the publication publisher.

func (*Writer) Refines added in v0.2.0

func (w *Writer) Refines(refines string, property string, value string, otherAttributes ...pkg.Meta)

Refines applies a metadata refinement to an existing metadata item.

func (*Writer) Rights added in v0.2.0

func (w *Writer) Rights(rights string)

Rights sets the copyright or licensing information for the publication.

func (*Writer) SetContentDir added in v0.2.0

func (w *Writer) SetContentDir(dir string)

SetContentDir sets the directory used for storing content documents.

func (*Writer) SetImageDir added in v0.2.0

func (w *Writer) SetImageDir(dir string)

SetImageDir sets the directory used for storing image resources.

func (*Writer) SetTextDir added in v0.2.0

func (w *Writer) SetTextDir(dir string)

SetTextDir sets the directory used for text document organization.

func (*Writer) Subject added in v0.2.0

func (w *Writer) Subject(id string, subject string)

Subject adds a subject or theme classification to the publication.

func (*Writer) TableOfContents added in v0.2.0

func (w *Writer) TableOfContents(name string, toc TOC) (err error)

func (*Writer) Title added in v0.2.0

func (w *Writer) Title(title ...string)

Title sets one or more title entries in the metadata.

func (*Writer) Write added in v0.2.0

func (w *Writer) Write(filename string) (err error)

Write finalizes the EPUB structure and writes it to the specified filename.
