Website to EPUB and PDF Converter

This application extracts text content from websites and converts it into both EPUB and PDF ebook formats. It was specifically designed to handle the frameset-based structure of wellsofgrace.com.

Features

Extracts text from frameset-based websites
Handles multiple chapters automatically
Converts content to EPUB format
Converts content to PDF format with Chinese font support
Supports Chinese text
Retries failed requests automatically
Cleans up content for better readability
Generates a table of contents for the PDF
Applies consistent page formatting and styling
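
The content cleanup listed above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual implementation: the names `TextExtractor` and `clean_text` are hypothetical, and the choice of which tags to skip or treat as line breaks is an assumption.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "br", "div", "h1", "h2", "h3", "li", "tr"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            # Start a new line at block-level elements.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def clean_text(html):
    """Return the visible text of an HTML page, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Treat non-breaking spaces like ordinary spaces.
    text = "".join(parser.parts).replace("\u00a0", " ")
    # Collapse runs of spaces and tabs, then drop empty lines.
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

Because the parser works on already-decoded text, Chinese characters pass through unchanged; only whitespace and markup are touched.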

Requirements

Install the required dependencies:

pip3 install -r requirements.txt

Usage

EPUB Only

Extract from the default URL and create EPUB:

python3 extract_ebook.py

Custom URL and output file:

python3 extract_ebook.py "https://example.com/book/index.htm" "my_book.epub"

PDF Only

Extract from the default URL and create PDF:

python3 pdf_converter.py

Custom URL and output file:

python3 pdf_converter.py "https://example.com/book/index.htm" "my_book.pdf"

Both EPUB and PDF

Extract from the default URL and create both formats:

python3 convert_to_both.py

Custom URL and output name (will create .epub and .pdf files):

python3 convert_to_both.py "https://example.com/book/index.htm" "my_book"
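
For illustration, the two optional positional arguments shown above could be interpreted as follows. This is a sketch under stated assumptions rather than the code in `convert_to_both.py`: the function name `parse_args` and the fallback base name `extracted_book` are hypothetical, while the default URL is the one given in this README.

```python
from pathlib import Path

# Default URL as stated in this README.
DEFAULT_URL = "https://wellsofgrace.com/books/spiritual/rskndam/index.htm"
# Hypothetical fallback base name for the output files.
DEFAULT_NAME = "extracted_book"


def parse_args(argv):
    """Return (url, epub_path, pdf_path) from a list of positional arguments."""
    url = argv[0] if len(argv) > 0 else DEFAULT_URL
    base = Path(argv[1]) if len(argv) > 1 else Path(DEFAULT_NAME)
    # Drop a trailing .epub or .pdf so both outputs share one base name.
    if base.suffix.lower() in {".epub", ".pdf"}:
        base = base.with_suffix("")
    return url, Path(f"{base}.epub"), Path(f"{base}.pdf")
```

A script would typically call `parse_args(sys.argv[1:])`, so running it with no arguments falls back to the default URL and base name.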

Files

ebook_extractor.py - Main extraction class with all functionality
extract_ebook.py - Command-line interface for EPUB creation
pdf_converter.py - Command-line interface for PDF creation
convert_to_both.py - Command-line interface for both formats
requirements.txt - Python dependencies
debug_fetch.py - Debug script for testing website fetching

How It Works

Frameset Detection: Detects if the website uses framesets and extracts frame sources
Contents Frame: Identifies navigation frames and extracts chapter links
Chapter Processing: Fetches each chapter and extracts text content
EPUB Generation: Creates a properly formatted EPUB file with all chapters
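
The first two steps above can be sketched with the standard library alone. This is an illustrative sketch, not the code in `ebook_extractor.py`: the names `AttributeCollector`, `frame_urls`, and `chapter_links` are hypothetical, and treating every link in the contents frame as a chapter is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AttributeCollector(HTMLParser):
    """Collect one attribute from every start tag whose name is in `tags`."""

    def __init__(self, tags, attr):
        super().__init__()
        self.tags = set(tags)
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            value = dict(attrs).get(self.attr)
            if value:
                self.values.append(value)


def _collect(html, base_url, tags, attr):
    parser = AttributeCollector(tags, attr)
    parser.feed(html)
    parser.close()
    # Resolve relative references against the page they came from.
    return [urljoin(base_url, value) for value in parser.values]


def frame_urls(html, base_url):
    """Absolute URLs of all frames; an empty list means no frames were found."""
    return _collect(html, base_url, ("frame", "iframe"), "src")


def chapter_links(html, base_url):
    """Absolute URLs of all links in a contents (navigation) frame."""
    return _collect(html, base_url, ("a",), "href")
```

Given the HTML of an index page, `frame_urls` returns the frame documents to fetch next; passing the HTML of the navigation frame to `chapter_links` then yields the chapter URLs in document order.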

Default Target

The application is preconfigured to extract content from: https://wellsofgrace.com/books/spiritual/rskndam/index.htm

With the default target, the output is the book 《认识苦难的奥秘》 (Understanding the Mystery of Suffering).

Output

The application creates ebook files in two formats:

EPUB Format

EPUB files can be read with any standard ebook reader, such as:

Apple Books
Adobe Digital Editions
Calibre
FBReader
And many others

PDF Format

PDF files can be read with any PDF reader, such as:

Adobe Acrobat Reader
Preview (macOS)
Built-in browser PDF viewers
Mobile PDF readers
And many others

Features of PDF Output

Proper Chinese font support
Table of contents
Chapter navigation
Professional formatting
Justified text alignment

Error Handling

The application handles the following failure cases:

Network timeouts and retries
Empty content detection
Alternative content extraction methods
Graceful degradation for missing chapters
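
Retries, empty content detection, and skipping of missing chapters could be combined as in the sketch below. It is not the project's implementation: the function names, the retry count, the delay, and the UTF-8 decoding are all assumptions (older Chinese-language sites may use a different encoding).

```python
import time
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Download a page and decode it, assuming UTF-8 (an assumption)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_with_retries(fetch, url, attempts=3, delay=1.0):
    """Call fetch(url) until it returns non-empty text or attempts run out.

    Returns the text, or None if every attempt failed, so the caller can
    skip the chapter instead of aborting the whole book.
    """
    for attempt in range(1, attempts + 1):
        try:
            text = fetch(url)
            if text and text.strip():
                return text
        except OSError:
            # urllib's URLError and socket timeouts are OSError subclasses.
            pass
        if attempt < attempts:
            # Wait a little longer after each failed attempt.
            time.sleep(delay * attempt)
    return None
```

A caller would pass `fetch_text` as the `fetch` argument and drop any chapter for which the result is `None`. Taking the fetch function as a parameter also makes the retry logic easy to exercise without network access.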
