RingHTML

A high-performance HTML5 parser with CSS selectors and DOM manipulation for the Ring programming language.

RingHTML is a powerful Ring library for parsing and manipulating HTML documents. It provides a simple and intuitive API for extracting data, navigating the DOM tree, and building HTML programmatically.

This project is made possible by the blazing-fast Lexbor HTML5 parser.

✨ Features

  • High-Performance: Powered by Lexbor, one of the fastest HTML5 parsers available.
  • CSS Selectors: Full support for CSS selectors (#id, .class, tag, parent child, etc.).
  • DOM Navigation: Traverse parents, children, siblings, and first/last children.
  • DOM Manipulation: Create, modify, insert, and remove nodes programmatically.
  • Content Extraction: Extract text, HTML, innerHTML, and attributes from any element.
  • Unicode Support: Full international character support for multilingual content.

🚀 Getting Started

Follow these instructions to get the RingHTML library up and running on your system.

Prerequisites

  • Ring: Version 1.24 or higher.

Installation

Install using Ring Package Manager (RingPM)

ringpm install ring-html from ysdragon

💻 Usage

Parsing HTML and extracting data is straightforward. Here's a simple example:

# Load the RingHTML library
load "html.ring"

# Parse an HTML document
doc = new HTML('
<html>
<body>
    <h1 class="title">Welcome to RingHTML!</h1>
    <div class="content">
        <p>This is a <strong>powerful</strong> HTML parser.</p>
        <ul>
            <li><a href="/docs">Documentation</a></li>
            <li><a href="/examples">Examples</a></li>
        </ul>
    </div>
</body>
</html>
')

# Find elements using CSS selectors
title = doc.find("h1.title")[1]
see "Title: " + title.text() + nl

# Extract all links
links = doc.find("a")
for link in links
    see "Link: " + link.text() + " -> " + link.attr("href") + nl
next

For more advanced examples, see the examples/ directory.

📚 API Reference

HTML Class

The main document parser class.

Method                  Description
new HTML(html)          Parse HTML string and create document
find(selector)          Find all elements matching CSS selector → [HTMLNode, ...]
body()                  Get the <body> element → HTMLNode
head()                  Get the <head> element → HTMLNode
root() / html()         Get the <html> root element → HTMLNode
createNode(tagName)     Create a new element → HTMLNode
createTextNode(text)    Create a new text node → HTMLNode
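
As a rough sketch of how these document-level methods can be combined, the snippet below parses a small page, fetches the head and body, and builds a new paragraph from an element and a text node. The markup and variable names are invented for illustration; only methods listed in this reference are called, and the exact output formatting is not guaranteed.

```ring
load "html.ring"

# Illustrative markup, not taken from the project
doc = new HTML('<html><head><title>Demo</title></head><body><p>Hi</p></body></html>')

headNode = doc.head()
bodyNode = doc.body()
see "Head: " + headNode.innerHTML() + nl

# Build <p>Added from Ring</p> and attach it to the body
para = doc.createNode("p")
label = doc.createTextNode("Added from Ring")
para.appendChild(label)
bodyNode.appendChild(para)

see "Body: " + bodyNode.innerHTML() + nl
```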

HTMLNode Class

Represents a DOM node with full navigation and manipulation capabilities.

Content Extraction

Method         Description
text()         Get combined text content of node and children
html()         Get outer HTML (includes the tag itself)
innerHTML()    Get inner HTML (children only)
tag()          Get tag name (e.g., "div", "p")
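
The difference between the four extraction methods is easiest to see side by side. This is a minimal sketch with made-up markup; it prints whatever each method returns rather than asserting a particular format.

```ring
load "html.ring"

# Illustrative markup, not taken from the project
doc = new HTML('<div id="box"><p>Hello <b>world</b></p></div>')
box = doc.find("#box")[1]

see "Tag:   " + box.tag() + nl        # tag name of the element
see "Text:  " + box.text() + nl       # text of the node and its descendants
see "Inner: " + box.innerHTML() + nl  # markup of the children only
see "Outer: " + box.html() + nl       # markup including the <div> itself
```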

Attributes

Method                       Description
attr(name)                   Get attribute value
has_attr(name)               Check if attribute exists → bool
attributes()                 Get all attributes → [[name, value], ...]
setAttribute(name, value)    Set or update attribute
removeAttribute(name)        Remove an attribute
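
A short sketch of reading and changing attributes, assuming a single hypothetical link element. It relies on attributes() returning name/value pairs in the shape shown in the table.

```ring
load "html.ring"

# Illustrative markup, not taken from the project
doc = new HTML('<a id="home" href="/index.html" class="nav">Home</a>')
link = doc.find("a")[1]

# Read an attribute only if it is present
if link.has_attr("href")
    see "href = " + link.attr("href") + nl
ok

# Add one attribute and drop another
link.setAttribute("target", "_blank")
link.removeAttribute("class")

# Each entry is a [name, value] pair
attrs = link.attributes()
for pair in attrs
    see pair[1] + " = " + pair[2] + nl
next
```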

Navigation

Method            Description
find(selector)    Find descendants matching selector
parent()          Get parent node
children()        Get all child nodes → [HTMLNode, ...]
firstChild()      Get first child element
lastChild()       Get last child element
next_sibling()    Get next sibling node
prev_sibling()    Get previous sibling node
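
The following sketch walks a small list using the navigation methods. The markup is hypothetical and deliberately contains no whitespace between tags, on the assumption that whitespace could otherwise appear as text nodes among the children and siblings.

```ring
load "html.ring"

# Illustrative markup with no whitespace between tags
doc = new HTML('<ul id="menu"><li>One</li><li>Two</li><li>Three</li></ul>')
menu = doc.find("#menu")[1]

items = menu.children()
see "Children: " + len(items) + nl

firstItem = menu.firstChild()
lastItem = menu.lastChild()
see "First: " + firstItem.text() + nl
see "Last:  " + lastItem.text() + nl

# Step sideways from the first item, then back up to the list
secondItem = firstItem.next_sibling()
see "After first: " + secondItem.text() + nl

owner = secondItem.parent()
see "Parent tag: " + owner.tag() + nl
```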

Manipulation

Method                Description
appendChild(node)     Append a child node
insertBefore(node)    Insert node before this one
insertAfter(node)     Insert node after this one
remove()              Remove this node from DOM
setInnerText(text)    Set text content, replacing all children
setInnerHTML(html)    Set inner HTML, replacing all children
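
To illustrate the manipulation methods together, this sketch edits a hypothetical menu: it appends one item, inserts another before the first existing item, and removes an old one. New elements come from createNode() on the document, as described in the HTML class reference.

```ring
load "html.ring"

# Illustrative markup, not taken from the project
doc = new HTML('<ul id="menu"><li>Home</li><li>Old</li></ul>')
menu = doc.find("#menu")[1]
items = menu.find("li")

# Append a new <li> at the end of the list
about = doc.createNode("li")
about.setInnerHTML("<a href='/about'>About</a>")
menu.appendChild(about)

# Insert a new <li> before the first existing item
intro = doc.createNode("li")
intro.setInnerText("Intro")
items[1].insertBefore(intro)

# Drop the second original item ("Old")
items[2].remove()

see menu.html() + nl
```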

🛠️ Development

If you want to contribute to the development of RingHTML or build it from source, follow these steps.

Prerequisites

  • CMake: Version 3.16 or higher.
  • C Compiler: A C compiler compatible with your platform (e.g., GCC, Clang, MSVC).
  • Ring: You need to have the Ring language source code available on your machine.

Build Steps

  1. Clone the Repository:

    git clone https://github.com/ysdragon/ring-html.git --recursive
    cd ring-html
  2. Set the RING Environment Variable: Before running CMake, set the RING environment variable to point to the Ring language source code.

    • Windows (Command Prompt):
      set RING=X:\path\to\ring
    • Windows (PowerShell):
      $env:RING = "X:\path\to\ring"
    • Unix/Linux/macOS:
      export RING=/path/to/ring
  3. Configure with CMake:

    mkdir build
    cd build
    cmake ..
  4. Build the Project:

    cmake --build .

The compiled library will be placed in the lib/<os>/<arch> directory.

Platform Support

Platform    Architectures
Linux       amd64, arm64
macOS       amd64, arm64
FreeBSD     amd64, arm64
Windows     x64, x86, ARM64

🤝 Contributing

Contributions are welcome! If you have ideas for improvements or have found a bug, please open an issue or submit a pull request.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.