Overview

Purpose and Scope

This document provides a high-level introduction to Apache HugeGraph, a distributed graph database system. It covers the system's purpose, architecture, key features, deployment modes, and major components. For detailed information about specific subsystems, refer to the following child pages:

Key Features: Details the main features, including OLTP support, multi-language query support (Gremlin/Cypher), and pluggable backends.
System Requirements: Specifies the technical requirements, including Java version, Maven version, and hardware recommendations.
Ecosystem Components: Gives an overview of the major components (Server, PD, Store, Commons, Struct) and their interactions.

Sources: README.md23-31 pom.xml28-31

What is Apache HugeGraph?

Apache HugeGraph is a fast, highly scalable graph database that supports more than 100 billion data entities (vertices and edges). It is an OLTP (Online Transaction Processing) graph engine compliant with the Apache TinkerPop 3 framework, so complex graph queries can be written in the Gremlin graph traversal language.

Key Characteristics:

Scalability: Supports 100+ billion vertices and edges with horizontal scaling capabilities README.md5-6
Multi-Language: Native Gremlin support (TinkerPop 3.5) and OpenCypher implementation README.md37-38
Pluggable Storage: Abstracted backend interface supporting RocksDB, HStore, and others README.md34
Distributed Architecture: Distributed deployment that uses Raft consensus for high availability hugegraph-store/README.md15-16
Apache Ecosystem: An Apache Software Foundation incubator project licensed under Apache License 2.0 pom.xml40-46

Sources: README.md25-28 pom.xml28-31 hugegraph-store/README.md8-20
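As a rough illustration of the Gremlin support listed above, the sketch below builds, but does not send, an HTTP request that carries a Gremlin script to a HugeGraph server. The `/gremlin` path, port 8080, the payload field names, and the `hugegraph` graph name in the script are assumptions based on the general shape of the server's REST interface; none of them are confirmed by this page, so check the REST API reference before relying on them. `build_gremlin_request` is a hypothetical helper name.

```python
import json
from urllib import request

# Assumed address of a local standalone server; adjust for your deployment.
DEFAULT_BASE_URL = "http://127.0.0.1:8080"


def build_gremlin_request(gremlin: str, base_url: str = DEFAULT_BASE_URL) -> request.Request:
    """Build (but do not send) a POST request carrying a Gremlin script.

    The endpoint path and payload keys are assumptions, not taken from this page.
    """
    payload = {
        "gremlin": gremlin,          # the traversal to execute
        "bindings": {},              # no script parameters in this sketch
        "language": "gremlin-groovy",
        "aliases": {},
    }
    return request.Request(
        url=f"{base_url}/gremlin",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "hugegraph" is an assumed graph name used only for illustration.
req = build_gremlin_request("hugegraph.traversal().V().limit(3)")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.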

System Architecture

Apache HugeGraph is designed with a layered architecture that bridges high-level query languages to low-level distributed storage.

High-Level Component Mapping

The following diagram maps the system components, as they are named in prose, to the corresponding code entities and identifiers in the codebase.

Diagram: Bridging Natural Language Space to Code Entity Space

Sources: README.md59-99 hugegraph-store/README.md59-73 hugegraph-store/docs/development-guide.md93-109

Maven Module Organization

The project is organized as a multi-module Maven project. The root pom.xml defines the primary modules that constitute the ecosystem.

| Module | Description | Code Location |
| --- | --- | --- |
| `hugegraph-server` | Core graph engine, REST API, and Gremlin/Cypher support | hugegraph-server/ |
| `hugegraph-pd` | Placement Driver for distributed metadata and partition management | hugegraph-pd/ |
| `hugegraph-store` | Distributed storage engine with Raft consensus | hugegraph-store/ |
| `hugegraph-commons` | Shared utilities, RPC, and common configurations | hugegraph-commons/ |
| `hugegraph-struct` | Common data structures (Schema, Vertex, Edge) | hugegraph-struct/ |

Sources: pom.xml101-109 README.md108-116
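To make the multi-module layout concrete, the following sketch lists the modules declared in a Maven POM. The XML fragment is an illustrative reconstruction from the module table, not a verbatim copy of the project's root pom.xml, and `list_modules` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import List

# Illustrative fragment only: it mirrors the module table, not the real root pom.xml.
POM_FRAGMENT = """\
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modules>
    <module>hugegraph-server</module>
    <module>hugegraph-pd</module>
    <module>hugegraph-store</module>
    <module>hugegraph-commons</module>
    <module>hugegraph-struct</module>
  </modules>
</project>
"""

# Maven POM elements live in this XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def list_modules(pom_text: str) -> List[str]:
    """Return the module names declared under <modules>, in document order."""
    root = ET.fromstring(pom_text)
    return [(m.text or "").strip() for m in root.findall("m:modules/m:module", POM_NS)]


print(list_modules(POM_FRAGMENT))
```

Running the same function against a real checkout's root pom.xml (read with `open(...).read()`) would show the actual module list, which may include entries not covered by the table.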

Deployment Modes

HugeGraph supports two primary deployment modes to cater to different scales and reliability requirements. For detailed guides, see Deployment.

Deployment Comparison

| Mode | Components | Use Case | Data Scale | High Availability |
| --- | --- | --- | --- | --- |
| Standalone | Server + RocksDB (embedded) | Development, Testing | < 1 TB | Basic |
| Distributed | Server + PD (3-5 nodes) + Store (3+ nodes) | Production, HA | < 1000 TB | Yes (Raft) |

Diagram: Deployment Mode Code Associations

Sources: README.md101-107 hugegraph-store/README.md28-40
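The node counts in the comparison follow from how Raft majorities work. The sketch below is generic Raft arithmetic plus a check against the node counts in the table; it is not HugeGraph code. `Topology` and `check_distributed` are hypothetical names, and the quorum functions assume every node is a voting member of a single Raft group, which may not match how a real cluster assigns replicas.

```python
from dataclasses import dataclass
from typing import List


def raft_quorum(nodes: int) -> int:
    """Smallest majority of a Raft group with `nodes` voting members."""
    if nodes < 1:
        raise ValueError("a Raft group needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return nodes - raft_quorum(nodes)


@dataclass(frozen=True)
class Topology:
    """Hypothetical description of a distributed deployment."""
    pd_nodes: int
    store_nodes: int


def check_distributed(topology: Topology) -> List[str]:
    """Compare a topology with the table's node counts; an empty list means OK."""
    problems = []
    if not 3 <= topology.pd_nodes <= 5:
        problems.append("PD: expected 3-5 nodes")
    if topology.store_nodes < 3:
        problems.append("Store: expected 3 or more nodes")
    return problems


for n in (3, 4, 5):
    # prints: 3 2 1, then 4 3 1, then 5 3 2
    print(n, raft_quorum(n), tolerated_failures(n))

print(check_distributed(Topology(pd_nodes=3, store_nodes=2)))
```

A group of four tolerates one failure, the same as a group of three, which is why odd sizes such as 3 or 5 are the usual choice.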

Ecosystem and Governance

Apache HugeGraph includes a comprehensive toolchain for data management and analysis:

hugegraph-toolchain: Includes Loader (import), Dashboard (Hubble visualization), and Client SDKs README.md43-47
hugegraph-computer: An integrated graph computing system for OLAP workloads README.md49
hugegraph-ai: Integration for LLM and Knowledge Graph workflows README.md51

The project is governed by the Apache Software Foundation, ensuring open development and license compliance pom.xml2-16. Automated CI/CD pipelines run license and code-quality checks README.md10-11.

Sources: README.md39-53 .asf.yaml1-26 .github/workflows/licence-checker.yml1-32

Next Steps

To explore the specific capabilities of the engine, visit Key Features.
To prepare your environment for installation, see System Requirements.
To understand how modules like hg-store-node and PartitionService interact, see Ecosystem Components.
