Heterogeneous Accelerator Toolkit (HAT)

HAT is a toolkit that lets developers express data-parallel applications in Java, then optimize, offload, and execute them on hardware accelerators.

Heterogeneous: a variety of devices and their corresponding programming languages.
Accelerator: GPUs, FPGAs, CPUs, etc.
Toolkit: a set of libraries for Java developers.

HAT uses the code reflection API from Project Babylon.

The toolkit offers:

An API for kernel programming on accelerators from Java.
An API for combining multiple kernels into a compute-graph.
An API for mapping Java objects to hardware accelerators using Panama FFM.
An extensible backend system for multiple accelerators:
- OpenCL
- CUDA
- Java
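As background for the object-mapping API listed above, here is a minimal sketch of storing an int array off-heap with the Panama FFM API. It uses only the standard `java.lang.foreign` classes (finalized in JDK 22) and is not HAT's implementation: the class name `OffHeapSketch` and the method `fillAndSum` are hypothetical names chosen for illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class OffHeapSketch {

    // Allocates `size` ints off-heap, fills them with 0..size-1,
    // reads them back, and returns their sum.
    // Hypothetical helper: HAT's own buffer types are not used here.
    static long fillAndSum(int size) {
        // A confined arena frees the native memory when the try block exits.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment ints = arena.allocate(ValueLayout.JAVA_INT, size);
            for (int i = 0; i < size; i++) {
                ints.setAtIndex(ValueLayout.JAVA_INT, i, i);
            }
            long sum = 0;
            for (int i = 0; i < size; i++) {
                sum += ints.getAtIndex(ValueLayout.JAVA_INT, i);
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        System.out.println("sum = " + fillAndSum(4096));
    }
}
```

Because the data lives in a native memory segment rather than on the Java heap, a native backend can in principle read it through a raw address without copying it out of a Java array first.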

Prerequisites

HAT currently requires Babylon JDK, which contains the code reflection APIs.
A base JDK >= 25. We currently use OpenJDK 26 for development.
A GPU SDK (one or more of the SDKs below) to be able to run on GPUs:
- An OpenCL implementation (e.g., Intel, Apple Silicon, CUDA SDK)
  - OpenCL >= 1.2
- CUDA SDK >= 12.9
cmake >= 3.22.1
gcc >= 12.0, or clang >= 17.0

Compatible systems

We actively develop and run tests on the following systems:

Apple Silicon M1-M4
Linux Fedora >= 42
Oracle Linux 10
Ubuntu >= 22.04

Quick Start

1. Build Babylon JDK

git clone https://github.com/openjdk/babylon
cd babylon
bash configure --with-boot-jdk=${JAVA_HOME}
make clean
make images

2. Update JAVA_HOME and PATH

export JAVA_HOME=<BABYLON-DIR>/build/macosx-aarch64-server-release/images/jdk
export PATH=$JAVA_HOME/bin:$PATH
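One way to confirm that `java` on the PATH now resolves to the Babylon build is to check whether the runtime image contains the code reflection module. The sketch below assumes the module name `jdk.incubator.code`, taken from the run command later in this README; the class name `CheckBabylonJdk` is hypothetical.

```java
import java.lang.module.ModuleFinder;

public class CheckBabylonJdk {

    // Module name as used with --add-modules in this README.
    static final String CODE_REFLECTION_MODULE = "jdk.incubator.code";

    // Returns true if the running JDK image ships a module with this name.
    // ModuleFinder.ofSystem() also sees incubator modules that are not
    // resolved by default.
    static boolean hasSystemModule(String name) {
        return ModuleFinder.ofSystem().find(name).isPresent();
    }

    public static void main(String[] args) {
        System.out.println("java.home = " + System.getProperty("java.home"));
        System.out.println(CODE_REFLECTION_MODULE + " present: "
                + hasSystemModule(CODE_REFLECTION_MODULE));
    }
}
```

Save it as `CheckBabylonJdk.java` and run `java CheckBabylonJdk.java`. If the module is reported as absent, `JAVA_HOME` most likely still points at the boot JDK.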

3. Build HAT

sdk install jextract #if needed
cd hat
java @.bld

Done!

Run Examples

For instance, matrix-multiply:

java @.run ffi-opencl matmul --size=1024

Some examples have a GUI implementation:

java @.run ffi-opencl mandel

Full list of examples:

link

Run Unit-Tests

OpenCL backend:

java @.test-suite ffi-opencl

CUDA backend:

java @.test-suite ffi-cuda

Full Example Explained

The following example computes the square of each element of an input vector. The example is self-contained and can be run directly with the java command.

Place the following code in the hat directory.

import hat.*;
import hat.Accelerator.Compute;
import hat.backend.*;
import hat.buffer.*;
import optkl.ifacemapper.MappableIface.*;
import jdk.incubator.code.Reflect;
import java.lang.invoke.MethodHandles;

public class ExampleHAT {

    // Kernel Code: This is the function to be offloaded to the accelerator (e.g.,
    // a GPU). The kernel will be executed by many GPU threads, in this case,
    // as many threads as elements in `array`.
    // The `kc` object can be used to obtain the thread identifier and map
    // the data element to process.
    // HAT kernels follow the SIMT (Single Instruction, Multiple Thread)
    // programming model.
    // Kernel code is reflectable. Thus, the HAT runtime and HAT compiler can build
    // and optimize the code model. Once the code model is optimized, HAT generates
    // OpenCL/CUDA C99 code.
    @Reflect
    public static void squareKernel(@RO KernelContext kc, @RW S32Array array) {
        // HAT kernels support a reduced set of Java.
        // Kernels express the work to be done per thread (GPU/accelerator thread).
        if (kc.gix < array.length()) {
            int value = array.array(kc.gix);
            array.array(kc.gix, (value * value));
        }
    }

    // The following method represents the compute layer, in which we specify
    // the number of threads to be deployed on the accelerator. The number of threads
    // is specified in an ND-Range. An ND-Range can be 1D, 2D, or 3D.
    // In this example, we launch 1D-range with the number of threads equal to
    // the input array size.
    @Reflect
    public static void square(@RO ComputeContext cc, @RW S32Array array) {
        var ndRange = NDRange.of1D(array.length());

        // Dispatch the kernel. The HAT runtime will offload the kernels
        // reached from this point and run the generated GPU kernels on the
        // target accelerator.
        // Furthermore, HAT automatically transfers data to the accelerator.
        // This is a blocking call, and when it returns control to the main
        // Java thread, results (outputs) are available to be consumed.
        cc.dispatchKernel(ndRange, kc -> squareKernel(kc, array));
    }

    static void main(String[] args) {
        final int size = 4096;

        // Create a new accelerator object
        var accelerator = new Accelerator(MethodHandles.lookup(), Backend.FIRST);

        // Instantiate an array on the target accelerator.
        // Data is stored off-heap using the Panama FFM API.
        var array = S32Array.create(accelerator, size);

        // Data initialization
        for (int i = 0; i < array.length(); i++) {
            array.array(i, i);
        }

        // Offload and dispatch of the compute-graph on the target accelerator.
        // This is a blocking call. Once it completes, the results (outputs)
        // are available to the current Java thread.
        accelerator.compute((@Reflect Compute) cc -> ExampleHAT.square(cc, array));

        // Test result
        boolean isCorrect = true;
        for (int i = 0; i < size; i++) {
            if (array.array(i) != i * i) {
                isCorrect = false;
            }
        }
        if (isCorrect) {
            IO.println("Result is correct");
        } else {
            IO.println("Result is NOT correct");
        }
    }
}
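To make the per-thread model concrete, the following plain-Java sketch emulates what a 1D dispatch does conceptually: the same kernel body runs once for every global index. It does not use HAT or any accelerator. The class `SimtSketch` and the method `dispatch1D` are hypothetical stand-ins, and a parallel stream plays the role of the GPU threads.

```java
import java.util.stream.IntStream;

public class SimtSketch {

    // Per-thread body with the same shape as squareKernel above,
    // except that the global index is passed in explicitly.
    static void squareKernel(int gix, int[] array) {
        if (gix < array.length) {
            int value = array[gix];
            array[gix] = value * value;
        }
    }

    // Hypothetical stand-in for a 1D dispatch: invoke the body once per
    // global index. Each index writes a distinct element, so the parallel
    // execution has no data races.
    static void dispatch1D(int globalSize, int[] array) {
        IntStream.range(0, globalSize)
                 .parallel()
                 .forEach(gix -> squareKernel(gix, array));
    }

    public static void main(String[] args) {
        int[] array = IntStream.range(0, 4096).toArray();
        dispatch1D(array.length, array);
        System.out.println("array[7] = " + array[7]);
    }
}
```

The bounds check in the kernel body matters as soon as more threads are launched than there are elements: in this sketch, calling `dispatch1D` with a global size larger than the array length is harmless because the extra indices do nothing.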

Run this example from the babylon/hat directory. If you run it from another directory, update the --class-path parameter accordingly. Use the java binary from the Babylon JDK build.

java --enable-preview \
   --add-modules=jdk.incubator.code \
   --enable-native-access=ALL-UNNAMED \
   --class-path build/hat-optkl-1.0.jar:build/hat-core-1.0.jar:build/hat-backend-ffi-shared-1.0.jar:build/hat-backend-ffi-opencl-1.0.jar \
   -Djava.library.path=<BABYLON-DIR>/hat/build \
   ExampleHAT.java

If you run with HAT=INFO you can see which accelerator was used:

$ HAT=INFO java --enable-preview ... ExampleHAT.java

[INFO] Config Bits = 8000
[INFO] Platform :"Apple"
[INFO]   Version      :"OpenCL 1.2 (Jan 16 2026 07:22:26)"
[INFO]   Name         :"Apple"
[INFO]   Device Type  : GPU  4
[INFO] OpenCLBackend::OpenCLQueue::dispatch
[INFO] numDimensions: 1
[INFO] GLOBAL [4096,1,1]
[INFO] LOCAL  [ nullptr ] // The driver will setup a default value

Result is correct
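The log above reports a global range of 4096 and leaves the local (work-group) size to the driver. In OpenCL 1.x, when an explicit local size is given, the global size must be a multiple of it, so launchers commonly round the global size up and rely on a bounds check in the kernel. The sketch below illustrates that arithmetic only. Whether HAT rounds the range this way is not stated in this README, and the work-group size of 256 and all names are hypothetical.

```java
public class RangeSketch {

    // Smallest multiple of localSize that is >= n (localSize must be positive).
    static int roundUpToMultiple(int n, int localSize) {
        return ((n + localSize - 1) / localSize) * localSize;
    }

    // Number of launched threads whose index falls outside the data and
    // that a guard such as `if (kc.gix < array.length())` must filter out.
    static int paddingThreads(int n, int localSize) {
        return roundUpToMultiple(n, localSize) - n;
    }

    public static void main(String[] args) {
        int localSize = 256; // hypothetical work-group size
        for (int n : new int[] {4096, 5000}) {
            System.out.println(n + " elements: global size "
                    + roundUpToMultiple(n, localSize)
                    + ", padding threads " + paddingThreads(n, localSize));
        }
    }
}
```

With 4096 elements and a local size of 256 no padding is needed, which matches the global range shown in the log. An input size that is not a multiple of the local size would leave some threads with nothing to do.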

Documentation

Visit the docs folder.

Contributing

Contributions are welcome. Please see the OpenJDK Developers' Guide.

Development Workflow

Fork the repository
Create a feature branch: git checkout -b <branch>
Commit with clear messages
Run formatting and tests:
1. For OpenCL: java @.test-suite ffi-opencl
2. For CUDA: java @.test-suite ffi-cuda
Submit a pull request

Contacts/Questions

You can interact, provide feedback and ask questions using the babylon-dev mailing list.
