GoMLX is an easy-to-use set of Machine Learning and generic math libraries and tools. It can be seen as a PyTorch/Jax/TensorFlow for Go.
It can be used to train, fine-tune, modify, and combine machine learning models. It provides all the tools to make that work easy: from a complete set of differentiable operators, all the way to UI tools to plot metrics while training in a notebook.
It runs almost everywhere Go runs, using a pure Go backend. It even runs in the browser with WASM (see the demo created with GoMLX), and it will likely work on embedded devices as well (see Tamago).
It also supports a highly optimized backend engine based on OpenXLA, which uses just-in-time compilation to CPUs, GPUs (Nvidia, and likely AMD ROCm, Intel, Macs), and Google's TPUs. It also supports modern distributed execution (new, still being actively improved) for multi-TPU or multi-GPU setups using XLA Shardy, an evolution of GSPMD distribution.
It's the same engine that powers Google's Jax, TensorFlow, and PyTorch/XLA, and in many cases it offers the same speed. Use this backend to train large models or with large datasets.
Tip
- See our tutorial
- See Eli Bendersky's blog post "GoMLX: ML in Go without Python" (a bit outdated, but still useful)
- A guided example for Kaggle Dogs Vs Cats.
- A simple GoMLX slide deck with small sample code.
It was developed to be a full-featured ML platform for Go, ready for production and easy to experiment with ML ideas -- see Long-Term Goals below.
It strives to be simple to read and reason about, leading the user to a correct and transparent mental model of what is going on (no surprises), aligned with the Go philosophy -- at the cost of being more verbose at times.
It is also incredibly flexible and easy to extend and try non-conventional ideas: use it to experiment with new optimizer ideas, complex regularizers, funky multitasking, etc.
Documentation is kept up to date (if it is not well documented, it is as if the code were not there), and error messages are informative (always with a stack trace) and try to make issues easy to solve.
GoMLX is a full-featured ML framework, supporting various well-known ML components
from the bottom to the top of the stack. But it is still only a slice of what a major ML library/framework
like TensorFlow, Jax, or PyTorch provides.
Examples developed using GoMLX:
- Adult/Census model;
- How do KANs learn?;
- Cifar-10 demo;
- MNIST demo (library and command-line only)
- Dogs & Cats classifier demo;
- IMDB Movie Review demo;
- Diffusion model for Oxford Flowers 102 dataset (generates random flowers);
- Flow Matching Study Notebook based on Meta's "Flow Matching Guide and Code".
- GoMLX/Gemma, a GoMLX implementation of Google DeepMind's Gemma v2 model (blog post)
- GNN model for OGBN-MAG (experimental).
- Finally, a trivial synthetic linear model, for those curious to see a barebones model.
- Neural Style Transfer 10-year celebration: a demo of the original paper, written using GoMLX.
- Triplet Losses: various negative sampling strategies as well as various distance metrics.
- AlphaZero AI for the game of Hive: it uses a trivial GNN to evaluate positions on the board. It includes a WASM demo (runs GoMLX in the browser!) and a command-line UI to test your skills!
Highlights:
- 🎉 NEW 🎉: Auto-installation of XLA PJRT plugins (for CPU, GPU, and TPUs; Linux and Macs) in the user's local lib directory (`$HOME/.local/lib` in Linux and `$HOME/Library/Application Support/XLA` in Mac). It can be disabled by setting `GOMLX_NO_AUTO_INSTALL` or programmatically by calling `xla.EnableAutoInstall(false)`.
- 🎉 NEW 🎉: Distributed execution (across multiple GPUs or TPUs) with only minimal hints from the user: one only needs to configure a distributed dataset, and the trainer picks up from there. See the code change in the UCI-Adult demo.
- 🎉 NEW 🎉: Fixed Mac support for the XLA backend, including the installer. CPU only for now; see jax/issues/32800 for the request to Apple developers to update their support for GPU XLA.
- Converting ONNX models to GoMLX with onnx-gomlx: both as an alternative to `onnxruntime` (leveraging XLA), and to further fine-tune models. See also go-huggingface to easily download ONNX model files from HuggingFace.
- Docker "gomlx_jupyterlab" with integrated JupyterLab and GoNB (a Go kernel for Jupyter notebooks).
- Two backends:
  - `xla`: OpenXLA backend for CPUs, GPUs, and TPUs. State-of-the-art as these things go. Only linux/amd64 for now. It uses the go-xla Go version of the APIs.
  - `go`: a pure Go backend (no C/C++ dependencies), slower but very portable (compiles to WASM, Windows, etc.). SIMD support is planned when it becomes available. See also GoMLX compiled to WASM to power the AI for a game of Hive.
- Autodiff: automatic differentiation -- only gradients for now, no Jacobian.
- Context: automatic variable management for ML models.
- ML layers library with some of the most popular machine learning "layers": FFN layers, various activation functions, layer and batch normalization, convolutions, pooling, dropout, Multi-Head-Attention (for transformer layers), LSTM, KAN (B-Splines, GR-KAN/KAT networks, Discrete-KAN, PiecewiseLinear KAN), PiecewiseLinear (for calibration and normalization), various regularizations, FFT (reverse/differentiable), learnable rational functions (both for activations and GR-KAN/KAT networks), VNN (Vector Neural Networks) for SO(3)-equivariant/invariant layers, etc. (see the model sketch after this list).
- Training library, with some pretty-printing, including plots for Jupyter notebooks using GoNB, a Go kernel.
- Also, various debugging tools: collecting values for particular nodes for plotting, simply logging the value of nodes during training, stack-trace of the code where nodes are created.
- `gomlx_checkpoints`, the command-line tool to inspect checkpoints of trained (or training) models and generate plots of the loss and arbitrary evaluation metrics using Plotly. See an example of a training session, with the effects of a learning-rate change during training. It also allows plotting different models together, to compare their evolution.
- SGD and Adam (AdamW and Adamax) optimizers.
- Various losses and metrics.
- Pre-trained models to use: InceptionV3 (image model), and many more from HuggingFace using onnx-gomlx. See also go-huggingface to easily download ONNX model files from HuggingFace.
- Read Numpy arrays into GoMLX tensors -- see package `github.com/gomlx/gomlx/pkg/core/tensors/numpy`.
- (Experimental) Support for static linking of PJRT: slower to build the Go program, but deploying it doesn't require installing a PJRT plugin on the machine you are deploying to. It requires you to compile your own static PJRT plugin from XLA sources.
  Use `go build --tags=pjrt_cpu_static` or include `import _ "github.com/gomlx/gomlx/backends/xla/cpu/static"`.
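To give a feel for how the layers library composes (the sketch referenced above), below is a minimal model-graph function in the style used by GoMLX's training library. The import paths and the exact `layers.Dense` signature are recalled from the library's examples and may differ between versions, so treat this as an illustrative sketch rather than a definitive reference:

```go
package model

import (
	"github.com/gomlx/gomlx/graph"
	"github.com/gomlx/gomlx/ml/context"
	"github.com/gomlx/gomlx/ml/layers"
)

// modelGraph builds a tiny MLP. The signature follows the convention used by
// the training library: (context, dataset spec, inputs) -> outputs.
// NOTE: layers.Dense's argument order (useBias, then output dimensions) is an
// assumption based on the library's examples; check your version's godoc.
func modelGraph(ctx *context.Context, spec any, inputs []*graph.Node) []*graph.Node {
	_ = spec // dataset spec, unused in this sketch
	x := inputs[0]
	x = layers.Dense(ctx.In("hidden"), x, true, 32) // 32 hidden units, with bias
	x = graph.Tanh(x)
	logits := layers.Dense(ctx.In("output"), x, true, 1) // single logit output
	return []*graph.Node{logits}
}
```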
Support
- Discussion in the Slack channel #gomlx (you can join the Slack server here).
- Q&A and discussions
- Issues
- Random brainstorming on projects: just start a Q&A, and I'm happy to meet on Discord or a video call.
- Google Groups: groups.google.com/g/gomlx-discuss
For most users, no installation is needed.
For XLA, by default it will auto-install the required XLA PJRT plugins (for CPU, GPU, and TPUs; Linux and Macs)
in the user's local lib directory (`$HOME/.local/lib/go-xla` in Linux and `$HOME/Library/Application Support/go-xla` in Mac).
It can be disabled by setting `GOMLX_NO_AUTO_INSTALL` or programmatically by calling `xla.EnableAutoInstall(false)`.
If you want to manually pre-install -- for building production dockers, pinning a specific version, or other custom setups -- see github.com/gomlx/go-xla for details; it includes a simple, self-explanatory installer program.
If you want to use only the pure Go backend, simply `import _ "github.com/gomlx/gomlx/backends/simplego"` and
there is no need to install anything.
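As a quick sanity check, here is a minimal sketch of a program using the pure Go backend. It assumes the `graph.NewExec`/`Exec.Call` API shown in the library's tutorial; exact signatures may vary by version:

```go
package main

import (
	"fmt"

	"github.com/gomlx/gomlx/backends"
	_ "github.com/gomlx/gomlx/backends/simplego" // registers the pure Go backend; nothing to install
	"github.com/gomlx/gomlx/graph"
)

func main() {
	backend := backends.New() // honors GOMLX_BACKEND if set, otherwise picks a registered backend

	// Build and execute a tiny computation graph: f(x) = x*x + x.
	exec := graph.NewExec(backend, func(x *graph.Node) *graph.Node {
		return graph.Add(graph.Mul(x, x), x)
	})
	fmt.Println(exec.Call(float32(3))[0]) // expected: a tensor holding 12
}
```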
🐳 Pre-built Docker
The easiest way to start playing with it is to pull the docker image, which includes GoMLX + JupyterLab + GoNB (a Go kernel for Jupyter) and Nvidia's CUDA runtime (for optional GPU support) pre-installed -- it is ~5Gb to download.
From a directory you want to make visible in Jupyter, do:
For GPU support, add the flag `--gpus all` to the `docker run` command below.
```bash
docker pull janpfeifer/gomlx_jupyterlab:latest
docker run -it --rm -p 8888:8888 -v "${PWD}":/home/jupyter/work janpfeifer/gomlx_jupyterlab:latest
```

It will display a URL starting with 127.0.0.1:8888 in the terminal (it will include the secret token needed) that you can open in your browser.
You can open and interact with the tutorial from there; it is included in the docker under the directory `Projects/gomlx/examples/tutorial`.
More details on the docker here.
It runs on Windows as well: Docker Desktop uses WSL2 under the hood.
See the tutorial here. It covers a bit of everything.
After that, look at the demos in the examples/ directory.
The library itself is well-documented (please open issues if something is missing), and the code is not too hard to read. Godoc is available at pkg.go.dev.
Finally, feel free to ask questions: time allowing (when not at work), I'm always happy to help. I'm often connected to the Slack channel #gomlx; alternatively, use groups.google.com/g/gomlx-discuss.
Inference or serving a model is currently done by using the same Go code used to create the model, along with the checkpoint containing the trained weights and the hyperparameters used to train the model. In other words, it uses the same tools used for training.
It's straightforward, for instance, to create a Docker image with a pretrained model and serve it from there, or to include it in your own application.
For a simple example of how to do this and export a model inference as a library, see
.../examples/cifar/classifer,
and its use in the last cells of the Cifar-10 demo.
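As a rough sketch of this pattern, the following shows loading a checkpoint and running inference. The checkpoint directory name is hypothetical, and the `checkpoints` builder API, `context.NewExec`, and import paths are recalled from the library's examples and may differ between versions:

```go
package main

import (
	"fmt"

	"github.com/gomlx/gomlx/backends"
	_ "github.com/gomlx/gomlx/backends/default"
	"github.com/gomlx/gomlx/graph"
	"github.com/gomlx/gomlx/ml/context"
	"github.com/gomlx/gomlx/ml/context/checkpoints" // path may differ across versions
	"github.com/gomlx/gomlx/ml/layers"
)

func main() {
	backend := backends.New()
	ctx := context.New()

	// Load the trained weights and hyperparameters saved during training.
	// "my_checkpoint_dir" is a hypothetical path.
	if _, err := checkpoints.Build(ctx).Dir("my_checkpoint_dir").Done(); err != nil {
		panic(err)
	}

	// Reuse the same graph-building code used during training, so the loaded
	// variables are picked up by scope name; a single Dense layer stands in here.
	exec := context.NewExec(backend, ctx, func(ctx *context.Context, x *graph.Node) *graph.Node {
		return layers.Dense(ctx.In("output"), x, true, 1)
	})
	results := exec.Call([][]float32{{0.1, 0.2, 0.3}}) // Go values are converted to tensors
	fmt.Println(results[0])
}
```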
In the future, we plan to also export models to ONNX or XLA's StableHLO, so one could use tools that serve those formats directly, without linking GoMLX -- it will save a little executable size.
Long-Term Goals
- Building and training models in Go -- as opposed to Python (or some other language) -- with focus on:
- Being easy(ier) to read and reason about, leading the user to a correct and transparent mental model of what is going on. Even if that means being more verbose when writing.
- Clean, separable APIs: individual APIs should be self-contained and decoupled where possible.
- Composability: Any component should be replaceable, so they can be customized and experimented with. That means sometimes more coding (there is not one magic train object that does everything), but it makes it clear what is happening, and to replace parts with custom versions.
- Up-to-date documentation: if the documentation is not there or if it's badly written, it's as if the code was not there either.
- Clear and actionable error reporting
- To be a productive research and educational platform to experiment with new ML ideas and learn.
- Support mirrored training on multiple devices and various forms of distributed training (model and/or data parallelism), in particular to support large language models and similarly large model training.
- To be a robust and reliable platform for production. Some subgoals:
- Support modern accelerator hardware like TPUs and GPUs.
- Multiple backends beyond XLA, e.g.: llamacpp, WebNN (with Wasm), pure Go version, etc.
- Import pre-trained models from Hugging Face Hub and allow fine-tuning -- ONNX versions already working for many models in onnx-gomlx.
- Compile models to binary as in C-libraries and/or WebAssembly, to be linked and consumed (inference) anywhere (any language).
- What are the environment variables used by GoMLX?
- `GOMLX_BACKEND`: defines the backend engine to use (if using `backends.New()`). The value is formatted as `<backend_name>[:<backend_config>]`, with the config part being optional. Examples:
  - `GOMLX_BACKEND=go`: use the SimpleGo backend, the pure Go implementation that is very portable but slow.
  - `GOMLX_BACKEND="xla:cpu"`: use XLA (the faster backend, only runs on Linux now) for CPU.
  - `GOMLX_BACKEND="xla:cuda"`: use XLA for Nvidia CUDA.
  - `GOMLX_BACKEND="xla:/path/to/my/pjrt_plugin.so"`: use XLA with an arbitrary PJRT. PJRT is a plugin system for XLA to support different hardware. One can install PJRTs built for NVIDIA GPUs (there is an installation script for that); there is also one for ROCm (not tested by the author), one for TPU (Google Cloud), and there are reports of PJRTs being built for even newer accelerators (e.g., TensTorrent XLA).
- `PJRT_PLUGIN_LIBRARY_PATH`: the underlying XLA backend uses this variable as an extra directory to search for plugin locations. It searches the system's library paths (`$LD_LIBRARY_PATH`, `/etc/ld.so.conf`), the default `/usr/local/lib/gomlx/pjrt`, and `$PJRT_PLUGIN_LIBRARY_PATH` if set.
- `GOMLX_NO_AUTO_INSTALL`: if set to `1`, GoMLX will not automatically install PJRTs when running on a system without them.
- `XLA_FLAGS`: optional controls for the XLA backend. It should be set to a semicolon (";") separated list of options. If you set it to `--help`, the backend will print out help for all options. There is also a description on the page XLA Flags Guidance.
- What backends to include when using GoMLX?
- The recommendation is to use `import _ "github.com/gomlx/gomlx/backends/default"`, which will import the `xla` (or the alias `stablehlo`) and `go` (SimpleGo) backends. If you add `-tags=noxla` to the compiler, it won't include the XLA backend.
- `import _ "github.com/gomlx/gomlx/backends/simplego"` to include only `go` (no C++ dependencies).
- `import _ "github.com/gomlx/gomlx/backends/xla"` to import only XLA.
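In code, these options look as follows; pick exactly one of the anonymous imports (the commented-out alternatives show the other two choices described above):

```go
package main

import (
	"fmt"

	"github.com/gomlx/gomlx/backends"

	// Pick one of the following:
	_ "github.com/gomlx/gomlx/backends/default" // xla + go; add -tags=noxla to drop XLA
	// _ "github.com/gomlx/gomlx/backends/simplego" // only the pure Go backend
	// _ "github.com/gomlx/gomlx/backends/xla"      // only the XLA backend
)

func main() {
	// backends.New() honors GOMLX_BACKEND (e.g. "go", "xla:cpu", "xla:cuda").
	fmt.Printf("backend: %v\n", backends.New())
}
```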
The project looks forward to contributions from anyone interested. Many parts are not yet set in stone, so there is plenty of space for improvements and re-designs for those interested, with good experience in Go, Machine Learning, and APIs in general. See the TODO file for inspiration.
No governance guidelines have been established yet.
See the section Support above to get in touch (Slack channel or Google Groups)!
If you find this project helpful, please consider supporting our work through GitHub Sponsors.
Your contribution helps us (currently mostly me) dedicate more time to maintenance and add new features for the entire GoMLX ecosystem.
It also helps us acquire access (buying or cloud) to hardware for more portability: e.g.: ROCm, Apple Metal (GPU), Multi-GPU/TPU, NVidia DGX Spark, Tenstorrent, etc.
Copyright 2025 Jan Pfeifer
GoMLX is distributed under the terms of the Apache License Version 2.0. Unless it is explicitly stated otherwise, any contribution intentionally submitted for inclusion in this project shall be licensed under Apache License Version 2.0 without any additional terms or conditions.
