Releases: withcatai/node-llama-cpp
v3.18.1
3.18.1 (2026-03-17)
Features
- customize `postinstall` behavior (#582) (57bea3d) (documentation: Customizing `postinstall` Behavior)
- experimental support for context KV cache type configurations (#582) (57bea3d) (documentation: `LlamaContextOptions["experimentalKvCacheKeyType"]`)
- support NVFP4 quants (#582) (57bea3d)
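A quantized KV cache type mainly trades precision for memory. As a rough illustration of why that matters (generic llama.cpp cache arithmetic with a hypothetical model shape, not node-llama-cpp internals), compare an f16 cache (2 bytes per element) against q8_0, which packs 32 elements into 34 bytes:

```typescript
// Back-of-the-envelope KV cache sizing (generic llama.cpp math, not
// node-llama-cpp's internals).
function kvCacheBytes(
    layers: number,
    contextLength: number,
    kvEmbeddingDim: number, // per-layer K (or V) width, in elements
    bytesPerElement: number
): number {
    // K and V caches, one pair per layer, one row per context position
    return 2 * layers * contextLength * kvEmbeddingDim * bytesPerElement;
}

// Hypothetical 8B-class model shape: 32 layers, 8 KV heads of 128 dims each
const f16Bytes = kvCacheBytes(32, 8192, 1024, 2);
const q8Bytes = kvCacheBytes(32, 8192, 1024, 34 / 32);
console.log((f16Bytes / 2 ** 30).toFixed(2), "GiB f16");  // 1.00 GiB
console.log((q8Bytes / 2 ** 30).toFixed(2), "GiB q8_0"); // 0.53 GiB
```

Whether the precision loss is acceptable is model- and workload-dependent, which is presumably why the option is marked experimental.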
Shipped with llama.cpp release b8390
To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)
v3.18.0
3.18.0 (2026-03-15)
Features
- automatic checkpoints for models that need them (#573) (c641959)
- `QwenChatWrapper`: Qwen 3.5 support (#573) (c641959)
- `inspect gpu` command: detect and report missing prebuilt binary modules and custom npm registry (#573) (c641959)
Bug Fixes
- `resolveModelFile`: deduplicate concurrent downloads (#570) (cc105b9)
- correct Vulkan URL casing in documentation links (#568) (5a44506)
- Qwen 3.5 memory estimation (#573) (c641959)
- grammar use with `HarmonyChatWrapper` (#573) (c641959)
- add Mistral think segment detection (#573) (c641959)
- compress excessively long segments from the current response on context shift instead of throwing an error (#573) (c641959)
- default thinking budget to 75% of the context size to prevent low-quality responses (#573) (c641959)
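The thinking-budget default is simple arithmetic; a sketch of it (only the 75% figure comes from the note above — the helper name is hypothetical, not the library's API):

```typescript
// Sketch of the default-thinking-budget arithmetic described above.
// Capping "thinking" tokens at 75% of the context size leaves room for
// the final answer, preventing low-quality truncated responses.
function defaultThinkingBudget(contextSize: number): number {
    return Math.floor(contextSize * 0.75);
}

console.log(defaultThinkingBudget(8192)); // 6144
```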
Shipped with llama.cpp release b8352
To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)
v3.17.1
v3.17.0
3.17.0 (2026-02-27)
Features
- `getLlama`: `build: "autoAttempt"` (#564) (dda5ade) (documentation: `LlamaOptions["build"]`)
- remove octokit dependency (#564) (dda5ade)
Bug Fixes
- CLI: disable Direct I/O by default (#564) (dda5ade)
- Bun segmentation fault on process exit with undisposed `Llama` instance (#564) (dda5ade)
- detect glibc inside Nix (#564) (dda5ade)
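One plausible reading of a `build: "autoAttempt"` mode is a try-build-then-fall-back strategy. The sketch below shows only that generic pattern — the function names and semantics are assumptions for illustration, not the library's actual behavior (see the linked `LlamaOptions["build"]` documentation for that):

```typescript
// Generic "attempt a local build, fall back on failure" pattern
// (hypothetical names; not node-llama-cpp's implementation).
function loadBinaryAutoAttempt(
    tryBuild: () => string,
    loadPrebuilt: () => string
): string {
    try {
        return tryBuild(); // prefer a binary built for this machine
    } catch {
        return loadPrebuilt(); // fall back if the local build fails
    }
}
```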
Shipped with llama.cpp release b8169
To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)
v3.16.2
v3.16.1
v3.16.0
3.16.0 (2026-02-19)
Features
- Exclude Top Choices (XTC) (#553) (57e8c22) (documentation: `LLamaChatPromptOptions["xtc"]`)
- DRY (Don't Repeat Yourself) repeat penalty (#553) (57e8c22) (documentation: `LLamaChatPromptOptions["dryRepeatPenalty"]`)
- Tiny Aya support (#553) (57e8c22)
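For readers unfamiliar with XTC: when it triggers, it excludes the most probable candidate tokens (those above a probability threshold) and keeps only the least likely of them, pushing output away from the most predictable continuation. A minimal sketch of that filtering idea (pure illustration, not the library's sampler; the actual option shape is in the linked documentation):

```typescript
// Toy XTC-style filter: given token probabilities, return the indices
// that remain sampleable. Tokens at or above `threshold` are excluded,
// except the least likely of them, which is kept.
function xtcFilter(probs: number[], threshold: number): number[] {
    const above = probs
        .map((p, i) => ({p, i}))
        .filter(({p}) => p >= threshold);

    if (above.length < 2)
        return probs.map((_, i) => i); // nothing worth excluding

    const keepLeastLikely = above.reduce((a, b) => (b.p < a.p ? b : a)).i;
    return probs
        .map((_, i) => i)
        .filter((i) => probs[i] < threshold || i === keepLeastLikely);
}

console.log(xtcFilter([0.5, 0.3, 0.15, 0.05], 0.1)); // [ 2, 3 ]
```

In the real sampler this filter only applies with a configured probability per token, so ordinary sampling still happens most of the time.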
Bug Fixes
- adjust the default VRAM padding config to reserve enough memory for compute buffers (#553) (57e8c22)
- support function call syntax with optional whitespace prefix (#553) (57e8c22)
- change the default value of `useDirectIo` to `false` (#553) (57e8c22)
- Vulkan device dedupe (#553) (57e8c22)
Shipped with llama.cpp release b8095
To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)
v3.15.1
v3.15.0
3.15.0 (2026-01-10)
Features
- `LlamaCompletion`: `stopOnAbortSignal` (#538) (734693d) (documentation: `LlamaCompletionGenerationOptions["stopOnAbortSignal"]`)
- `LlamaModel`: `useDirectIo` (#538) (734693d) (documentation: `LlamaModelOptions["useDirectIo"]`)
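The point of a `stopOnAbortSignal`-style option is that aborting can end generation gracefully, returning the partial response, instead of rejecting the call. A toy illustration of those two behaviors (not the library's code — the abort is modeled as a token index for testability):

```typescript
// Toy generator contrasting graceful-stop vs. reject-on-abort semantics.
function generateDemo(
    tokens: string[],
    abortAfter: number,
    stopOnAbortSignal: boolean
): string {
    let out = "";
    for (let i = 0; i < tokens.length; i++) {
        if (i >= abortAfter) {
            if (stopOnAbortSignal)
                return out; // graceful stop: keep the partial response
            throw new Error("AbortError"); // default: the call rejects
        }
        out += tokens[i];
    }
    return out;
}

console.log(generateDemo(["Hi", " ", "there"], 2, true)); // "Hi "
```

In the real API the trigger is an `AbortSignal` passed to the generation call rather than a token index.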
Bug Fixes
- support new CUDA 13.1 archs (#538) (734693d)
- build the prebuilt binaries with CUDA 13.1 instead of 13.0 (#538) (734693d)
Shipped with llama.cpp release b7698
To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)