Releases: ggml-org/llama.cpp

b8839

18 Apr 20:21
4f02d47

b8838

18 Apr 10:40
23b8cc4

b8837

18 Apr 08:56
59accc8

b8836

18 Apr 08:59
83d58e0

b8833

17 Apr 18:35
45cac7c

ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (#21052)

  • Update workflows to remove dependence on llvmpipe

  • Try setting Dawn_DIR

  • Remove C++20 initializers

  • Move to proper GUID

  • Try avoiding segfaults on Vulkan backend process exit

  • Remove compiler warnings on parameter casting

  • Fix soft_max and update reg_tile accumulation to f32 for better precision

  • Refactor flash_attn a bit

  • Remove C++20 initializers and format

  • Increase div precision for NVIDIA

  • Revert div precision and comment out ggml-ci node for now

  • Formatting

  • Try debugging on a failing CI node

  • Revert "Try debugging on a failing CI node"

This reverts commit 1971e33.
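The "reg_tile accumulation to f32" change above reflects a general numerical issue: summing many small terms in half precision stalls once the running total's ULP exceeds the addend. A minimal illustration (this is not the actual WebGPU shader code; it only sketches the precision argument):

```python
import numpy as np

# Accumulate 10,000 small terms two ways: in an f16 accumulator (the
# problematic pattern) and in an f32 accumulator (the fix).
terms = [np.float16(0.001)] * 10_000

acc16 = np.float16(0.0)
acc32 = np.float32(0.0)
for t in terms:
    acc16 = np.float16(acc16 + t)  # f16: additions are eventually lost to rounding
    acc32 += np.float32(t)         # f32: the full sum is preserved

# acc32 lands near 10.0, while acc16 stalls well short of it once the
# f16 spacing around the running total exceeds the 0.001 addend.
print(float(acc16), float(acc32))
```

Keeping per-tile accumulators in f32 while inputs stay f16 trades a few registers for exactly this kind of accuracy.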

Downloads: macOS/iOS, Linux, Android, Windows, openEuler

b8832

17 Apr 18:19
b94050e

b8831

17 Apr 13:25
a279d0f

b8830

17 Apr 12:55
268d61e

b8829

17 Apr 12:25
6990e2f

libs : rename libcommon -> libllama-common (#21936)

  • cmake : allow libcommon to be shared

  • cmake : rename libcommon to libllama-common

  • cont : set -fPIC for httplib

  • cont : export all symbols

  • cont : fix build_info exports

  • libs : add libllama-common-base

  • log : add common_log_get_verbosity_thold()
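The build-system changes above can be sketched as a CMake fragment. This is illustrative only, not the actual llama.cpp CMake code; the `COMMON_SOURCES` variable is a placeholder, and only the target name, the position-independent-code requirement, and the symbol-export behavior come from the notes:

```cmake
# Former libcommon, built under its new name. When built as a shared
# library, its objects (including the bundled httplib) must be
# position-independent, i.e. compiled with -fPIC.
add_library(llama-common ${COMMON_SOURCES})
set_target_properties(llama-common PROPERTIES
    POSITION_INDEPENDENT_CODE ON
    OUTPUT_NAME "llama-common")

# Export all symbols on Windows rather than annotating each one.
set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
```

`POSITION_INDEPENDENT_CODE` is CMake's portable way to request `-fPIC`, which matters as soon as a static convenience library is linked into a shared one.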

Downloads: macOS/iOS, Linux, Windows, openEuler

b8828

17 Apr 11:53
fcc7508
