
Misc. bug: "--models-preset config.ini" doesn't accept the --no-mmap argument #18038

Description

Name and Version

llama-server --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 ROCm devices:
Device 0: AMD Instinct MI50/MI60, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
Device 1: AMD Instinct MI50/MI60, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
version: 7403 (5c8a717)
built with GNU 13.3.0 for Linux x86_64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server --models-preset config.ini 
(same problem with: llama-server --models-preset config.ini --no-mmap )

Problem description & steps to reproduce

Content of my config.ini:

[gpt-oss-120b]
model = /path/to/models/gpt-oss-120b/ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00001-of-00003.gguf
ctx-size = 32768
temp = 1.0
top-p = 1.0
top-k = 0
min-p = 0
gpu-layers = -1
split-mode = layer
tensor-split = 0.5,0.5
main-gpu = 0
numa = isolate
reasoning-format = none
flash-attn = on
jinja = 1
no-mmap = 1
chat-template-kwargs = {"reasoning_effort": "high"}

Problem, the "no-mmap = 1" isn't converted to --no-mmap
For boolean arguments, the documentation explain we can put on/off, 1/0 or true/false.
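
A minimal sketch of how those documented spellings could map to a boolean; this is an assumption for illustration only, not llama.cpp's actual preset parser, and the helper name is hypothetical:

#include <stdexcept>
#include <string>

// Sketch only: normalize the documented boolean spellings into a C++ bool.
static bool parse_preset_bool(const std::string & value) {
    if (value == "on"  || value == "1" || value == "true")  return true;
    if (value == "off" || value == "0" || value == "false") return false;
    throw std::invalid_argument("invalid boolean value: " + value);
}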

But this isn't respected, and a "--mmap" argument is added instead:

srv          load: spawning server instance with name=gpt-oss-120b on port 57733
srv          load: spawning server instance with args:
srv          load:   /path/to/llama-server
srv          load:   --chat-template-kwargs
srv          load:   {"reasoning_effort": "high"}
srv          load:   --host
srv          load:   127.0.0.1
srv          load:   --jinja
srv          load:   --min-p
srv          load:   0
srv          load:   --mmap
srv          load:   --numa
srv          load:   isolate
srv          load:   --port
srv          load:   57733
srv          load:   --reasoning-format
srv          load:   none
srv          load:   --temp
srv          load:   1.0
srv          load:   --top-k
srv          load:   0
srv          load:   --top-p
srv          load:   1.0
srv          load:   --alias
srv          load:   gpt-oss-120b
srv          load:   --ctx-size
srv          load:   32768
srv          load:   --flash-attn
srv          load:   on
srv          load:   --model
srv          load:   /path/to/models/gpt-oss-120b/ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00001-of-00003.gguf
srv          load:   --main-gpu
srv          load:   0
srv          load:   --n-gpu-layers
srv          load:   -1
srv          load:   --split-mode
srv          load:   layer
srv          load:   --tensor-split
srv          load:   0.5,0.5

I haven't found any way to pass this --no-mmap parameter from the config.ini.
My hardware doesn't support the --mmap behavior, so it's impossible to use config.ini at all right now.
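
For reference, this is the mapping I would expect for a negated flag such as no-mmap; again a minimal sketch with a hypothetical helper name, not the actual preset loader:

#include <string>
#include <vector>

// Sketch only: a truthy "no-mmap" entry should emit the flag "--no-mmap",
// and a falsy one should emit nothing at all, instead of the "--mmap"
// argument that appears in the log above.
static void append_preset_flag(std::vector<std::string> & args,
                               const std::string & key, bool enabled) {
    if (enabled) {
        args.push_back("--" + key);   // "no-mmap" -> "--no-mmap"
    }
}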

First Bad Commit

No response

Relevant log output
