Llama4 RoPE fix #12889
Conversation
@ngxson :)
@ggerganov FYI, I wrote a small script here: https://gist.github.com/ngxson/6dec015080121d239caa668332fba3f8 It calculates the values; the comparison is below (on the left, the old values). For now, I have no idea how to set this after loading the GGUF (as discussed via DM), but feel free to make suggestions!
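The gist itself is not reproduced in this thread, but the llama3-style RoPE frequency scaling that the updated config uses can be sketched as below. This is a hedged illustration, not the gist's code: the default parameter values (`factor=8.0`, `low_freq_factor=1.0`, `high_freq_factor=4.0`, `old_context_len=8192`) are assumptions standing in for whatever the model's `config.json` actually ships.

```python
import math

def llama3_rope_factors(head_dim: int, rope_theta: float = 500000.0,
                        factor: float = 8.0, low_freq_factor: float = 1.0,
                        high_freq_factor: float = 4.0,
                        old_context_len: int = 8192) -> list[float]:
    """Per-dimension correction factors for llama3-style RoPE scaling.

    Parameter defaults are illustrative assumptions, not values taken
    from the Llama4 Scout config.
    """
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (rope_theta ** (i / head_dim))
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # high-frequency dims: leave unscaled
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:
            # low-frequency dims: fully scaled
            factors.append(factor)
        else:
            # smooth interpolation between the two regimes
            smooth = (old_context_len / wavelen - low_freq_factor) / \
                     (high_freq_factor - low_freq_factor)
            factors.append(1.0 / ((1 - smooth) / factor + smooth))
    return factors
```

With these defaults the factors ramp monotonically from 1.0 on the fastest-rotating dimensions to the full `factor` on the slowest ones, which is why a converter has to emit them as a tensor rather than a single scalar.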
Thanks. AFAIU the upstream models have been updated with a new RoPE config, which technically requires re-converting existing GGUF models. I don't think there is an elegant way to avoid this conversion and handle it seamlessly so that old GGUFs work with the new rope factors; it would always be some non-trivial hack that remains in the codebase forever. So I think it is better to just recommend re-conversion/re-download of the models. We can put a notice in the README's hot topics?
Yes, sounds ok to me; if it's too hacky then let's not do it. I think most people will use the quantizations from @unslothai or @bartowski1182 anyway (which were, and will be, updated very quickly), so we probably don't need to add a notice.
Yeah, I'll let it sit another week to make sure there's nothing else breaking, then throw the reconverted model up.
Llama4 Scout's config.json changed its RoPE scaling, so we need to remove the assert, since it breaks on Llama 4.
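As a hedged illustration of the failure mode (this is not the actual llama.cpp code, and the config values are made up): a hard assert pinning the expected shape of `rope_scaling` breaks the moment upstream edits `config.json`, whereas tolerantly reading the block keeps both old and new configs loadable.

```python
import json

def read_rope_scaling(config: dict) -> dict:
    """Tolerantly read the rope_scaling block from a parsed config.json.

    Hypothetical sketch: a strict check such as
    `assert config["rope_scaling"] is None` (or one pinning a specific
    rope_type) fails as soon as upstream changes the config, as happened
    with Llama4 Scout. Falling back to a default avoids that.
    """
    rope_scaling = config.get("rope_scaling")
    if rope_scaling is None:
        return {"rope_type": "none"}
    return rope_scaling

# Made-up configs illustrating the old and new shapes of the field:
old_cfg = json.loads('{"rope_scaling": null}')
new_cfg = json.loads('{"rope_scaling": {"rope_type": "llama3", "factor": 8.0}}')
```

The design choice mirrors the discussion above: rather than hacking old behavior to coexist with the new config forever, the loader just accepts whatever the current `config.json` declares, and old GGUFs are re-converted.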