Conversation

@MollySophia (Collaborator)

The token-shift state was not handled correctly in the previous implementation: it wasn't copied back to k_cache after a decode. As a result, the model was always lerping towards zero when decoding. Prefill (and, as a result, PPL evaluation) wasn't affected.

Somehow this mistake didn't affect text generation much either lol (maybe the large 32B model already had enough context information in the wkv state?). That's why the bug wasn't found earlier.
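
A minimal sketch of what the bug looks like in isolation (not llama.cpp's actual API; the names `ShiftCache`, `token_shift`, and `rwkv_decode_step` are hypothetical and for illustration only). RWKV's token shift lerps the current embedding against the previous token's embedding, so if the cache holding that previous embedding is never written back after a decode step, every subsequent step lerps against stale zeros:

```cpp
#include <cstddef>
#include <vector>

struct ShiftCache {
    // Previous token's embedding per layer, used by the token-shift lerp.
    // In the buggy version this was effectively never updated during decode.
    std::vector<std::vector<float>> prev; // [n_layer][n_embd]
};

// Token shift: x_shifted = prev + mu * (x - prev), element-wise,
// i.e. a lerp between the cached previous embedding and the current one.
static std::vector<float> token_shift(const std::vector<float> & x,
                                      const std::vector<float> & prev,
                                      const std::vector<float> & mu) {
    std::vector<float> out(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        out[i] = prev[i] + mu[i] * (x[i] - prev[i]);
    }
    return out;
}

// One decode step for a single layer. The essential part is the write-back at
// the end: without it, prev stays at zero and the model keeps lerping towards
// zero on every decoded token, which matches the bug described above.
static std::vector<float> rwkv_decode_step(ShiftCache & cache, int layer,
                                           const std::vector<float> & x,
                                           const std::vector<float> & mu) {
    std::vector<float> x_shifted = token_shift(x, cache.prev[layer], mu);
    // ... time-mixing / channel-mixing would consume x_shifted here ...
    cache.prev[layer] = x; // the fix: copy the new embedding back into the cache
    return x_shifted;
}
```

Prefill is unaffected in this picture because all tokens of the prompt are processed in one pass, where each position can read its predecessor directly from the batch; only incremental decoding depends on the write-back.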

@MollySophia MollySophia merged commit 325afb3 into ggml-org:master Jan 29, 2025
45 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
