
Conversation

@JohannesGaessler
Collaborator

The imatrix code implicitly assumes that `src1` is contiguous when copying data from a backend to host memory. As a result, the vector to which the data is written can end up being resized to a size smaller than the amount of data that `ggml_backend_tensor_get` writes, resulting in out-of-bounds writes.

This PR makes it so that the host buffer always has exactly the same size as the amount of data being copied. Also, if `src1` is not contiguous, this is now taken into account when calculating the byte addresses of matrix rows.
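A minimal sketch of the sizing idea (not the exact PR code; it only assumes `src1` is a valid `ggml_tensor` that lives on a backend):

```cpp
#include <vector>
#include "ggml.h"
#include "ggml-backend.h"

// Size the host buffer from the tensor's actual byte size instead of
// assuming a contiguous ne[0] * ne[1] * sizeof(float) layout.
static std::vector<char> copy_to_host(const ggml_tensor * src1) {
    std::vector<char> buf(ggml_nbytes(src1)); // exact number of bytes src1 occupies
    ggml_backend_tensor_get(src1, buf.data(), 0, buf.size());
    return buf;
}
```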

Unless I'm misunderstanding the code, the cases `ne12 > 1` and `ne13 > 1` are also going to result in unexpected behavior, but I don't know what the correct fix would be.

```diff
 LOG_DBGV(2, "%s[%d]: %32s, %s, %5d x %5d, %d\n", __func__, m_last_call, wname.c_str(), ggml_op_name(t->op), (int)src1->ne[0], (int)src1->ne[1], (int)src1->type);
 for (int row = 0; row < (int)src1->ne[1]; ++row) {
-    const float * x = data + row * src1->ne[0];
+    const float * x = (const float *) (data + row * src1->nb[1]);
```
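For context on the `ne12`/`ne13` concern above: a fully stride-aware traversal might look something like the sketch below. This is an assumption about what a fix could look like, not what the PR does; it assumes `data` is a `const char *` pointing at a host copy that preserves the tensor's byte layout.

```cpp
// Sketch: visit every row of src1 across the batch dimensions using the
// byte strides nb[1..3], so non-contiguous and batched (ne12/ne13 > 1)
// layouts are both addressed correctly.
for (int64_t i13 = 0; i13 < src1->ne[3]; ++i13) {
    for (int64_t i12 = 0; i12 < src1->ne[2]; ++i12) {
        for (int64_t i11 = 0; i11 < src1->ne[1]; ++i11) {
            const float * x = (const float *)(data +
                i11*src1->nb[1] + i12*src1->nb[2] + i13*src1->nb[3]);
            // ... accumulate x[0 .. src1->ne[0]-1] into the importance matrix ...
        }
    }
}
```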
Collaborator

Excuse my ignorance if I'm asking silly questions, but why `nb[1]` instead of `nb[0]`?

Collaborator Author

`nb[0]` is the byte offset when changing dimension 0 by 1, `nb[1]` is the byte offset when changing dimension 1 by 1. Since the loop steps through rows (dimension 1), `nb[1]` is the right stride; for a contiguous F32 tensor `nb[1] == ne[0] * sizeof(float)`, so the old and new expressions agree only in the contiguous case.
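A small self-contained example of the distinction, using the public ggml API (a sketch; the printed values assume a plain F32 tensor):

```cpp
#include <cstdio>
#include "ggml.h"

int main() {
    struct ggml_init_params params = { /*mem_size=*/ 16 * 1024 * 1024, /*mem_buffer=*/ NULL, /*no_alloc=*/ false };
    struct ggml_context * ctx = ggml_init(params);

    // Contiguous F32 tensor: ne[0] = 4, ne[1] = 3.
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 3);
    // A transposed view of it is non-contiguous: same data, swapped strides.
    struct ggml_tensor * t = ggml_transpose(ctx, a);

    printf("a: nb[0] = %zu, nb[1] = %zu\n", a->nb[0], a->nb[1]); // 4, 16
    printf("t: nb[0] = %zu, nb[1] = %zu\n", t->nb[0], t->nb[1]); // 16, 4

    ggml_free(ctx);
    return 0;
}
```

For `a`, `row * ne[0]` (in floats) and `row * nb[1]` (in bytes) address the same rows; for the view `t` they do not, which is exactly the bug the diff fixes.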

@JohannesGaessler JohannesGaessler merged commit 3e959f0 into ggml-org:master May 3, 2025
95 of 96 checks passed
@CISC
Collaborator

CISC commented May 12, 2025

@JohannesGaessler So that you don't have to look at the above linked PR: the solution is to loop over each attention head and compute and store the statistics consecutively, meaning that `values`/`counts` must be resized to `head_dim * n_head`.
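A hedged sketch of the layout CISC describes (the names `e.values`, `e.counts`, `n_head`, and `x` are illustrative, modeled on the imatrix statistics, not the exact code):

```cpp
// Accumulate squared activations per attention head, stored consecutively:
// the statistic for (head h, element j) lives at index h * head_dim + j.
const int64_t head_dim = src1->ne[0] / n_head;   // assumes ne[0] == head_dim * n_head
e.values.resize(head_dim * n_head, 0.0f);
e.counts.resize(head_dim * n_head, 0);

for (int64_t h = 0; h < n_head; ++h) {
    for (int64_t j = 0; j < head_dim; ++j) {
        const float v = x[h * head_dim + j];
        e.values[h * head_dim + j] += v * v;   // importance-matrix style sum of squares
        e.counts[h * head_dim + j] += 1;
    }
}
```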
