
Conversation

@amiralimi (Contributor)

Hi.
I added AVX512F macros to ggml.c and ran pre-commit before committing. I also wanted to add AVX512_FP16 support, but I don't have hardware that supports it.

This change showed speed-up when running F16 and F32 models. Here are the results:

Results

avx512 - fp16
llama_print_timings:        50 runs   (  453.77 ms per token,     2.20 tokens per second)

avx - fp16
llama_print_timings:        50 runs   (  625.48 ms per token,     1.60 tokens per second)

avx512 - fp32
llama_print_timings:        50 runs   (  517.39 ms per token,     1.93 tokens per second)

avx - fp32
llama_print_timings:        50 runs   (  638.76 ms per token,     1.57 tokens per second)

I'm not sure whether anything else is needed for this change (I'm new to open-source development).

@ggerganov ggerganov (Member) left a comment:

Nice 👍

@ggerganov ggerganov merged commit c47cf41 into ggml-org:master Mar 16, 2024
@amiralimi amiralimi deleted the avx512 branch March 16, 2024 17:34
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 3, 2024