Problem when running some models with CUDA #261
Closed
Labels: bug (Something isn't working)
Description
Issue description
Models keep generating a dummy result when running with CUDA.
Expected Behavior
Models should stop generating dummy output, as they do when running on CPU or Vulkan.
Actual Behavior
Models keep generating a dummy result.
Steps to reproduce
I use this Qwen2 1.5B model, downloaded from here.
Run with gpu set to 'auto' or 'cuda':
const llama = await getLlama({gpu: 'cuda'})
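The one-liner above can be expanded into a fuller reproduction sketch. This assumes the node-llama-cpp v3 beta API (`getLlama`, `loadModel`, `LlamaChatSession`); the model path is a placeholder for wherever the Qwen2 1.5B GGUF file was downloaded:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// Force the CUDA backend; with gpu: "cpu" or gpu: "vulkan" the same model behaves normally.
const llama = await getLlama({gpu: "cuda"});

// Placeholder path — point this at the downloaded Qwen2 1.5B GGUF file.
const model = await llama.loadModel({modelPath: "qwen2-1_5b-instruct-q4_k_m.gguf"});

const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// With CUDA the model keeps emitting dummy output; with CPU/Vulkan the reply is coherent.
console.log(await session.prompt("Hello, who are you?"));
```

Note this uses top-level await, so it must run as an ES module.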
My Environment
| Dependency | Version |
|---|---|
| Operating System | Windows 10 |
| CPU | AMD Ryzen 7 3700X |
| GPU | RTX 4090, RTX 3080 |
| Node.js version | v20.11.1 |
| TypeScript version | 5.5.2 |
| node-llama-cpp version | 3.0.0-beta.36 |
Additional Context
Here is an example I ran using https://github.com/withcatai/node-llama-cpp/releases/download/v3.0.0-beta.36/node-llama-cpp-electron-example.Windows.3.0.0-beta.36.x64.exe

These models run normally with 'vulkan', 'cpu', and 'metal'.
Relevant Features Used
- Metal support
- CUDA support
- Grammar
Are you willing to resolve this issue by submitting a Pull Request?
Yes, I have the time, but I don't know how to start. I would need guidance.