
Problem when running some models with cuda #261

@bqhuyy


Issue description

Models keep generating dummy output when running with CUDA.

Expected Behavior

Models should stop generating dummy output, as they do when running with the CPU or Vulkan backends.

Actual Behavior

Models keep generating dummy output indefinitely.

Steps to reproduce

I use this Qwen2 1.5B model, downloaded from here.
Run with `gpu` set to `'auto'` or `'cuda'`:

const llama = await getLlama({gpu: 'cuda'})
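For context, the line above can be expanded into a fuller repro sketch against the node-llama-cpp v3 beta API. The model path and prompt below are placeholders I chose for illustration, not values from the original report, and running it requires a CUDA-capable GPU plus a local GGUF file:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// Select the CUDA backend explicitly; with 'auto' the same CUDA path
// is chosen whenever a CUDA-capable GPU is detected.
const llama = await getLlama({gpu: "cuda"});

// Hypothetical local path to the Qwen2 1.5B GGUF file mentioned above.
const model = await llama.loadModel({
    modelPath: "models/qwen2-1_5b-instruct-q4_k_m.gguf"
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// With the CUDA backend the reported bug shows up here as endless
// dummy output; with {gpu: "vulkan"} or {gpu: false} (CPU) the same
// prompt completes normally.
const answer = await session.prompt("Hi there");
console.log(answer);
```

Switching only the `gpu` option while keeping everything else identical is what isolates the problem to the CUDA backend.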

My Environment

Dependency              Version
Operating System        Windows 10
CPU                     AMD Ryzen 7 3700X
GPU                     RTX 4090, RTX 3080
Node.js version         v20.11.1
TypeScript version      5.5.2
node-llama-cpp version  3.0.0-beta.36

Additional Context

Here is an example I ran using https://github.com/withcatai/node-llama-cpp/releases/download/v3.0.0-beta.36/node-llama-cpp-electron-example.Windows.3.0.0-beta.36.x64.exe:
(screenshot attached: 2024-07-01 165036)

These models run normally with 'vulkan', 'cpu', and 'metal'.

Relevant Features Used

  • Metal support
  • CUDA support
  • Grammar

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.

Metadata

Assignees: none
Labels: bug (Something isn't working)
Milestone: none