@hutm commented Mar 7, 2025

Added support for 2-, 3-, and 8-bit kernels.
The 8-bit kernel still has issues: the shapes of the GPTQ-quantized tensors need to be confirmed.
The 2- and 3-bit kernels work without issues.
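
For the open 8-bit question, one way to check the tensor shapes is to compare them against the packing layout commonly used by GPTQ checkpoints. The sketch below assumes that layout (values packed into 32-bit words, along the input dimension for `qweight` and along the output dimension for `qzeros`). It is not taken from this PR, and `gptq_packed_shapes` and its `group_size` default are hypothetical names and values used only for illustration.

```python
from math import ceil


def gptq_packed_shapes(in_features, out_features, bits, group_size=128):
    """Return the tensor shapes expected under the assumed GPTQ int32 packing.

    Assumption: quantized values are packed into 32-bit words, so a packed
    dimension shrinks by a factor of 32 / bits.
    """
    if bits not in (2, 3, 4, 8):
        raise ValueError(f"unsupported bit width: {bits}")
    # Requiring multiples of 32 keeps the packed sizes integral even for 3 bits.
    if in_features % 32 != 0 or out_features % 32 != 0:
        raise ValueError("in_features and out_features must be multiples of 32")

    groups = ceil(in_features / group_size)
    return {
        # Weights are packed along the input dimension.
        "qweight": (in_features // 32 * bits, out_features),
        # Zero points are packed along the output dimension, one row per group.
        "qzeros": (groups, out_features // 32 * bits),
        # Scales are stored unpacked, one row per group.
        "scales": (groups, out_features),
    }


if __name__ == "__main__":
    for bits in (2, 3, 8):
        print(bits, gptq_packed_shapes(4096, 4096, bits))
```

Comparing these expected shapes with the shapes of the tensors actually passed to the kernel would show whether the 8-bit problem comes from the packing or from the kernel itself.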

@hutm requested review from Qubitium and ZX-ModelCloud and removed request for ZX-ModelCloud March 7, 2025 05:46
@nbasyl merged commit 1bc6541 into eora_final Mar 7, 2025
0 of 2 checks passed
@Qubitium deleted the mkhadkevich/eora_final branch March 9, 2025 06:53
@Qubitium restored the mkhadkevich/eora_final branch March 9, 2025 06:53
@Qubitium deleted the mkhadkevich/eora_final branch March 12, 2025 07:37
Qubitium added a commit that referenced this pull request Mar 12, 2025
* eora release wip

* fix calibration dataset bug and eora inference batch size larger than 1 bug

* added 2,3,8 bit support for eora kernel (#1392)

* added 2,3,8 bit support for eora kernel

* added 2 and 3 bits to  eora sweep test, fixed contiguous()

* check compatible bit range before fused op

Signed-off-by: Qubitium <[email protected]>

* format

Signed-off-by: Qubitium <[email protected]>

---------

Signed-off-by: Qubitium <[email protected]>
Co-authored-by: Qubitium <[email protected]>

* .

* finish eora documentation

* finish eora documentation

* finish eora documentation

---------

Signed-off-by: Qubitium <[email protected]>
Co-authored-by: Maksim Khadkevich <[email protected]>
Co-authored-by: Qubitium <[email protected]>