Is your feature request related to a problem? Please describe.
GPTQ currently does not support QuantizationStrategy.BLOCK. This is most likely because GPTQ predates the addition of the block-wise quantization strategy to compressed-tensors/llm-compressor and its continued usage in DeepSeek models. I don't see any reason why it would be incompatible with GPTQ, and I'd like to see how GPTQ would perform vs. round-to-nearest on a DeepSeek3.2 NVFP4+FP8_BLOCK model.
Describe the solution you'd like
Add a case for the block strategy to the GPTQ algorithm here
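For reference, a minimal sketch of what the block strategy means numerically: each (block_rows x block_cols) tile of the weight matrix gets its own absmax scale, rather than one scale per tensor, channel, or group. The function name, the symmetric integer grid standing in for the FP8 value grid, and the default 128x128 block shape below are all illustrative assumptions, not the actual llm-compressor API; inside GPTQ the same per-block scales would be looked up during the column-by-column quantize-and-propagate loop.

```python
import numpy as np

def quantize_blockwise_rtn(W, block_shape=(128, 128), n_levels=127):
    """Round-to-nearest quantization with per-block scales (sketch).

    Hypothetical helper illustrating the BLOCK strategy: every tile
    of W is scaled by its own absmax. A symmetric integer grid with
    n_levels stands in for the real FP8 value grid here.
    """
    br, bc = block_shape
    rows, cols = W.shape
    q = np.empty_like(W)
    for i in range(0, rows, br):
        for j in range(0, cols, bc):
            block = W[i:i + br, j:j + bc]
            # per-block absmax scale; clamp to avoid divide-by-zero
            scale = max(np.abs(block).max() / n_levels, 1e-12)
            # fake-quantize: scale down, round to grid, scale back up
            q[i:i + br, j:j + bc] = np.round(block / scale) * scale
    return q
```

In a GPTQ integration, the only structural change would be selecting the scale for column j from the block column containing j (and the appropriate row block) before the rounding-error feedback step.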
Additional context
The PR should include some basic benchmarks comparing GPTQ FP8_BLOCK to round-to-nearest, hopefully showing an advantage.