
Commit 114647f

Handle single-device models without hf_device_map after Transformers optimization (#2401)
* When `hf_device_map` does not exist, infer the `device_map`
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* cleanup
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* cleanup
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Update optimum/gptq/quantizer.py
  Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
* cleanup
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* cleanup
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
* Fix device_map value to use param.device
  Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>

---------

Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
1 parent 6cfe1ee commit 114647f

1 file changed

Lines changed: 7 additions & 1 deletion

optimum/gptq/quantizer.py

@@ -691,7 +691,13 @@ def pack_model(
         layers = get_layers(model)
         layers = {n: layers[n] for n in quantizers}
 
-        self.select_quant_linear(device_map=model.hf_device_map, pack=True)
+        if hasattr(model, "hf_device_map"):
+            device_map = model.hf_device_map
+        else:
+            # Transformers: skip accelerate hooks when device_map resolves to a single device
+            device_map = {"": next(model.parameters()).device}
+
+        self.select_quant_linear(device_map=device_map, pack=True)
 
         self._replace_by_quant_layers(model, quantizers)
         qlayers = get_layers(model, [self.quant_linear])
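
The fallback in this change can be tried in isolation. The following is a minimal sketch, not the code in `optimum/gptq/quantizer.py`: `infer_device_map` is a hypothetical helper name, and `FakeModel` is a stand-in that models only the `parameters()` method of a real module, so the snippet runs without `torch` installed.

```python
from types import SimpleNamespace


def infer_device_map(model):
    """Hypothetical helper mirroring the fallback added in this commit."""
    if hasattr(model, "hf_device_map"):
        # The model carries an explicit map, so use it unchanged.
        return model.hf_device_map
    # No hf_device_map: assume every parameter lives on one device and
    # read that device from the first parameter.
    return {"": next(model.parameters()).device}


class FakeModel:
    """Stand-in for a torch.nn.Module; only parameters() is modeled."""

    def __init__(self, device):
        self._params = [SimpleNamespace(device=device)]

    def parameters(self):
        return iter(self._params)


# A model without hf_device_map (the single-device case).
single = FakeModel("cuda:0")

# A model that still has hf_device_map set (the multi-device case).
dispatched = FakeModel("cuda:0")
dispatched.hf_device_map = {"model.layers.0": 0, "lm_head": 1}

single_map = infer_device_map(single)
dispatched_map = infer_device_map(dispatched)
print(single_map)
print(dispatched_map)
```

The empty-string key `""` stands for the whole model, so the fallback map places every module on the device of the first parameter. One caveat of this sketch: a model with no parameters at all would make `next(...)` raise `StopIteration`, so callers may want to guard against that case.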
