Negative padding #214

@pavelgur

I hit the error below during training. Reducing the batch size to 1 works around it, but what is the correct way to resolve this?

INFO:__main__:Training model
{'Epoch': 0, 'Step': 0, 'Loss': '10.9777'}
{'Epoch': 0, 'Step': 10, 'Loss': '5.8438'}
0%| | 15/8317 [00:22<3:31:28, 1.53s/it, Epoch=0, Step=14, Loss=4.8478]
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "mlx_vlm/lora.py", line 178, in <module>
main(args)
File "mlx_vlm/lora.py", line 98, in main
loss = trainer.train_step(
^^^^^^^^^^^^^^^^^^^
File "mlx_vlm/trainer/trainer.py", line 265, in train_step
loss, grads = loss_and_grad_fn(self.model, batch)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mlx/nn/utils.py", line 35, in wrapped_value_grad_fn
value, grad = value_grad_fn(model.trainable_parameters(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mlx/nn/utils.py", line 29, in inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "mlx_vlm/trainer/trainer.py", line 230, in loss_fn
outputs = model(input_ids, pixel_values, attention_mask, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mlx_vlm/models/qwen2_vl/qwen2_vl.py", line 116, in __call__
input_embddings = self.get_input_embeddings(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mlx_vlm/models/qwen2_vl/qwen2_vl.py", line 78, in get_input_embeddings
final_inputs_embeds = self._merge_input_ids_with_image_features(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mlx_vlm/models/qwen2_vl/qwen2_vl.py", line 96, in _merge_input_ids_with_image_features
image_features = mx.pad(image_features, ((0, 0), (0, pad_size), (0, 0)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Invalid high padding size (-60) passed to pad for axis 1. Padding sizes must be non-negative
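
The last frame shows `mx.pad` receiving a high padding of -60 on axis 1, so `image_features` was apparently already 60 entries longer on that axis than the length being padded toward. Below is a minimal sketch of a guard that reports this mismatch before the pad call. It assumes the pad size is computed as a target length minus the current length, which the traceback does not show. `axis1_pad_widths`, `current_len`, and `target_len` are hypothetical names, not part of mlx_vlm, and the lengths 100 and 160 are illustrative only.

```python
def axis1_pad_widths(current_len: int, target_len: int) -> tuple:
    """Build pad widths for a (batch, seq, hidden) array, padding only the
    high side of axis 1. Hypothetical helper, not part of mlx_vlm."""
    # Assumption: the pad size is the gap between target and current length.
    pad_size = target_len - current_len
    if pad_size < 0:
        # Report both lengths instead of letting the pad call reject -N.
        raise ValueError(
            f"axis 1 has {current_len} entries but the target length is "
            f"{target_len}; cannot pad by {pad_size}"
        )
    return ((0, 0), (0, pad_size), (0, 0))


# A 60-entry shortfall yields ordinary pad widths.
print(axis1_pad_widths(current_len=100, target_len=160))

# A 60-entry excess (the sign seen in the traceback) fails with both lengths shown.
try:
    axis1_pad_widths(current_len=160, target_len=100)
except ValueError as err:
    print(err)
```

This only makes the mismatch visible; it does not resolve it. Whether the right fix is to truncate, to pad the other tensor, or to change how batches are assembled depends on code that is not shown here.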

Labels: bug
