Upgrade MLX framework, and refactor create_generator to use stream_generate #44
Conversation
```diff
@@ -69,12 +72,15 @@ def process_prompt(
         # disable `prefill_step_size` prompt pre-processing in mlx_lm::generate_step
         generate_args["prefill_step_size"] = float("inf")

-        generate_step_input = self.model.input_ids[0]
+        generate_step_input = self.model.input_ids[None]
```
`[None]` is a more correct implementation: it keeps the whole array (adding a leading batch dimension) instead of taking just the first element.
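The indexing difference can be seen with NumPy, whose indexing semantics mlx arrays follow (the `input_ids` value here is an illustrative stand-in, not data from the PR):

```python
import numpy as np

# Stand-in for self.model.input_ids: a batched token array of shape (1, 4)
input_ids = np.array([[101, 2023, 2003, 102]])

first_row = input_ids[0]      # [0] drops the leading axis -> shape (4,)
with_batch = input_ids[None]  # [None] prepends a new axis -> shape (1, 1, 4)
```

So `[0]` discards the batch dimension, while `[None]` preserves the full array and adds a new leading axis.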
```diff
@@ -210,7 +289,7 @@ def _convert_to_pil(self, images_b64: List[str]):
             PIL.Image.open(BytesIO(base64.b64decode(img))) for img in images_b64 or []
         ]

-    def _custom_resize(self, pil_images, max_size=(1000, 1000)):
+    def _custom_resize(self, pil_images, max_size=(512, 512)):
```
When I was initially testing, I set this to 1000x1000; if I dropped it down to 512, models like qwen2vl would start mis-recognizing text from screenshots of my screen.
Why the change? If it's needed for certain models, can we custom resize just for those models?
I didn't mean to commit this change. I reverted this.
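If per-model resizing were wanted later, one option is a lookup table of overrides with a shared default, as suggested in the review. This is a hypothetical sketch (the helper name, the model id, and the override values are illustrative, not from the PR):

```python
# Default limit used by _custom_resize
DEFAULT_MAX_SIZE = (1000, 1000)

# Per-model overrides; illustrative values only
MODEL_MAX_SIZES = {
    "mlx-community/Molmo-7B-D-0924": (512, 512),
}

def max_size_for(model_id: str) -> tuple:
    """Return the resize limit for a given model, falling back to the default."""
    return MODEL_MAX_SIZES.get(model_id, DEFAULT_MAX_SIZE)
```

The caller would then pass `max_size=max_size_for(model_id)` into `_custom_resize`, so the 1000x1000 default stays intact for models that need full-resolution text.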
MLX package upgrades
Each of these brings new features and improvements to the engine.
MLX LM upgrade
`stream_generate` is a more stable API, and it also includes a wired limit setter. This addresses the issues described in Set wired limit before starting generation #40.

Passing `temp`, `min_p`, and other sampling params into `generate_step` is deprecated; we will use the mlx_lm default sampler method and pass in the user-provided sampling params.

MLX VLM upgrade
This upgrade adds support for two new models, Florence 2 and Molmo.
Florence requires, as an input to the language model, the token that was generated during the previous evaluation. We use the mlx_lm custom sampling capability to store the most recently sampled token as part of `vision_model_wrapper`. Note that this model requires trusting remote code.

Molmo required no special code refactors. The Molmo model can use a lot of memory, so I have been testing with images resized smaller than usual. Note that this model also requires trusting remote code.

generate.py and demo.py updates
`create_generator` is refactored to use `stream_generate`. Defaults are assigned for all of the named arguments, so the caller doesn't have to provide defaults for arguments it may not know about. The `generate_args` option is removed, and all of the configurable parameters are now part of the method signature. The `generate_args` dict for mlx_lm is built up during `create_generator`.
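The shape of that refactor can be sketched as follows. This is a minimal, hypothetical signature (parameter names and defaults are illustrative, not the PR's exact ones): sampling options move into the signature with defaults, and the mlx_lm-style `generate_args` dict is assembled inside the function.

```python
def create_generator(
    prompt_tokens,
    max_tokens=512,
    temp=0.0,
    top_p=1.0,
    min_p=0.0,
    repetition_penalty=None,
):
    # Build the generate_args dict consumed by the mlx_lm generation call.
    generate_args = {
        "max_tokens": max_tokens,
        "temp": temp,
        "top_p": top_p,
        "min_p": min_p,
    }
    if repetition_penalty is not None:
        generate_args["repetition_penalty"] = repetition_penalty
    # In the real code this dict would drive stream_generate; returned here
    # so the sketch is self-contained.
    return generate_args
```

Because every parameter has a default, a caller that only cares about `temp` can pass just that, and the rest of `generate_args` is filled in consistently.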