Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade MLX framework, and refactor create_generator to use stream_generate #44

Merged
merged 7 commits into from
Nov 26, 2024

Conversation

neilmehta24
Copy link
Member

@neilmehta24 neilmehta24 commented Nov 25, 2024

MLX package upgrades

mlx==0.21.0
mlx-lm==0.20.1
mlx-vlm==0.1.1

Each of these brings new features and improvements to the engine.

MLX LM upgrade

  • Switch over to using the stream generate API instead of generate step. stream generate is a more stable API, and also includes a wired limit setter. This addresses the issues described in Set wired limit before starting generation #40
  • Start using the new mlx_lm sampler class, since passing temp min_p and other sampling params into generate_step is deprecated. We will use mlx_lm default sampler method, and pass in user provided sampling params.
  • Repetition penalty params are now required to be passed in through the logits processor, so make that refactor.

MLX VLM upgrade

  • This upgrade adds support for two new models, Florence 2 and Molmo.

  • Florence requires as an input into the language model, the token that was generated during the last evaluation. Use the mlx_lm custom sampling capability to store the most recently sampled token as part of vision_model_wrapper. Note that this model requires trusting remote code.
    Screenshot 2024-11-26 at 11 57 37 AM

  • Molmo required no special code refactors. Note that using the Molmo model may use a lot of memory, so I have been testing by resizing smaller than usual. Note that this model requires trusting remote code.
    Screenshot 2024-11-26 at 11 59 08 AM

generate.py and demo.py updates

  • Start requiring named arguments for create_generator. Assign defaults for all of the named arguments, so that the caller doesn't have to provide defaults for arguments that may not be informed about. The generate_args option is removed, and now all of the configurable parameters are part of the method signature. The generate_args dict for mlx_lm is built up during create_generator
  • Since we have moved to the stream_generate API, there are a couple of refactors for passing arguments to mlx_lm, and handling return values during generation. This PR Stop strings with stream generate #45 fixes stop string support, it isn't handled correctly in this PR

@@ -69,12 +72,15 @@ def process_prompt(
# disable `prefill_step_size` prompt pre-processing in mlx_lm::generate_step
generate_args["prefill_step_size"] = float("inf")

generate_step_input = self.model.input_ids[0]
generate_step_input = self.model.input_ids[None]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[None] is a more correct implementation. This will flatten the array, instead of just taking the first element

@neilmehta24 neilmehta24 marked this pull request as ready for review November 26, 2024 17:43
@@ -210,7 +289,7 @@ def _convert_to_pil(self, images_b64: List[str]):
PIL.Image.open(BytesIO(base64.b64decode(img))) for img in images_b64 or []
]

def _custom_resize(self, pil_images, max_size=(1000, 1000)):
def _custom_resize(self, pil_images, max_size=(512, 512)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I initially was testing to set this to 1000x1000, if I dropped it down to 512 models like qwen2vl would start mis-recognizing text from screenshots of my screen.

Why the change? If for certain models can we custom resize just for those models?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean to commit this change. I reverted this.

@neilmehta24 neilmehta24 requested a review from mattjcly November 26, 2024 22:01
@neilmehta24 neilmehta24 merged commit 1ccce42 into main Nov 26, 2024
@neilmehta24 neilmehta24 deleted the neil/update-mlx-packages branch November 26, 2024 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants