---
library_name: mlx
license: gemma
widget:
- messages:
  - role: user
    content: How does the brain work?
inference:
  parameters:
    max_new_tokens: 200
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
base_model: google/gemma-1.1-2b-it
tags:
- mlx
pipeline_tag: text-generation
---

# mlx-community/gemma-1.1-2b-it-bf16

This model [mlx-community/gemma-1.1-2b-it-bf16](https://huggingface.co/mlx-community/gemma-1.1-2b-it-bf16) was converted to MLX format from [google/gemma-1.1-2b-it](https://huggingface.co/google/gemma-1.1-2b-it) using mlx-lm version **0.26.1**.

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gemma-1.1-2b-it-bf16")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
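
If you call the model more than once, it may help to wrap the prompt-building step in a small function. The sketch below is one possible arrangement, not part of mlx-lm: `build_prompt` and `ask` are hypothetical helper names, and the default of 200 tokens simply mirrors the `max_new_tokens` value in this card's metadata. It assumes `generate` accepts a `max_tokens` keyword, which is the case in recent mlx-lm releases; check the version you have installed.

```python
def build_prompt(tokenizer, user_text):
    """Hypothetical helper: apply the chat template when the tokenizer has one."""
    if getattr(tokenizer, "chat_template", None) is None:
        # No template available, so pass the raw text through unchanged.
        return user_text
    messages = [{"role": "user", "content": user_text}]
    return tokenizer.apply_chat_template(messages, add_generation_prompt=True)


def ask(model, tokenizer, user_text, max_tokens=200):
    """Hypothetical helper: build the prompt and generate a reply."""
    # Imported here so build_prompt can be used without mlx-lm installed.
    from mlx_lm import generate

    prompt = build_prompt(tokenizer, user_text)
    # max_tokens is assumed to be supported by the installed mlx-lm version.
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

With `model` and `tokenizer` obtained from `load(...)` as shown above, a call would look like `ask(model, tokenizer, "How does the brain work?")`.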