Fine-tune a language model with MLX
MLX is Apple's array framework, optimized for Apple Silicon. Its companion package, mlx-lm, lets you fine-tune and run LLMs directly on an M-series Mac, using the machine's unified memory instead of a discrete GPU.
Prerequisites
- Mac with Apple Silicon (M1 or later)
- 16 GB+ unified memory (32 GB recommended)
- Python 3.9+
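If you want to verify the chip and memory before starting, macOS's built-in system_profiler reports both:
system_profiler SPHardwareDataType | grep -E 'Chip|Memory'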
Step-by-Step
1. Install mlx-lm
mlx-lm is the command-line toolkit for downloading, fine-tuning, and serving MLX models.
pip install mlx-lm
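A quick sanity check that the install worked and that MLX sees the GPU (this imports the mlx package that mlx-lm pulls in as a dependency):
python -c 'import mlx.core as mx; print(mx.default_device())'  # expect Device(gpu, 0) on Apple Silicon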
2. Convert a model to MLX format
Most Hugging Face models work after a one-time conversion; the -q flag quantizes the weights to 4-bit. Pre-quantized checkpoints are also published under the mlx-community organization on Hugging Face, and the steps below use one of those.
mlx_lm.convert --hf-path mistralai/Mistral-7B-Instruct-v0.3 -q
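By default the converted weights land in a local mlx_model/ folder. If you want them somewhere else, current mlx-lm releases expose an output-directory flag; check mlx_lm.convert --help for your version:
mlx_lm.convert --hf-path mistralai/Mistral-7B-Instruct-v0.3 -q --mlx-path ./mistral-7b-4bit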
3. Prepare your dataset
mlx-lm expects a folder containing train.jsonl and valid.jsonl, with each line holding either chat messages or plain text.
ls data/  # train.jsonl  valid.jsonl
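In the chat format, each line is a JSON object with a messages array (mlx-lm also accepts plain {"text": ...} lines). The example content here is illustrative:
head -n 1 data/train.jsonl
{"messages": [{"role": "user", "content": "What does mlx_lm.fuse do?"}, {"role": "assistant", "content": "It merges a LoRA adapter into the base model's weights."}]}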
4. Run the LoRA fine-tune
The lora subcommand handles the whole training loop; 600 iterations is a reasonable first pass.
mlx_lm.lora --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --train --data ./data --iters 600 --learning-rate 1e-4
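If training presses against your memory budget (see Common Pitfalls below), two flags current mlx-lm releases expose are worth reaching for; a sketch:
mlx_lm.lora --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --train --data ./data \
  --iters 600 --batch-size 1 --grad-checkpoint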
5. Test the adapter
Generate against the base model, passing --adapter-path so your fine-tuned weights are loaded at inference time.
mlx_lm.generate --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --adapter-path adapters --prompt 'Q: ...'
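A useful sanity check is running the same prompt with and without the adapter and comparing the answers (the prompt is a placeholder; --max-tokens caps the response length):
mlx_lm.generate --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --prompt 'Q: ...' --max-tokens 128
mlx_lm.generate --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --adapter-path adapters --prompt 'Q: ...' --max-tokens 128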
6. Fuse and ship
Fusing merges the adapter into the base model, producing a single self-contained checkpoint you can distribute.
mlx_lm.fuse --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --adapter-path adapters --save-path fused-model
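Before shipping, confirm the fused model behaves like the base-plus-adapter pair did, now without --adapter-path:
mlx_lm.generate --model fused-model --prompt 'Q: ...'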
Common Pitfalls
- Running on an Intel Mac. MLX requires Apple Silicon.
- Underestimating memory pressure. Close other applications before training a 7B model.
- Skipping --grad-checkpoint on tight memory budgets.
What's Next
- Serve the model with mlx_lm.server for an OpenAI-compatible API.
- Try DPO via the mlx-examples repo for preference tuning.
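Serving is a one-liner; a minimal sketch, assuming mlx-lm's default port of 8080 and its OpenAI-style chat completions route:
mlx_lm.server --model fused-model
curl localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Q: ..."}], "max_tokens": 64}'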
