Differential Privacy Finetuning for MLX
Per-sample differentially private SGD for LoRA fine-tuning on Apple Silicon, running on-device via MLX. The training loop clips every sample's gradient and adds calibrated Gaussian noise, so no single example leaves a fingerprint in the weights. Loss-threshold membership inference attacks on Qwen2.5-0.5B LoRA drop from AUC 0.66–0.93 without DP to 0.50–0.54 with DP across SST-2, IMDB, and PubMedQA, at a utility cost of 0.6–5.0 points. Usage is a drop-in wrapper — make_private_loss + DPOptimizer around any mlx-lm training loop — with automatic RDP ε accounting.
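The core mechanism — clip each per-sample gradient to a fixed L2 norm, sum, add Gaussian noise scaled to that norm, and average — can be sketched in a few lines. This is an illustrative NumPy version of the DP-SGD step only, not the package's actual make_private_loss / DPOptimizer API; the function name and signature here are hypothetical.

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD update direction (illustrative sketch).

    Each sample's gradient is rescaled so its L2 norm is at most clip_norm,
    bounding any single example's influence. Gaussian noise with standard
    deviation noise_multiplier * clip_norm is added to the clipped sum, and
    the result is averaged over the batch.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = [
        g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        for g in per_sample_grads
    ]
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_sample_grads)
```

With noise_multiplier set by the RDP accountant for a target (ε, δ), this step is what the wrapper applies inside the training loop in place of the ordinary batch gradient.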