Mistral AI has introduced two new models, Ministral 3B and Ministral 8B, designed for on-device computing and edge use cases. In the sub-10B category, they excel at knowledge, commonsense reasoning, function-calling, and efficiency. Both support context lengths of up to 128k tokens (currently 32k on vLLM), and Ministral 8B uses an interleaved sliding-window attention pattern for faster, more memory-efficient inference, as illustrated in the sketch below.
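The sliding-window idea is simple to state: rather than letting every token attend to the full history, attention at some layers is restricted to a fixed window of recent tokens, which bounds both compute and KV-cache memory. Below is a minimal PyTorch sketch of such masks; the alternating layer pattern and the window size shown are illustrative assumptions, not Ministral 8B's actual configuration.

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # Full causal attention: position i attends to all positions j <= i.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    # Windowed causal attention: position i attends only to the last
    # `window` positions, so compute and KV-cache memory scale with
    # `window` rather than with the full sequence length.
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (column)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (row)
    return (j <= i) & (j > i - window)

# Hypothetical interleaving: even layers use a windowed mask, odd layers
# use full causal attention. The real layer pattern and window size are
# not specified in this post.
num_layers, seq_len, window = 4, 8, 4
masks = [
    sliding_window_mask(seq_len, window) if layer % 2 == 0 else causal_mask(seq_len)
    for layer in range(num_layers)
]
```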
Use Cases
Les Ministraux cater to local, privacy-first inference needs for applications such as:
- On-device translation
- Internet-less smart assistants
- Local analytics
- Autonomous robotics
They are also efficient intermediaries for function-calling in multi-step workflows, handling input parsing, task routing, and API calls with low latency and cost; a minimal routing sketch follows.
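To make the intermediary role concrete, here is a minimal, hypothetical dispatch loop: the model's output is assumed to be a JSON-formatted tool call, which the application parses and routes to a local handler. The tool names and the output format are illustrative assumptions, not Mistral's actual function-calling API.

```python
import json

# Hypothetical local tools; in practice these would wrap real APIs.
def get_weather(city: str) -> str:
    return json.dumps({"city": city, "forecast": "sunny"})

def translate(text: str, target_lang: str) -> str:
    return json.dumps({"translation": f"[{target_lang}] {text}"})

TOOLS = {"get_weather": get_weather, "translate": translate}

def dispatch(model_output: str) -> str:
    # Assume the model emits a tool call as JSON, e.g.
    # {"name": "get_weather", "arguments": {"city": "Paris"}}.
    call = json.loads(model_output)        # input parsing
    handler = TOOLS[call["name"]]          # task routing
    return handler(**call["arguments"])    # API call

print(dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
```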
Les Ministraux were benchmarked across multiple tasks against Gemma 2 2B, Llama 3.2 3B, Llama 3.1 8B, and Mistral 7B, and consistently outperform these peers.
Pretrained Models
- As base models, Ministral 3B and 8B outperform the peers listed above across the benchmark categories evaluated.
Instruct Models
- The Ministral 3B and 8B Instruct models were evaluated against Gemma 2 2B, Llama 3.2 3B, Llama 3.1 8B, and Gemma 2 9B, and show significant improvements over each.
Availability and Pricing
Both models are available immediately at the following prices; a quick cost sketch follows the list.
- Ministral 8B: $0.10 / M tokens (input and output), available under the Mistral Commercial License and the Mistral Research License.
- Ministral 3B: $0.04 / M tokens (input and output), available under the Mistral Commercial License.
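To make the rates concrete, here is a small cost calculation at the listed per-million-token prices; the model name strings are illustrative, not official API identifiers.

```python
# Listed per-million-token rates in USD (same rate for input and output).
PRICE_PER_M_TOKENS = {"ministral-8b": 0.10, "ministral-3b": 0.04}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    # Input and output tokens are billed at the same rate.
    return (input_tokens + output_tokens) / 1_000_000 * PRICE_PER_M_TOKENS[model]

# Example workload: 2M input tokens + 0.5M output tokens.
print(f"${cost_usd('ministral-3b', 2_000_000, 500_000):.2f}")  # $0.10
print(f"${cost_usd('ministral-8b', 2_000_000, 500_000):.2f}")  # $0.25
```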
For self-deployed use, commercial licenses are available upon request, with assistance in lossless quantization for specific use cases. Model weights for Ministral 8B Instruct are available for research use, and both models will soon be available from cloud partners.
Mistral AI continues to push the boundaries of what small frontier models can do, with Ministral 3B already outperforming the previous-generation Mistral 7B on most benchmarks. Feedback is encouraged as the models continue to improve.