Mistral Unveils Ministral 3B and 8B AI Models for Edge Computing and On-Device Use

October 17, 2024 at 7:04:41 AM

TL;DR Mistral introduces two new AI models: Ministral 3B and Ministral 8B. These models excel in knowledge, reasoning, and efficiency for on-device and edge use cases, supporting up to 128k context length. Ministral 8B features a sliding-window attention pattern for faster inference. They cater to local, privacy-first applications like on-device translation and autonomous robotics. Both models outperform peers in benchmarks and are available now.

Mistral Unveils Ministral 3B and 8B AI Models for Edge Computing and On-Device Use

Mistral has introduced two new AI models, Ministral 3B and Ministral 8B, designed for on-device computing and edge use cases. These models excel in knowledge, commonsense reasoning, function-calling, and efficiency within the sub-10B category. They support up to 128k context length (currently 32k on vLLM), with Ministral 8B featuring an interleaved sliding-window attention pattern for faster and memory-efficient inference.

Use Cases

Les Ministraux cater to local, privacy-first inference needs for applications such as:

On-device translation
Internet-less smart assistants
Local analytics
Autonomous robotics

They are also efficient intermediaries for function-calling in multi-step workflows, handling input parsing, task routing, and API calls with low latency and cost.

The performance of les Ministraux has been benchmarked across multiple tasks, consistently outperforming peers. The models were compared to Gemma 2 2B, Llama 3.2 3B, Llama 3.1 8B, and Mistral 7B.

Pretrained Models

Ministral 3B and 8B were compared to other models in multiple categories, showcasing superior performance.

Instruct Models

Ministral 3B and 8B Instruct models were compared to Gemma 2 2B, Llama 3.2 3B, Llama 3.1 8B, and Gemma 2 9B, demonstrating significant improvements.

Availability and Pricing

Both models are available immediately with the following pricing:

Ministral 8B: $0.1 / M tokens (input and output), available under Mistral Commercial License and Mistral Research License.
Ministral 3B: $0.04 / M tokens (input and output), available under Mistral Commercial License.

For self-deployed use, commercial licenses are available upon request, with assistance in lossless quantization for specific use cases. Model weights for Ministral 8B Instruct are available for research use, and both models will soon be available from cloud partners.

Mistral AI continues to push the boundaries of frontier models, with Ministral 3B already outperforming the previous Mistral 7B on most benchmarks. Feedback is encouraged as they continue to innovate.