📷

Pixtral Large

Name: Pixtral Large
Price: API from $3.00/1M tokens USD
Author: Mistral AI

Mistral AI · 2025

Mistral's vision-language model with 124B parameters for image understanding and generation.

Visit Website

Quick Facts

Parameters

124B

Context Window

128K tokens

Modalities

text, image

Open Source

Pricing

API from $3.00/1M tokens

Released

2025

Developer

Mistral AI

About

Pixtral Large is Mistral AI's flagship vision-language model with 124 billion parameters, capable of both understanding and generating content based on visual inputs. It represents Mistral's entry into the multimodal AI space, combining their expertise in efficient language modeling with specialized vision encoders for competitive image understanding. Pixtral Large excels at image captioning, visual question answering, document understanding, chart and diagram analysis, and multimodal reasoning — all while maintaining Mistral's characteristic strength in European language contexts. What distinguishes Pixtral Large from other vision-language models is its European language advantage: it understands not just English visual content but also French, German, Spanish, Italian, and other European language text in images with native-level accuracy. This makes it uniquely valuable for processing multilingual documents, European signage and forms, and any visual content where language diversity matters. The 128K token context window allows processing of lengthy documents with embedded images or charts. Available through Mistral's API platform at competitive pricing (from USD 3.00 per 1M tokens). For European companies needing document digitization across multiple languages, businesses processing multilingual visual content, and developers building vision-language applications for European markets, Pixtral Large offers the best-in-class multimodal performance specifically optimized for European language contexts. Compared to GPT-4V or Qwen-VL-Max, Pixtral Large's advantage is its European language foundation; its image understanding quality is competitive but not market-leading. The main limitations are no native image generation capability, a smaller ecosystem than OpenAI's vision models, and API-only availability without open-weight options.

Strengths

+124B parameters for strong vision-language performance
+European language advantage in multimodal context
+Excellent document and chart understanding
+Competitive pricing vs GPT-4V alternatives

Weaknesses

−No native image generation capability
−Smaller ecosystem than OpenAI vision models
−Limited third-party integrations

Best For

Multimodal document analysis and digitization

European multilingual vision-language applications

Chart, diagram, and technical drawing understanding

Visual Q&A in enterprise contexts

Pricing

API

From $3.00/1M tokens

Vision-language
128K context
Document understanding
Chart analysis

Technical Specs

Parameters

124B

Context Window

128K tokens

Modalities

text, image

Languages

EnglishFrenchGermanSpanishItalian+3

Open Source

Developer

Mistral AI

Released: 2025

API Docs

Share this article

Related Models

🎨

Midjourney V7

Midjourney Inc.

The highest-quality AI image generator with cinematic output and advanced style control.

🖼️

DALL-E 3

OpenAI

OpenAI's image generator with superior prompt adherence, integrated into ChatGPT.

💥

FLUX.1 Pro

Black Forest Labs

Open-weight image generation model with 12B parameters, rivaling Midjourney quality.

🔥

Adobe Firefly 3

Adobe Inc.

Adobe's generative AI for professional design, integrated with Creative Cloud tools.