
A new entrant has stepped into the global AI arena, and this time the accent is unmistakably Indian. Bengaluru startup Sarvam AI says parts of its technology outperform ChatGPT and Google Gemini on specific benchmarks. The claim has triggered both excitement and skepticism: beating frontier models on selected tests is one thing; competing with them at scale is another. So what exactly did Sarvam build, and does it truly threaten the global AI hierarchy?
What is Sarvam AI and why was it created?
Sarvam AI was founded in August 2023 by Vivek Raghavan and Pratyush Kumar. The company is part of India’s broader push to develop domestic artificial intelligence under the government’s IndiaAI Mission.
In 2024, the Indian cabinet approved roughly $1.21 billion in funding to develop indigenous large multimodal and foundational AI models. The goal is straightforward: technological sovereignty. Instead of relying entirely on foreign models trained mostly on Western data, India wants systems built for its languages, bureaucracy, and digital public infrastructure.
What makes the mission different
Most global models aim for universal capability. Sarvam focuses on national relevance.
That means optimizing for:
- Indian languages rather than an English-first default
- government documents and forms
- voice interfaces for low-literacy users
- regional scripts and handwriting
Think less chatbot assistant and more national infrastructure layer.
What did Sarvam actually release?
Sarvam has not yet released a general-purpose chatbot to rival ChatGPT. Instead, it launched specialized AI systems targeting practical tasks.
Sarvam Vision: document understanding AI
An optical character recognition (OCR) system designed for real-world paperwork, not clean PDFs.
The company claims performance of:
- 84.3% accuracy on olmOCR-Bench
- 93.28% on OmniDocBench v1.5
It says this surpassed Gemini 3 Pro, ChatGPT-based OCR workflows, and DeepSeek OCR v2 in those tests.
The key detail: these benchmarks evaluate messy inputs such as scanned forms, government records, stamps, and multilingual pages.
Bulbul V3: text-to-speech for Indian languages
A speech synthesis model supporting:
- around three dozen voices
- 11 Indian languages
- expansion planned to 22 languages
Sarvam says the system focuses on stability and pronunciation accuracy across mixed-language inputs, a frequent failure point in global speech AI.
Why benchmarks don’t automatically mean it beats ChatGPT
This is the critical nuance.
Sarvam did not claim its language model is generally smarter than ChatGPT. It claimed superiority in targeted tasks.
Narrow AI vs general AI
ChatGPT and Gemini are general reasoning systems. Sarvam’s tools are specialized.
Compare them like this:
- ChatGPT: a Swiss Army knife
- Sarvam Vision: a precision industrial scanner
If you want an essay, ChatGPT wins.
If you want to digitize a handwritten land record in Kannada, Sarvam might win.
That distinction explains how smaller models can outperform giant ones in certain evaluations.
Why local language AI is harder than it looks
India's constitution recognizes 22 scheduled languages, and hundreds of dialects are spoken alongside them. The complexity is not just vocabulary but structure.
Challenges include:
- mixed scripts in the same sentence
- phonetic spelling variations
- code-switching between English and regional languages
- limited high-quality training data
Large Western models are trained mostly on internet text. Indian administrative data often exists in scanned files, photos, or handwritten archives. Specialized training becomes more valuable than scale alone.
This is exactly the niche Sarvam is targeting.
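To make the mixed-script challenge listed above concrete, here is a minimal Python sketch that flags text combining more than one writing system. It is purely illustrative and has nothing to do with Sarvam's actual pipeline: the function names `scripts_in` and `is_mixed_script` are hypothetical, and the script is guessed from the first word of each character's Unicode name, which is a rough heuristic rather than the official Unicode Script property.

```python
import unicodedata


def scripts_in(text: str) -> set[str]:
    """Return the writing systems found among the letters and marks of `text`.

    Heuristic (an assumption of this sketch): the first word of a character's
    Unicode name, e.g. "DEVANAGARI", "KANNADA" or "LATIN", stands in for its script.
    """
    found = set()
    for ch in text:
        # Keep letters (L*) and combining marks (M*); skip digits, spaces, punctuation.
        if unicodedata.category(ch)[0] in ("L", "M"):
            name = unicodedata.name(ch, "")
            if name:
                found.add(name.split(" ")[0])
    return found


def is_mixed_script(text: str) -> bool:
    """True when the text uses more than one writing system."""
    return len(scripts_in(text)) > 1


sample = "मुझे form भरना है"  # Hindi sentence containing an English word
print(sorted(scripts_in(sample)))        # ['DEVANAGARI', 'LATIN']
print(is_mixed_script(sample))           # True
print(is_mixed_script("fill the form"))  # False
```

Note what this check cannot see: Hindi typed phonetically in Latin letters looks like plain English to it, which is one reason phonetic spelling variation and code-switching are harder problems than mixed scripts alone.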
Why this matters beyond India
Regional AI development is becoming a global trend.
Countries increasingly want models trained on their legal systems, culture, and languages for reasons including:
- data sovereignty
- national security
- regulatory compliance
- economic independence
Europe, the Middle East, and Japan are pursuing similar efforts. Sarvam represents India’s entry into that geopolitical tech shift.
Industry reaction so far
Developers working with Indic languages have responded positively, particularly to the accuracy of its speech and document processing.
Some founders say specialized regional tools solve real business problems that global AI labs often ignore because they target mass markets first.
That difference reflects economics more than capability. Silicon Valley optimizes for billions of users. Regional startups optimize for critical workflows.
Can Sarvam eventually compete with ChatGPT or Gemini?
Short answer: not yet in the same category.
Where Sarvam could compete
- government digitization
- call center automation in regional languages
- banking and identity verification
- education accessibility tools
Where global models still dominate
- reasoning and coding
- research assistance
- general-purpose conversation
- multimodal creativity
The realistic path is not replacement but layering. Countries may use global models for general intelligence and domestic models for infrastructure tasks.
The hardware challenge ahead
AI performance depends heavily on computing power.
India’s AI mission has begun building capacity, including tens of thousands of GPUs through public-private partnerships. But frontier model training still requires massive clusters and long-term investment.
The question is no longer whether India can build AI. It is whether it can scale quickly enough to keep pace with trillion-parameter systems.
TL;DR
- Sarvam AI built specialized models, not a full ChatGPT replacement yet
- The company claims its OCR system beats global models on specific document benchmarks, and its speech system is built for Indic languages
- The project is part of India’s sovereign AI strategy
- It competes in practical infrastructure AI, not general intelligence
- The real race is about regional specialization vs global scale



