India’s First Sovereign LLM
Sarvam: The Bharatiya LLM
Context: The Government of India has appointed Bengaluru-based AI startup Sarvam to build the country’s first sovereign large language model (LLM).
More on News
- This initiative requires cross-sector collaboration—including academia, central and state governments, corporations, and the robust Indian IT sector.
- The goal: Create an indigenous AI foundation model tailored to India’s linguistic, cultural, and functional diversity.
- The journey aims to emulate and innovate upon global models like ChatGPT (OpenAI) and DeepSeek (China).
AI’s Evolution: Setting the Context
- The term Artificial Intelligence was coined by John McCarthy, then at Dartmouth College, in the 1955 proposal for the 1956 Dartmouth summer workshop.
- Over the decades, AI has witnessed multiple hype cycles, but steady progress began about 15 years ago.
- Key developments:
  - Data warehouses emerged from raw databases.
  - AI moved from descriptive analytics to predictive and prescriptive analytics.
  - Algorithmic decision-making began transforming sectors like customer service, supply chain, weather forecasting, and traffic navigation.
The LLM Revolution: Generative AI Emerges
- In 2017, Google’s paper “Attention Is All You Need” introduced the Transformer architecture, significantly advancing natural language processing and laying the groundwork for modern LLMs.
- The Transformer’s attention mechanism lets the model weigh how relevant every other word in the input is to the word it is currently processing, which catalysed the generative AI revolution.
- While OpenAI’s ChatGPT dominated the scene, China’s DeepSeek showed that leaner, distilled models can deliver strong reasoning capabilities.
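The attention mechanism mentioned above can be illustrated with a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, written in plain Python. This is a simplification for illustration only: the two-dimensional token vectors are made-up numbers, and a real Transformer first passes its inputs through learned projections and uses many attention heads in parallel.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy embeddings for three tokens (illustrative values, not real data).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` shows how strongly one token attends to the others. Here the first token attends more to the third token, which shares a component with it, than to the second, which shares none.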
India’s LLM Mission: Vision and Requirements
- The mission aims to create a foundational AI model that supports:
  - Multiple Indian languages and dialects.
  - Applications across villages, cities, corporations, and individuals.
- India’s LLM must combine:
  - The large-scale computing and learning capacity of systems like ChatGPT, which run on Nvidia hardware.
  - The efficient, reasoning-driven design of models like DeepSeek.
The Roadmap to Building India’s LLM: Six Critical Steps
- Developing the Architecture: At the core of the LLM will be a transformer-based architecture, optimised for natural language processing. It must support a broad spectrum of use cases, from agricultural and weather forecasting to complex city and national administrative functions. The models will have billions of parameters and must provide tailored solutions.
- Data Collection and Curation: India is a data-rich nation. From ancient manuscripts and books to websites and data libraries, the raw material is abundant. However, this data must be carefully curated. Tasks like removing duplication (“de-duping”), eliminating noise, and ensuring the relevance of data will be crucial before feeding it into the LLM.
- Fine-Tuning for Vertical and Horizontal Applications: The model should be capable of handling specific language inputs, translations, contextual text transformation, and outcome optimisation for different domains and geographies. Vertical domains may include healthcare, education, and agriculture, while horizontal functions could span communication, analytics, and decision support.
- Comprehensive Training: Training an LLM is resource-intensive, requiring immense computing power and energy. The training must be thorough enough for the model to predict the next words in a sentence accurately, and the model must be updated regularly so that it absorbs new knowledge and stays current.
- User Community Preparation: The success of any digital system depends on its users. India must build adaptive learning modules that guide users in their preferred language and context. Instructional designers will be key to creating these systems, ensuring that user education evolves in parallel with model development.
- Production Deployment: After rigorous training and testing, the models must be deployed in real-world environments. Early versions must minimise hallucinations or inaccuracies, as user trust is crucial. Only then can the model gain widespread acceptance and use.
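The data curation step in the roadmap above (de-duplication and noise removal) can be sketched as follows. This is a minimal illustration, not the pipeline Sarvam uses: the function names, the sample sentences, and the thresholds (`min_words`, `min_letter_ratio`) are assumptions chosen for the example. It catches only exact duplicates after normalisation; large-scale pipelines usually add near-duplicate detection as well.

```python
import hashlib
import re
import unicodedata

def normalise(text):
    """Canonicalise Unicode, collapse whitespace, and lowercase."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip().lower()

def is_noise(text, min_words=3, min_letter_ratio=0.5):
    """Hypothetical noise rule: too short, or mostly symbols and digits."""
    if len(text.split()) < min_words:
        return True
    # Count letters (L*) and combining marks (M*). Indic vowel signs are
    # marks, so counting letters alone would wrongly reject Hindi text.
    wordlike = sum(unicodedata.category(ch)[0] in "LM" for ch in text)
    return wordlike / len(text) < min_letter_ratio

def curate(documents):
    """Drop noisy documents and exact duplicates, keeping first occurrences."""
    seen = set()
    kept = []
    for doc in documents:
        norm = normalise(doc)
        if is_noise(norm):
            continue
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same content already kept
        seen.add(digest)
        kept.append(doc.strip())
    return kept

# Made-up sample corpus: a duplicate, a junk line, and a Hindi sentence.
raw = [
    "Monsoon rains are expected in Kerala this week.",
    "monsoon   rains are expected in Kerala this week. ",
    "#### 12345 ####",
    "किसानों के लिए नई फसल बीमा योजना शुरू हुई।",
]
cleaned = curate(raw)
```

Hashing the normalised text rather than the raw text is what lets the second sentence be recognised as a copy of the first despite its different casing and spacing.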
Government & Institutional Support
- The Government has committed to providing computational resources.
- Tie-ups with GPU-as-a-service providers are intended to ensure that development quality is not compromised.
- Collaboration with IIT Madras will support Sarvam with deep academic research.
- The initiative is also an opportunity to build a new generation of AI talent and products through youth participation.
The Realistic Vision
- India isn’t aiming to overtake the US or China in AI immediately.
- Instead, the focus is on building competent, Indian-context LLMs to serve national needs.
- The $300-billion Indian IT industry and groups such as NASSCOM (National Association of Software and Service Companies) and iSPIRT (Indian Software Product Industry Roundtable) must be strategically engaged.
- At the same time, the entrepreneurial spirit of young Indians must be nurtured and unleashed.