CA Topic

Training of Large Language Models (LLMs) by Indian Firms

February 26, 2026

Syllabus: GS3/ Science and Technology

Context

Bengaluru-based startup Sarvam AI unveiled two indigenous Large Language Models (LLMs), underscoring India’s push for sovereign, multilingual, and compute-efficient AI amid global competition.

Large Language Models (LLMs)

A large language model (LLM) is a type of artificial intelligence (AI) algorithm that uses deep learning techniques and very large datasets to understand, summarize, generate, and predict new content.
Deep learning involves the probabilistic analysis of unstructured data, which eventually enables the model to recognize distinctions between pieces of content without human intervention.
This analysis lets the model learn how characters, words, and sentences function together.

Indigenous LLM Ecosystem in India

Sarvam AI Models: The models focus on efficiency, accuracy, and Indian-language capabilities. They are intended to be open-source, though broader public scrutiny is still ongoing.
BharatGen, incubated at IIT Bombay, trained a multilingual 17-billion-parameter model for sectors like education and healthcare.
Gnani.ai launched compact speech and text-to-speech models.

How Are LLMs Trained?

GPU Clusters: LLM training requires massive computational power using clusters of Graphics Processing Units (GPUs). Thousands of GPUs operate simultaneously for weeks or months.
Data as the Core Input: Training relies on enormous datasets, often scraped from the Internet.
Model Parameters: Parameters represent the internal weights through which models learn patterns. Sarvam AI trained models with 35 billion and 105 billion parameters.
- Larger parameter counts generally improve capability but require more computation and memory.
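
As a rough illustration of why larger parameter counts need more computation, the sketch below estimates weight memory and training compute for the two model sizes mentioned above. It assumes dense models stored in 16-bit precision and uses the widely cited rule of thumb of 6 × parameters × training tokens. The 1-trillion-token figure is a hypothetical input, not a number reported for Sarvam AI.

```python
# Back-of-the-envelope estimates; every constant here is an assumption.
BYTES_PER_PARAM_16BIT = 2  # 16-bit (e.g. bf16) weights


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_16BIT) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def training_flops(num_params: float, num_tokens: float) -> float:
    """Approximate training compute via the 6 * N * D rule of thumb for dense models."""
    return 6 * num_params * num_tokens


HYPOTHETICAL_TOKENS = 1e12  # illustrative only, not a reported figure

for n in (35e9, 105e9):
    print(
        f"{n / 1e9:.0f}B parameters: "
        f"{weight_memory_gb(n):.0f} GB of weights, "
        f"{training_flops(n, HYPOTHETICAL_TOKENS):.2e} training FLOPs"
    )
```

Under these assumptions, tripling the parameter count triples both the weight memory and the training compute for the same amount of data, which is why larger models need correspondingly larger GPU clusters.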

Key Training Methodologies Used

Data Curation: It focuses on collecting high-quality datasets in Indian languages.
- Sources include government documents, literature, media, and synthetically generated data.
- Curation is critical for improving performance beyond English-centric AI systems.
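
A minimal sketch of what one curation pass might look like is given below. It assumes Hindi text in the Devanagari script and applies three simple filters: a minimum length, a minimum share of Devanagari characters, and exact-duplicate removal. The function names and thresholds are hypothetical, and real pipelines use far more elaborate quality checks.

```python
import unicodedata


def devanagari_ratio(text: str) -> float:
    """Share of non-whitespace characters in the Devanagari block (U+0900 to U+097F)."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(0x0900 <= ord(ch) <= 0x097F for ch in chars) / len(chars)


def curate(docs, min_chars=20, min_ratio=0.5):
    """Keep documents that are long enough, mostly Devanagari, and not exact duplicates."""
    seen = set()
    kept = []
    for doc in docs:
        # Normalise Unicode form and collapse whitespace so trivial variants match.
        norm = " ".join(unicodedata.normalize("NFC", doc).split())
        if len(norm) < min_chars:
            continue
        if devanagari_ratio(norm) < min_ratio:
            continue
        if norm in seen:
            continue
        seen.add(norm)
        kept.append(norm)
    return kept


sample_docs = [
    "भारत एक विशाल देश है और यहाँ अनेक भाषाएँ बोली जाती हैं।",
    "भारत एक विशाल देश है   और यहाँ अनेक भाषाएँ बोली जाती हैं।",  # whitespace variant
    "This is an English sentence with no Hindi text in it.",
    "नमस्ते",  # too short
]
curated = curate(sample_docs)
print(len(curated), "document(s) kept")
```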
Pre-Training: The models learn general language patterns by predicting the next token in large unlabelled datasets.
- This stage builds foundational reasoning and grammar capabilities.
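
The next-token objective can be illustrated with a toy example. The sketch below is not how an LLM is built (real models use neural networks over billions of tokens); it is a simple count-based bigram model, chosen only to show what "predicting the next token" and the associated loss mean. The corpus and function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Convert the counts for one context token into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}


def avg_next_token_loss(counts, tokens):
    """Average negative log-likelihood of each actual next token (cross-entropy)."""
    losses = []
    for prev, nxt in zip(tokens, tokens[1:]):
        p = next_token_probs(counts, prev)[nxt]
        losses.append(-math.log(p))
    return sum(losses) / len(losses)


corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
print(next_token_probs(model, "the"))      # "cat" is more likely than "mat" after "the"
print(avg_next_token_loss(model, corpus))  # lower values mean better predictions
```

Pre-training an LLM minimises the same kind of loss, but the probabilities come from a neural network whose parameters are adjusted by gradient descent rather than from raw counts.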
Fine-Tuning: Models are adapted for specific tasks using curated datasets.
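- Open-source libraries such as Hugging Face Transformers support instruction tuning, classification, and domain adaptation, while frameworks such as LangChain are used to build applications on top of the resulting models.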
Alignment/RLHF (Reinforcement Learning from Human Feedback): Human raters rank model outputs, and these rankings are used to train the model to be safer, more accurate, and better aligned with human intent, discouraging harmful or biased responses.
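
One common way to turn human rankings into a training signal is a pairwise preference loss for a reward model, sketched below. This is a general illustration of the technique and not a description of any particular firm's pipeline; the reward scores are hypothetical numbers standing in for a reward model's outputs.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the response preferred by the human rater
    receives the higher reward score, and large when the order is wrong.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical reward scores for two responses to the same prompt.
agrees_with_rater = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_rater = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees_with_rater < disagrees_with_rater)
```

Minimising this loss over many ranked pairs yields a reward model, which is then used to steer the language model towards responses that raters prefer.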

Challenges in Training LLMs in India

Limited Indian Language Data: Scarcity of high-quality datasets in Indian languages reduces model performance.
- Many systems rely on translation into English before processing, increasing token usage and latency. Suboptimal native performance affects adoption among non-English users.
High Capital Requirements: Training frontier models demands substantial financial investment. Startups often lack immediate commercial returns to justify such costs.
Infrastructure Constraints: Access to high-end computing facilities remains limited without government support.

IndiaAI Mission

The IndiaAI Mission is the flagship initiative to build a comprehensive, sovereign AI ecosystem for India.
It focuses on developing high-performance computing infrastructure, indigenous foundational models, and safe, ethical AI, under the vision of “Making AI in India and Making AI Work for India”.
India's compute capacity under the mission has reached 38,000 GPUs, providing affordable access to world-class AI resources.
A GPU, or Graphics Processing Unit, is a processor chip built to perform many calculations in parallel, which lets it process images, run AI programs, and handle other highly parallel workloads much faster than a general-purpose processor (CPU).

Source: TH