BharatGen bets on small models to hit big AI goals
Government-backed AI initiative BharatGen will develop a suite of generic and domain-specific small language models (SLMs) with sectoral applications, to build a sovereign resource base for Indian firms and unlock enterprise value, its executive vice president Rishi Bal told ET.

An academic consortium based at IIT Bombay, BharatGen last week secured the largest government support under the India AI Mission, Rs 988.6 crore. Seven other entities were also selected under the mission to receive incentives for building foundational AI models. The Centre is banking on BharatGen to create foundational large language models (LLMs) and multimodal models, and to make a mark globally as India's sovereign AI. But smaller models also remain a key goal.

"We will work with Nasscom and push for a sovereign AI stack that companies looking for domain-specific, effective small language models can access. We have already created domain-specific SLMs, and will build on that," Ganesh Ramakrishnan, principal investigator at BharatGen, said.

However, the road to developing SLMs runs through BharatGen's planned LLM of up to one trillion parameters, one of the key deliverables under the latest funding. Parameters are the numerical values a model learns from its training data; together, they determine the output the model produces for a given input.
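To make the notion of a parameter count concrete, the sketch below tallies the weights and biases of a toy feed-forward network. The layer sizes are purely illustrative, not BharatGen's architecture; production LLMs reach billions or trillions of parameters by stacking many much larger layers.

```python
# Count the learnable parameters of a toy feed-forward network.
# Layer sizes are hypothetical, for illustration only.

def count_params(layer_sizes):
    """Each dense layer mapping n_in inputs to n_out outputs has
    n_in * n_out weights plus n_out biases; the model's parameter
    count is the sum over all layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# A tiny 3-layer network: 512 -> 2048 -> 2048 -> 512
print(f"{count_params([512, 2048, 2048, 512]):,} parameters")
```

Even this small example lands in the millions, which gives a sense of how quickly widening and deepening the network drives the count toward the trillion-parameter scale discussed above.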

"The ability to get up to that size (one trillion) will help in creating better, smaller, and more distilled models. We have already created SLMs for agriculture, ayurveda, legal, and finance sectors. A range of SLMs with different capabilities will be a key part of unlocking our enterprise potential," Bal explained.

"There is a trajectory of small to large models, and vice versa. Some intermediate models will be better achieved when distilled from large models," Ramakrishnan said.
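Distillation, as Ramakrishnan describes it, trains a small "student" model to match the softened output distribution of a large "teacher". A minimal sketch of the soft-target loss is below; the logits and temperature are hypothetical, and this is not BharatGen's training code.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature
    flattens the distribution, exposing relative preferences."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened distribution and
    the student's: minimising it pushes the student to reproduce the
    teacher's judgments about relative likelihoods."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical outputs give zero loss; disagreement raises the KL term.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
print(distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)
```

In practice, this soft-target term is combined with an ordinary supervised loss on the ground-truth labels, so the student learns from both the data and the larger model.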

BharatGen is India's first indigenously developed multimodal LLM project for Indian languages, and is supported by the Department of Science & Technology under the National Mission on Interdisciplinary Cyber-Physical Systems. Multimodal models are designed to process, integrate, and analyse multiple 'modalities' of data simultaneously, such as text, images, audio, and video. The initiative is developing inclusive and efficient AI across 22 Indian languages.

BharatGen's under-development LLMs will build on Param-1, a bilingual LLM with 2.9 billion parameters that it launched in May. Param-1 was pretrained on five trillion tokens of high-quality data from diverse Indian domains, in English and Hindi. A token is the basic building block of an LLM: the smallest unit of data the model processes, typically a word or word fragment in natural language processing (NLP) and generative AI.
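To illustrate what counting tokens involves, the sketch below uses a naive word-and-punctuation splitter. Real LLMs, including multilingual ones, instead use learned subword vocabularies (such as byte-pair encoding) so that rare or inflected words split into smaller reusable pieces; this simplified version only conveys the idea of a text being measured in discrete units.

```python
import re

def tokenize(text):
    """Naive tokeniser: split into runs of word characters and
    individual punctuation marks. Illustrative only; production
    models use learned subword vocabularies."""
    return re.findall(r"\w+|[^\w\s]", text)

sample = "BharatGen trains on English and Hindi text."
tokens = tokenize(sample)
print(tokens)
print(f"{len(tokens)} tokens")
```

A pretraining corpus of "five trillion tokens" means the model saw roughly that many such units of text during training, which is the standard way dataset scale is reported for LLMs.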