
Zoho’s Sridhar Vembu backs Sarvam AI roadmap, says AI startup is on great trajectory after new model launch – Technology News

Sarvam AI, a fully indigenous AI company headquartered in Bengaluru, is in the headlines again after releasing two open-source large language models—Sarvam 30B and Sarvam 105B—developed and trained entirely in India.

The company said the models were built from scratch using compute resources from the IndiaAI mission. According to Sarvam AI, the development process covered all stages—pre-training, supervised fine-tuning, and reinforcement learning—using datasets prepared in-house.

Industry leaders believe such initiatives could help the country reduce its dependence on foreign AI technologies and create solutions tailored to Indian users and languages.

Sridhar Vembu highlights the importance of strong foundations

Reacting to the development, Zoho co-founder Sridhar Vembu emphasized that building strong technological foundations is essential for long-term innovation. He said that consistent research and development efforts are necessary, even if they initially seem slow or unglamorous.

According to Vembu, sustained “catch-up R&D” allows companies and researchers to gradually build expertise, eventually leading to new ideas and breakthroughs in the technology ecosystem. His remarks highlight the importance of continuous investment in AI research within India.

Architecture of the two models explained

Sarvam AI’s newly released models are based on the Mixture-of-Experts (MoE) transformer architecture, a design that improves efficiency by activating only a small portion of the model’s parameters for each task. This allows the system to deliver strong performance while using less computational power.
The Sarvam 30B model is designed for efficient deployment and real-time applications. Despite having 30 billion parameters, it activates only a small portion of them when generating responses, reducing computing costs.
The larger Sarvam 105B model is built for more complex tasks such as reasoning and advanced AI operations. It also uses specialized attention mechanisms that help it process longer conversations and detailed queries more efficiently.
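The sparse-activation idea behind Mixture-of-Experts can be sketched in a few lines. The following is a toy illustration of top-k expert routing, not Sarvam's actual architecture; the expert count, dimensions, and routing scheme here are illustrative assumptions.

```python
import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    """Toy Mixture-of-Experts layer: route a token to its top-k experts.

    x        : (d,) token representation
    experts  : list of (d, d) weight matrices, one per expert
    gate_w   : (num_experts, d) router weights
    top_k    : number of experts activated per token
    """
    logits = gate_w @ x                      # router score for each expert
    top = np.argsort(logits)[-top_k:]        # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only the chosen experts run; the rest stay idle, which is why an
    # MoE model can have many parameters but activate only a fraction.
    return sum(w * (experts[i] @ x) for i, w in zip(top, weights))

rng = np.random.default_rng(0)
d, num_experts = 8, 16
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]
gate_w = rng.normal(size=(num_experts, d))
x = rng.normal(size=d)
y = moe_layer(x, experts, gate_w, top_k=2)   # 2 of 16 experts do any work
```

In this sketch, a 16-expert layer touches only 2 expert weight matrices per token, mirroring how a model like Sarvam 30B can keep per-query compute far below its total parameter count.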

A major highlight of the project is its focus on multilingual capabilities. The models were trained using a combination of web data, coding repositories, mathematics datasets, and specialized knowledge sources.

Sarvam AI also dedicated a significant part of its training resources to the ten most widely spoken Indian languages, aiming to improve AI performance for local users and regional applications.

Performance benchmarks and availability

According to the company, the Sarvam 105B model performs strongly on several AI benchmarks, including reasoning and knowledge-based tests. It reportedly achieved a 98.6 score on the Math500 benchmark and 90.6 on MMLU, indicating strong analytical capabilities.

Both models are available through Sarvam’s API and can also be downloaded from platforms like AI Kosh and Hugging Face under the Apache 2.0 open-source license, making them accessible for developers and researchers.

The launch of Sarvam’s models reflects India’s broader effort to build its own AI infrastructure. By creating locally trained models designed for the country’s linguistic diversity and large population, startups like Sarvam are helping India strengthen its position in the global AI landscape.




