Harveen Singh Chadha @HarveenChadha profile

Harveen Singh Chadha

@HarveenChadha

Followers

3K

Following

4K

Statuses

2K

Joined July 2019


Harveen Singh Chadha

@HarveenChadha

4 years

Open Source Alert: Very excited to announce that we are open-sourcing Vakyansh, a speech recognition framework to democratize speech recognition in Indic languages. Some key features: 1. End-to-end training and experimentation platform built on top of @facebookai Wav2Vec 2.0.

20

148

536

Harveen Singh Chadha

@HarveenChadha

3 days

India needs to set up at least 5 Tier-1 and 10 Tier-2 labs by the end of this year if we are genuinely serious about AI.

10

6

166

Harveen Singh Chadha

@HarveenChadha

3 days

Last month, I was trying to build a parser using Azure ADI, as GPT-4o struggled with tables and low-resource scripts. I then tried passing bounding box regions from ADI as input to GPT-4o, and the results improved drastically. Some days later, I gave a random finance report to NotebookLM and was very surprised by the parsing accuracy. I immediately tried gemini-2.0-Flash-Exp, and after some prompt modifications, I was able to get near-perfect outputs. I'm not really surprised to see these results; in fact, Flash Thinking would be much better IMO. Google is really stepping up its game, not just on quality but on affordability as well.

0

9

Harveen Singh Chadha

@HarveenChadha

4 days

RT @danielhanchen: We managed to fit Llama 3.1 8B < 15GB with GRPO! Experience the R1 "aha moment" for free on Colab! Phi-4 14B also works…

0

288

0

Harveen Singh Chadha

@HarveenChadha

4 days

Copying code is something every dev does. There is nothing wrong with it; in fact, open source promotes reusability (with proper attribution). However, ethical considerations arise when you build a business around it. The inference code provided by AI4Bharat is under the MIT license. The code in the Krutrim Translate repo is copied from it and modified to run with your model configuration, and then you top it off with the Krutrim Community License without even forking the repo. Is it legal? Maybe yes (I don’t have much clarity). Is it ethical? I leave it up to you to decide. I wish you the best for your future projects, we all want you to win!

7

12

271

Harveen Singh Chadha

@HarveenChadha

5 days

If you think this is a one-off case: I just checked the commit history of the Chitrarth repo. It is copied from haotian-liu/LLaVA, and again the license has been changed very conveniently.

5

17

362

Harveen Singh Chadha

@HarveenChadha

5 days

Trying Krutrim Translate today. Seeing the inference scripts and usage instructions, I am sure that this work was done by very uninterested devs. I mean, you literally just need to copy-paste from the indictrans2 repo 🤦‍♂️ The model works well though!

0

22

Harveen Singh Chadha

@HarveenChadha

6 days

Just tried it and 🤯

Amjad Masad

@amasad

6 days

Whatever you need… make an app for that. Now on your phone. For everyone. Free.

0

3

Harveen Singh Chadha

@HarveenChadha

6 days

Raising 1000 crores is not a joke with this portfolio of models. Just for comparison, Sarvam’s valuation is 966 crores. It’s a big day for the Indian AI ecosystem. I wonder how much AI4Bharat, with a similar portfolio, could raise if it went out to raise.

0

13

Harveen Singh Chadha

@HarveenChadha

6 days

@seyarkayarivu Please read “Feb 2024 released the model”. Released where? And what is an internal release? By this logic, OpenAI o3 was released before DeepSeek R1.

2

0

4

Harveen Singh Chadha

@HarveenChadha

6 days

All models released by Krutrim today for the "open source" community come with the Krutrim Community License. It is comparable to the license Llama uses, but while Llama 3.2 does not require a separate license if your MAU is below 700M, Krutrim's limit is only 1M.

2

3

19

Harveen Singh Chadha

@HarveenChadha

6 days

Even though Krutrim's vocabulary is almost double the size of Sarvam's, its fertility score and average token count in Hindi are still higher than Sarvam's. Vocabulary size: Sarvam 64128, Krutrim 131072. Average token count: Sarvam 33.98, Krutrim 44.90. Fertility score: Sarvam 1.61, Krutrim 2.13. Methodology: running each tokenizer on 100k random Hindi sentences.

1

0

12
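The tokenizer comparison above can be reproduced in outline with a short script. This is a minimal sketch, assuming fertility means tokens per whitespace-separated word, which is consistent with the reported figures (33.98 / 1.61 and 44.90 / 2.13 both give roughly 21 words per sentence). The names `tokenizer_stats` and `chunk_tokenize` are hypothetical, and the toy tokenizer only stands in for a real one such as the Sarvam or Krutrim tokenizers.

```python
from typing import Callable, Dict, Iterable, List


def tokenizer_stats(
    sentences: Iterable[str],
    tokenize: Callable[[str], List[str]],
) -> Dict[str, float]:
    """Return average tokens per sentence and fertility (tokens per word).

    Assumption: words are whitespace-separated, so fertility is
    total tokens divided by total whitespace-separated words.
    """
    total_tokens = 0
    total_words = 0
    num_sentences = 0
    for sentence in sentences:
        total_tokens += len(tokenize(sentence))
        total_words += len(sentence.split())
        num_sentences += 1
    if num_sentences == 0 or total_words == 0:
        raise ValueError("need at least one non-empty sentence")
    return {
        "avg_token_count": total_tokens / num_sentences,
        "fertility": total_tokens / total_words,
    }


def chunk_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: split each word into 2-character chunks."""
    tokens: List[str] = []
    for word in text.split():
        tokens.extend(word[i:i + 2] for i in range(0, len(word), 2))
    return tokens


if __name__ == "__main__":
    sample = ["abcd ef", "abc"]
    print(tokenizer_stats(sample, chunk_tokenize))
```

To compare two real tokenizers, pass each one's tokenize function over the same sentence sample. A lower fertility means fewer tokens are needed per word, regardless of vocabulary size.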

Harveen Singh Chadha

@HarveenChadha

6 days

Wow, Krutrim just raised $230M

Bhavish Aggarwal

@bhash

6 days

Announcing the @Krutrim AI lab today! While we’ve been working on AI for a year, today we’re releasing our work to the open source community and also publishing a bunch of technical reports. Our focus is on developing AI for India - to make AI better on Indian languages, data scarcity, cultural context etc.
Here’s a list of models we’re releasing:
- Krutrim 2 and Krutrim 1 LLMs: While Krutrim 1 (India’s first LLM) was launched in Jan 24, it was a basic 7B model. We’re launching Krutrim 2 today as a much improved model. More here:
- Chitrarth 1: India’s first Vision Language Model built on top of Krutrim 1 capable of understanding images and documents. More here:
- Dhwani 1: India’s first Speech Language Model built on top of Krutrim 1 capable of tasks like Speech translations. More here:
- Vyakhyarth 1: State of the art Indic Embedding model for use cases like Search and RAG. More here:
- Krutrim Translate 1: State of the art text to text translation. More here:
In addition, since there was no global benchmark for Indic performance, we’ve developed “BharatBench” and the technical report is here:
We’ve also published a bunch of technical reports and papers here:
Also announcing India’s first GB200 deployment in partnership with NVIDIA! Will be live by March and we will make it the largest supercomputer in India by end of year.
We’re nowhere close to global benchmarks yet but have made good progress in 1 year. And by open sourcing our models, we hope the entire Indian AI community collaborates to create a world class Indian AI ecosystem. We’re still learning to walk before we can run, hopefully within this year!
All our open source work here: Web: GitHub: Huggingface:
Also, announcing an investment of ₹2,000 Cr today into Krutrim and a commitment of ₹10,000 Cr by next year!

2

0

14

Harveen Singh Chadha

@HarveenChadha

8 days

There is nothing like “open-source Chinese AI”. It's like saying I am using open-source French AI (transformers) to infer on my model. Open source is just open source.

Arjun*

@mxtaverse

8 days

hosting open-source Chinese AI on Indian servers and selling it at a price to nationalist crowd... ...sounds like the tech equivalent of importing from China, assembling here and slapping a Made In India sticker on it to get PLI benefits

2

3

37