lovish

@louvishh

Followers
633
Following
693
Media
33
Statuses
180

phding @ucl and @aiatmeta (llama team). mostly random tweets here.

london
Joined July 2021
Pinned Tweet
@louvishh
lovish
1 month
405b is out! working on llama 3 has been a truly rewarding experience and i'm super grateful to all my teammates! i'm excited to see how the llama models will be used by the community! p.s. - we wrote a paper and not just a tech report 😛
Tweet media one
12
19
128
@louvishh
lovish
2 years
exactly 365 days ago, something snapped in my mind, and i finally bought a gym membership. happy then, happier now 🙂
Tweet media one
43
32
1K
@louvishh
lovish
2 months
🚨 New Paper 🚨 Evaluations can have a lot of variance, throwing off model comparisons especially during pre-training. In our latest work, “Quantifying Variance in Evaluation Benchmarks”, we explore this phenomenon in depth. A thread [1/n]
Tweet media one
3
23
152
@louvishh
lovish
11 months
life update ✨: i’ve moved to london and started my phd at @ucl_nlp and @AIatMeta ! looking forward to collaborating with folks in research and making new friends. if you’re around in the uk and would like to say hi, please feel free to reach out!
Tweet media one
Tweet media two
5
1
67
@louvishh
lovish
2 years
celebrating iclr acceptance with a swim
Tweet media one
Tweet media two
17
2
58
@louvishh
lovish
2 years
just remove sugar, cook in olive oil, bake/grill instead of fry. and of course, you can have cheat days too. and most importantly, don’t worry if there are off days. i had quite a few relapses where i was eating anything i wanted to. just pick yourself up and start again 😁
5
0
52
@louvishh
lovish
2 years
another one of my schoolmates is getting married, and here i am, still contemplating if i should apply for a phd.
5
1
35
@louvishh
lovish
2 years
and diet is also a very important part of this process. i tried a bunch of things - keto/intermittent/calorie deficit etc. i found that you don’t have to follow anything extreme and calorie deficit diets are just as good, and in fact more sustainable for long term.
1
0
36
@louvishh
lovish
2 years
since a lot of you’ve been asking, for the workouts, i focused on both strength training and cardio. cardio is important for burning calories and weights for building muscle. my average workout routine is 30 mins cardio + 35-45 mins strength training for 5/6 days a week.
1
0
34
@louvishh
lovish
2 years
working a full time job and still being handed 500 rupaye ka note forcefully by relatives whenever you meet them has to be the most desi thing ever ffs.
4
2
32
@louvishh
lovish
29 days
@giffmana we use our internal evals repository to run all the evals. we did release the inputs/outputs/metrics using our repo for most evaluation tasks (including mmlu) here:
1
1
28
@louvishh
lovish
1 year
in kigali, rwanda for #ICLR2023 ! hit me up if you would like to talk about nlp, large language models, and optimization! also stop by our poster:
Tweet media one
0
0
22
@louvishh
lovish
1 year
can never do this in blr. 🚴🏼🚴🏼
Tweet media one
0
0
22
@louvishh
lovish
2 years
used to blast loud music in my room during wfh. have to wear earphones in the office like some civilised guy now.
2
0
21
@louvishh
lovish
1 year
after a lot of socializing, information overload, and llm discussions at iclr, it’s time for a solo trip in cape town!
Tweet media one
Tweet media two
Tweet media three
2
0
21
@louvishh
lovish
1 year
officially a xoogler now!
Tweet media one
Tweet media two
0
0
21
@louvishh
lovish
1 month
@_xjdr @AIatMeta ngl, the paper writing was too much fun!
1
0
18
@louvishh
lovish
4 months
fixed the fixed fix for llama3
Tweet media one
@armandjoulin
Armand Joulin
4 months
Fixed the fix.
Tweet media one
6
9
115
0
3
17
@louvishh
lovish
6 months
just got my bike stolen. i guess i am a true londoner now 😭
2
1
16
@louvishh
lovish
2 months
in the bay area this week, who should I meet?
Tweet media one
2
0
15
@louvishh
lovish
2 years
saw an indian couple in barcelona eating rotis with fork and knife. i’ve seen everything now.
1
0
14
@louvishh
lovish
4 months
a preview of things to come from all things llama 😎 glad to be working with such an amazing team!
@ml_perception
Mike Lewis
4 months
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come...
18
97
507
1
0
14
@louvishh
lovish
3 months
neurips deadline over, time for some yolo travel and pre-training runs 🙃
1
0
14
@louvishh
lovish
2 years
deleted instagram a month ago, and now i’ve an urge to shitpost/sadpost on the bird app. the thought of senior folks at work seeing those posts is holding me back 😬😂
2
1
12
@louvishh
lovish
1 year
i should save all the random papers, code, and memes in my bookmarks before this site blows up.
0
1
10
@louvishh
lovish
1 month
here's the paper link:
1
0
10
@louvishh
lovish
1 year
the funny thing is that sequoia india is going to give $10 million to ten teams of three engineers each, instead of just giving $100 million to a single company to train foundation models.
1
0
10
@louvishh
lovish
2 years
perfect sunday morning in blr
Tweet media one
Tweet media two
Tweet media three
1
0
10
@louvishh
lovish
2 months
MMLU performance is at a chance level even after training for 210B tokens for the standard formulation (the model is presented with all the choices and asked to predict the most relevant choice). But MMLU-Cloze gives a better signal during the early stages of the training. [5/n]
Tweet media one
Tweet media two
1
1
9
@louvishh
lovish
1 year
the amount of weird looks you get when you ask for a table for one in a restaurant is crazyyy. bro why you judging me, i’m just enjoying my solo trip.
2
0
9
@louvishh
lovish
1 month
@NamanGoyal21 if we extrapolate, looks like it's gonna be 131k gpus 💀
0
0
9
@louvishh
lovish
2 years
palm, say-can, and dall-e 2. going through the ai research updates this week feels like …
Tweet media one
1
0
9
@louvishh
lovish
2 years
anxiety hits on a different level when you see a senior author’s cursor in the paper section you are writing on overleaf.
0
0
8
@louvishh
lovish
2 years
my family is having bread pakoras with chai for their sunday breakfast and i’ve to be content with my egg white smoothie and oats!? .__. this healthy shit is hard fr 😭
3
0
8
@louvishh
lovish
1 month
what size 👀
@polynoamial
Noam Brown
1 month
GPT-4o mini is out! It's best in class for its size, especially at reasoning.
Tweet media one
16
24
245
0
0
8
@louvishh
lovish
2 years
punjabis my age have only seen either badal or captain as the cm. it’s so refreshing to see bhagwant mann this time around. he used to do comedy in his past life and regularly visited my school in sangrur during annual events and performed standup comedy! fun times 😂
2
0
7
@louvishh
lovish
2 years
@ImZackAdams @OtherGu83695592 +1 to that. just focus on yourself and have fun.
1
0
6
@louvishh
lovish
2 months
We prune samples with low item discrimination and while we find modest improvements in both standard error (a decrease) and monotonicity (an increase), the drift in the estimated accuracy is mildly concerning. [7/n]
Tweet media one
1
0
7
@louvishh
lovish
2 months
We track various metrics - seed mean/seed variance/95% CI/monotonicity for the 7B seed runs on both discrete and continuous metrics. We find that tracking continuous metrics is important as they have higher monotonicity and give higher signal compared to discrete metrics. [4/n]
Tweet media one
Tweet media two
1
0
7
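A minimal sketch of the per-seed metrics named in [4/n], assuming one benchmark score per seed at a given checkpoint and one score per checkpoint for a single run. The function names are hypothetical. The 95% CI here uses a normal approximation around the seed mean, and monotonicity is computed as Kendall's tau between checkpoint order and score; both choices are assumptions for illustration, since the tweet does not spell out the exact formulas.

```python
import math
import statistics


def seed_summary(scores):
    """Mean, sample std, and a normal-approximation 95% CI across seeds.

    `scores` holds one benchmark score per random seed (at least two).
    The CI formula is an assumption, not taken from the thread.
    """
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    half_width = 1.96 * std / math.sqrt(len(scores))
    return mean, std, (mean - half_width, mean + half_width)


def monotonicity(scores_over_time):
    """Kendall's tau-a between checkpoint order and score.

    1.0 means the score rises at every later checkpoint, -1.0 means it
    always falls. Using Kendall's tau here is an assumption.
    """
    n = len(scores_over_time)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = scores_over_time[j] - scores_over_time[i]
            if diff > 0:
                concordant += 1
            elif diff < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A noisy discrete metric that bounces between checkpoints yields a tau well below 1, while a smoothly improving continuous metric stays close to 1.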
@louvishh
lovish
4 months
@deliprao @AIatMeta i would say the "doesn't challenge the frontier" is not entirely correct. yes, we don't release the 400B+ model for now but it's already on-par with opus/gpt-4 while it's still under training.
1
0
7
@louvishh
lovish
9 months
no offense to gemini, but what's this hokum with chain of thought MMLU, just report the 5-shot numbers lol
2
1
7
@louvishh
lovish
5 months
was planning to go for therapy today but the new mixtral is not gonna benchmark itself 🙃
@MistralAI
Mistral AI
5 months
magnet:?xt=urn:btih:9238b09245d0d8cd915be09927769d5f7584c1c9&dn=mixtral-8x22b&tr=udp%3A%2F%%3A1337%2Fannounce&tr=http%3A%2F%%3A1337%2Fannounce
274
826
6K
0
0
7
@louvishh
lovish
1 year
when is this madness gonna end?
Tweet media one
1
0
6
@louvishh
lovish
2 months
We try to reduce variance by taking inspiration from item analysis, where we define item difficulty (average score across models) and item discrimination (correlation b/w models’ score on a given point and models’ overall score) for each sample in the benchmark. [6/n]
1
0
6
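The two item-analysis quantities defined in [6/n] can be sketched as follows, assuming a score matrix with one row per model and one column per benchmark sample. The function names, the use of Pearson correlation, and the convention of returning 0.0 when a sample has no variance across models are illustrative assumptions, not details confirmed by the thread.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant (assumed convention)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def item_analysis(scores):
    """Compute per-sample difficulty and discrimination.

    scores[m][j] is model m's score on sample j (e.g. 0 or 1).
    Difficulty: average score on the sample across models.
    Discrimination: correlation between the models' scores on the
    sample and the models' overall benchmark scores.
    """
    n_models = len(scores)
    n_items = len(scores[0])
    overall = [sum(row) / n_items for row in scores]
    difficulty, discrimination = [], []
    for j in range(n_items):
        column = [scores[m][j] for m in range(n_models)]
        difficulty.append(sum(column) / n_models)
        discrimination.append(pearson(column, overall))
    return difficulty, discrimination
```

Samples whose discrimination is near zero or negative are the ones that say little about which model is better overall, which is the motivation for pruning them.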
@louvishh
lovish
2 months
Moreover, using tinyBenchmarks as a cheap evaluation measure during early stages of pre-training does not give an informative signal on 3/3 of the benchmarks we examined due to increased variance. [9/n]
Tweet media one
1
0
6
@louvishh
lovish
1 year
quoting from : "we offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence." ... someday, i hope to have read all of noam shazeer's research.
1
1
5
@louvishh
lovish
2 months
We also analyse item response theory, as used by Polo et al. 2024 (tinyBenchmarks). We find that simply using the performance on the 100 samples selected by tinyBenchmarks can lead to large deviations in the mean (7% for ARC-C) and high variance. [8/n]
Tweet media one
Tweet media two
1
0
5
@louvishh
lovish
3 years
i always use the same first word for every wordle. i guess this was bound to happen someday 😂🤷🏻‍♂️ wordle 206 1/6 🟩🟩🟩🟩🟩
2
0
5
@louvishh
lovish
2 years
dealing with uber drivers in blr
Tweet media one
0
0
5
@louvishh
lovish
2 years
just because ap dhillon dropped a new song doesn’t mean you have to put it in every one of your instagram stories.
1
0
4
@louvishh
lovish
2 months
Benchmark datasets are used for establishing progress with frontier AI models. Any major model release is accompanied by a slew of scores on these benchmarks. Yet, despite their importance, benchmark scores are often regarded as a one-dimensional number. [2/n]
Tweet media one
1
0
5
@louvishh
lovish
2 months
We present a deep dive into variance in benchmark scores across 13 popular benchmarks using over 280 models, including fully trained public models as well as a set of 7B models that we trained from scratch, differing only in their initialisation random seed. [3/n]
Tweet media one
1
0
4
@louvishh
lovish
9 months
@srush_nlp @sharakelyan probably because the same shot performance did not match gpt4. also the mmlu reporting is interesting (appendix 9.1): 5-shot mmlu did not match gpt4, and i guess that's why they developed the chain of thought based evaluation.
0
0
1
@louvishh
lovish
1 year
day 1 of iclr and got this notification
Tweet media one
0
0
4
@louvishh
lovish
1 year
p.s. - i have met some incredible startup folks but none of them are working in ai. the current ai startup scene is dominated by people who all migrated from crypto/web3. complete chaos.
1
0
3
@louvishh
lovish
2 years
’twas a good weekend ✨
Tweet media one
Tweet media two
1
0
4
@louvishh
lovish
4 months
@Teknium1 sonnet is dead 🙂
0
0
4
@louvishh
lovish
1 month
@billyuchenlin @togethercompute mmlu redux seems low, can you inspect/share the inputs/raw model outputs?
2
0
4
@louvishh
lovish
4 months
the uk doesn't know how to name its holidays. what the hell is a bank holiday?!
Tweet media one
0
0
4
@louvishh
lovish
1 month
@markchen90 patience mark, patience 😛
0
0
4
@louvishh
lovish
2 years
@Pankajstocks still have those for loose/baggy outfits 😛
1
0
4
@louvishh
lovish
1 year
the only silver lining about living in bangalore is the good people living in this pathetic city. everything else, from traffic to water to infra, is abysmal.
@rasagy
Rasagy Sharma
1 year
No picnics, no photography, no playing sports in Cubbon Park? Bangaloreans are being policed in our most beloved green space. Join the campaign on @Jhatkaadotorg to urge the Horticulture Department to roll back these bizarre rules! RT & sign please! 🙇
19
163
395
0
0
3
@louvishh
lovish
1 year
sorry, i guess it's peakxvpartners now lol
1
0
3
@louvishh
lovish
9 months
aditya’s the best out there - great researcher, and the most helpful and kind person!!
@adityakusupati
Aditya Kusupati
9 months
📢📢At the last minute, I decided to go on the job market this year!!! Grateful for RTs & promotion at your univ.😇 CV & Statements: Will be at #NeurIPS2023 ! presenting AdANNS, Priming, Objaverse & MADLAD. DM if you are around, would love to catch up👋
2
49
181
0
0
3
@louvishh
lovish
27 days
@khoomeik openreview is down 🥲
1
0
3
@louvishh
lovish
2 years
the cherry on the top is the badal family and the captain losing their respective seats. total decimation.
0
0
3
@louvishh
lovish
3 months
Tweet media one
@leopoldasch
Leopold Aschenbrenner
3 months
Virtually nobody is pricing in what's coming in AI. I wrote an essay series on the AGI strategic picture: from the trendlines in deep learning and counting the OOMs, to the international situation and The Project. SITUATIONAL AWARENESS: The Decade Ahead
Tweet media one
Tweet media two
264
877
4K
0
0
3
@louvishh
lovish
2 years
i’m not crying. you are 🥺😭
@rogerfederer
Roger Federer
2 years
Tomorrow night. My last match. Doubles with @RafaelNadal 💪🏽❤️
2K
26K
217K
0
0
3
@louvishh
lovish
4 months
@osanseviero thanks for all your amazing work Omar! have a safe flight!
0
0
3
@louvishh
lovish
1 month
@robdadashi we use the mmlu number from the gemma report because we were getting a lower number using our internal evals. gemma was not following the instructions properly and was using a lot of ** ** to enclose text. and this is 5-shot mmlu, 0 shot was even lower.
1
0
3
@louvishh
lovish
2 years
craving a bhatura with some chhole rn.
0
0
3
@louvishh
lovish
1 year
@nsaphra @andrewgwils folks probably don’t know this but conference organizers have to submit a list of every single attendee to the ministry of external affairs for approval, which is a big pain. colt organizers in bangalore had to go through this.
0
0
3
@louvishh
lovish
2 years
there’s nothing worse than seeing someone with the same name as you doing idiotic stuff online.
0
0
3
@louvishh
lovish
1 year
looking to transfer volt gym membership in indiranagar. dm me if you want it.
2
0
2
@louvishh
lovish
2 years
@kritipraks very aesthetic ✨ want to borrow some of those books too 🤭
1
0
2
@louvishh
lovish
4 months
@shengs1123 didn't know you were on twitter lol
0
0
2
@louvishh
lovish
2 months
@LChoshen One obvious difference is reliability is looking at rankings of different "trained-out" models, while we compute the variance across seeds during pre-training of the same model, where there’s no obvious way to establish a ranking.
1
0
2
@louvishh
lovish
3 years
@sansiddh wow … did not expect my friday to start like this.
0
0
2
@louvishh
lovish
1 month
@jxmnop not everyone at meta uses slurm 😛
0
0
2
@louvishh
lovish
2 years
@move_4_7 here you go:
@louvishh
lovish
2 years
since a lot of you’ve been asking, for the workouts, i focused on both strength training and cardio. cardio is important for burning calories and weights for building muscle. my average workout routine is 30 mins cardio + 35-45 mins strength training for 5/6 days a week.
1
0
34
0
0
2
@louvishh
lovish
2 years
@Mizzling_Gaze the first 1/2 months are difficult, but wfh helped a lot in following a consistent routine during that period. and when i started seeing the progress after these initial couple of months, it motivated me to keep going.
1
0
2
@louvishh
lovish
1 month
@eugeneyan not just llama 2, we use synthetic data from intermediate and expert llama 3 models as detailed here:
Tweet media one
0
0
2
@louvishh
lovish
1 year
also, it's quite funny seeing all the rage against sam. i've lived in blr for over a year now, and i have yet to meet someone truly passionate about solving some problem. so, there's a mindset problem, and folks in startups need to worry less about retiring with lots of money.
1
0
2
@louvishh
lovish
29 days
@giffmana ahh sorry about that, out of my hands 🙈
1
0
2
@louvishh
lovish
2 years
best twitter bot 🤌🏻
Tweet media one
@PayGapApp
Gender Pay Gap Bot
2 years
In this organisation, women's median hourly pay is 45.1% lower than men's.
26
430
2K
1
0
2
@louvishh
lovish
1 year
@Eepsita my flatmates and i are moving out of a 3bhk in indiranagar (7th main near 100 ft road). let me know if you’d be interested in that. our landlord is quite sweet too!
2
0
2
@louvishh
lovish
2 years
0
0
2
@louvishh
lovish
5 months
@abhi_venigalla @BlancheMinerva can confirm this. the mixtral paper reports 59.7 for arc-c (which is using the normal eval setup) and 85.8 using 25-shot mmlu-style prompt. mistral-7b is 54.3 and 78.5 respectively.
0
0
1
@louvishh
lovish
3 months
@RylanSchaeffer sorry to hear that :/
0
0
1
@louvishh
lovish
2 years
@gudda1997 woohoo 🙌🏻
1
0
1
@louvishh
lovish
27 days
@adityakusupati @uwcse @GoogleDeepMind woohoo 🎉 congratulations 🥂
1
0
1
@louvishh
lovish
2 years
@sansiddh thank you!!! 😁
0
0
1
@louvishh
lovish
3 years
@jashansuri Horner right now
1
0
1
@louvishh
lovish
2 years
@kritipraks hahaha … should’ve written 100 .. the median amount 😂
0
0
1
@louvishh
lovish
2 years
@rishubhsingh135 @vikataravi @CVPR woohoo 🎉 congrats rishubh!!
0
0
1
@louvishh
lovish
3 years
@jashansuri uber also has a rental service.
0
0
1
@louvishh
lovish
3 years
@sansiddh lewis would've lost track position if they did that 🙁 ... this was truly a colossal fuck up by masi.
1
0
1
@louvishh
lovish
2 years
@elonmusk
Elon Musk
2 years
I hereby challenge Владимир Путин to single combat Stakes are Україна
35K
48K
362K
1
0
1