Rick Lamers
@RickLamers
Followers: 5K · Following: 3K · Statuses: 2K
👨‍💻 AI Research & Engineering @GroqInc. Occasional angel investor. I publish technical resources about LLMs every week. Opinions are my own.
Join 7,237+ readers
Joined July 2009
"if you can compress data well, you understand its patterns" Low loss on training error + highly compressible hypothesis = strong generalization Large models tend to find compressible hypotheses in their weight space because of the optimization process (gradient descent) and because the training distribution can actually be compressed well. (Language is highly regular, i.e. a low length hypothesis/model exists.)
1 · 0 · 21
RT @GavinSherry: Huge news here in Saudi Arabia - amazing partnership between @GroqInc and Saudi Arabia 🇸🇦
0 · 28 · 0
@TrelisResearch Yeah, if RL on an LLM base is learning how to reason for LLMs, I hope their reasoning gets better over time too, not just that they learn to reason for longer (albeit coherently/productively, which to be fair also isn't trivial).
0 · 0 · 2
RT @mbalunovic: We finally have an answer to the debate over whether LLMs generalize to new math problems or they merely memorized the answ…
0 · 162 · 0
Ah, did find:
> Software Engineering Tasks: Due to the long evaluation times, which impact the efficiency of the RL process, large-scale RL has not been applied extensively in software engineering tasks. As a result, DeepSeek-R1 has not demonstrated a huge improvement over DeepSeek-V3 on software engineering benchmarks. Future versions will address this by implementing rejection sampling on software engineering data or incorporating asynchronous evaluations during the RL process to improve efficiency.
0 · 0 · 0
@TrelisResearch @anaisbetts
> things probably collapse into being one agent
I feel like encapsulation and information/context overload contradict this. How to organize it I don't know, but "putting everything into one thing" feels like it won't scale.
0 · 0 · 1
I like to think this is inspired by the human brain, which is pretty sample-efficient too. Evolution = pre-training. Education = supervised learning. Trial-and-error self-teaching = RL.
Turns out you don't need that much data or compute to get a reasoning model from a high-quality base model. @ylecun was right
0 · 0 · 1
RT @hila_chefer: VideoJAM is our new framework for improved motion generation from @AIatMeta We show that video generators struggle with m…
0 · 192 · 0