Dan Biderman Profile
Dan Biderman

@dan_biderman

Followers: 1K · Following: 7K · Statuses: 1K

AI & neuroscience. Postdoc @Stanford w/ @HazyResearch & @scott_linderman. Prev: computational neuroscience PhD @cu_neurotheory, @DbrxMosaicAI

Palo Alto, CA
Joined March 2017
@dan_biderman
Dan Biderman
9 months
People think LoRA is a magic bullet for LLMs. Is it? Does it deliver the same quality as full finetuning but on consumer GPUs? Though LoRA has the advantage of a lower memory footprint, we find that it often substantially underperforms full finetuning. However, it forgets less of the base model’s capabilities. In this work, we exhaustively explore this trade-off and provide practitioners a clear view of the difference between the methods.
22
104
560
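The trade-off described in the pinned tweet above comes down to which parameters receive gradients. Below is a minimal sketch, not taken from the paper, contrasting the two setups with the Hugging Face transformers and peft libraries; the model name (facebook/opt-125m, standing in for a full-size LLM), rank, alpha, and target modules are illustrative assumptions.

```python
# Illustrative sketch (not from the paper): LoRA vs. full finetuning setup.
# facebook/opt-125m stands in for a full-size LLM; hyperparameters are assumed.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Full finetuning: every weight receives gradients, so optimizer state covers
# the whole model (the memory cost the tweet refers to).
full_ft_trainable = sum(p.numel() for p in base.parameters())

# LoRA: freeze the base model and train small low-rank adapters on the
# attention projections, cutting trainable parameters and optimizer memory.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
lora_model = get_peft_model(base, lora_cfg)

lora_model.print_trainable_parameters()  # small fraction of the total
print(f"full finetuning would train {full_ft_trainable:,} parameters")
```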
@dan_biderman
Dan Biderman
7 hours
RT @docmilanfar: The Kalman Filter was once a core topic in EECS curricula. Given its relevance to ML, RL, Ctrl/Robotics, I'm surprised tha…
0
55
0
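For readers who haven't met it, the filter in the retweet above is just a recursive predict/update loop. A minimal sketch, assuming a scalar constant-state model with toy process noise q and measurement noise r (values are illustrative, not from the thread):

```python
import numpy as np

def kalman_1d(measurements, x0=0.0, p0=1.0, q=1e-3, r=0.1):
    """Scalar Kalman filter for a constant-state model:
    x_t = x_{t-1} + w_t (process noise q), z_t = x_t + v_t (measurement noise r)."""
    x, p = x0, p0            # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + q            # predict: uncertainty grows by process noise
        k = p / (p + r)      # Kalman gain: how much to trust the new measurement
        x = x + k * (z - x)  # update the estimate toward the measurement
        p = (1.0 - k) * p    # updated (reduced) uncertainty
        estimates.append(x)
    return np.array(estimates)

# Noisy observations of a constant true value of 1.0
rng = np.random.default_rng(0)
zs = 1.0 + rng.normal(scale=0.3, size=50)
print(kalman_1d(zs)[-3:])  # estimates settle near 1.0
```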
@dan_biderman
Dan Biderman
9 hours
@allen_ai @soldni Congrats — awesome work, super impressive
1
0
1
@dan_biderman
Dan Biderman
9 hours
RT @allen_ai: We took our most efficient model and made an open-source iOS app📱but why? As phones get faster, more AI will happen on devic…
0
81
0
@dan_biderman
Dan Biderman
1 day
RT @jxmnop: surreal time capsule from what things were like at OpenAI exactly six years ago. this was a really really good bet
0
7
0
@dan_biderman
Dan Biderman
1 day
@neuro_kim @pcastr Really refreshing approach. Congrats!!
0
0
2
@dan_biderman
Dan Biderman
2 days
RT @iScienceLuvr: Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach We study a novel language model architect…
0
177
0
@dan_biderman
Dan Biderman
2 days
@ericzakariasson Feels like the future is here. Congrats on the terrific features.
0
0
0
@dan_biderman
Dan Biderman
4 days
@cojennin @DbrxMosaicAI Your explanation of the sanitary/sewage system was unforgettable.
0
0
1
@dan_biderman
Dan Biderman
5 days
RT @SuryaGanguli: My @TEDAI2024 talk is out! I discuss our work, spanning AI, physics, math & neuroscience, to deve…
0
40
0
@dan_biderman
Dan Biderman
5 days
@SuryaGanguli Enjoyed listening. Loved the last part about open & interdisciplinary science of intelligence
0
0
1
@dan_biderman
Dan Biderman
6 days
@danielhanchen @UnslothAI Will play around with it!!
1
0
1
@dan_biderman
Dan Biderman
6 days
@ishapuri101 @variational_i @xukai92 @GX_NLP @shivsr98 congrats, very interesting!
0
0
0
@dan_biderman
Dan Biderman
9 days
@jxmnop Agreed 100%
0
0
1
@dan_biderman
Dan Biderman
9 days
RT @_jasonwei: Very excited to finally share OpenAI's "deep research" model, which achieves twice the score of o3-mini on Humanity's Last E…
0
192
0
@dan_biderman
Dan Biderman
9 days
Cool take!
@NaveenGRao
Naveen Rao
9 days
Prediction: all closed AI model providers will stop selling APIs in the next 2-3 years. Only open models will be available via APIs. Why? For an open model service, the value prop is clear: it's hard to build a scalable service to access the model, and the model itself is a commodity. The race to the bottom has already happened with the commodity (the model). Let AI app builders iterate on great UIs for apps built on scalable services with commodity capabilities. Closed model providers are trying to build non-commodity capabilities, and they need great UIs to deliver those. It's not just a model anymore, but an app with a UI for a purpose. If closed models are available via API, all it does is create competition for the app the closed provider is building. The secret sauce is capabilities + UI.
0
0
1
@dan_biderman
Dan Biderman
9 days
RT @NaveenGRao: Prediction: all closed AI model providers will stop selling APIs in the next 2-3 years. Only open models will be available…
0
50
0
@dan_biderman
Dan Biderman
9 days
@NaveenGRao Really cool take
0
0
1
@dan_biderman
Dan Biderman
9 days
@DimitrisPapail that's why i'm always waiting as a person
0
0
3
@dan_biderman
Dan Biderman
9 days
@abeirami Very interesting! Congrats!
1
0
1
@dan_biderman
Dan Biderman
9 days
RT @abeirami: 𝐛𝐞𝐬𝐭-𝐨𝐟-𝐧 is a strong baseline for
- improving agents
- scaling inference-time compute
- preference alignment
- jailbreakin…
0
48
0
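The retweet above lists uses of best-of-n sampling without spelling it out; the idea is simply to draw n candidates and keep the one a scorer likes best. A minimal sketch with hypothetical generate and score callables (nothing here comes from @abeirami's thread):

```python
# Illustrative best-of-n sketch: sample n candidates from any generator and keep
# the one a scorer (e.g. a reward model or verifier) ranks highest.
# `generate` and `score` are hypothetical stand-ins, not a specific library API.
import random
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],      # samples one candidate response
    score: Callable[[str, str], float],  # scores (prompt, response) pairs
    n: int = 8,
) -> str:
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

# Toy usage: a random generator and a scorer that happens to prefer longer answers.
def toy_generate(prompt: str) -> str:
    return prompt + " " + " ".join(random.choices(["a", "b", "c"], k=random.randint(1, 5)))

def toy_score(prompt: str, response: str) -> float:
    return float(len(response))

print(best_of_n("question:", toy_generate, toy_score, n=4))
```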