![David Hershey Profile](https://pbs.twimg.com/profile_images/1602717705429098496/qU6MLyOh_x96.jpg)
David Hershey
@DavidSHershey
Followers: 941 · Following: 4K · Statuses: 208
AI Generalist | Writer of https://t.co/l1jTizWyTv
Seattle, WA
Joined April 2017
RT @AnthropicAI: New Anthropic research paper: Scaling Monosemanticity. The first ever detailed look inside a leading large language model…
RT @AnthropicAI: Introducing a new Team plan for Claude. Get increased usage for team members, easily manage users and billing, and tackle…
RT @swyx: fascinating read on finetuning this am: a finetuned 7B model can beat GPT-4 on Magic the Gathering drafting but more importantl…
@swyx Glad you enjoyed it! This OpenAI bill is the closest I've gotten yet to buying a 4090 for my home 😅
@HamelHusain @hugobowne Spent a lot of time fine-tuning models in the last few weeks, and oh boy did it feel like ML in all of the hard ways - mostly data work, lots of experiments to see what data was effective. Takeaway was it seems to depend on how important you think fine-tuning is going forward.
What an awesome view into why training LLMs requires so much high-quality talent. "This level of perfection is like eight billion people copy[ing] the complete works of Shakespeare for the 14 billion years the universe has existed and not have a single person make a mistake!"
If your loss curves look sus, join the club! Giant LLM training runs are full of pitfalls. We learned the hard way. We wrote a deep dive for the community on silent data corruptions (SDCs). Problem and mitigations here:
RT @jyotibansalsf: Excited to share that @Unusual_VC is opening the next round of Unusual Academy — a hands-on program to equip seed-stage…
RT @MF_FOOM: mf trained a simple model to translate ada-002 embeddings back to text and found something interesting: sentence embeddings h…
@MosaicML Awesome work! Really appreciate breaking out evaluation into more human-compatible categories; I think that's exactly what we need to be able to reason about new models.
RT @MosaicML: How can the ML community measure LLM quality in a holistic and standardized manner? The Mosaic Model Gauntlet encompasses 34…
RT @haroonchoudery: Last week, a paper talking about massive drops in GPT-4's performance went viral. - GPT-4's accuracy dropped from 97.6…
RT @bio_bootloader: Introducing Mentat - an open source, GPT-4 powered coding assistant! Mentat runs in your command line, giving it the c…