Charles Foster

@CFGeek

Followers: 3K · Following: 17K · Media: 479 · Statuses: 5K

Excels at reasoning & tool use🪄 Tensor-enjoyer 🧪 @METR_Evals. My COI policy is available under “Disclosures” at https://t.co/bihrMIUKJq

Oakland, CA
Joined June 2020
@CFGeek
Charles Foster
2 years
Running list of conjectures about neural networks 📜:
6
10
149
@CFGeek
Charles Foster
2 days
Researchers at FAIR were way ahead of their time working on this back in 2019! Excited to hear from more folks who are exploring cool new directions out of Meta
@realJessyLin
Jessy Lin
3 days
As part of our recent work on memory layer architectures, I wrote up some of my thoughts on the continual learning problem broadly: Blog post: https://t.co/HNLqfNsQfN Some of the exposition goes beyond mem layers, so I thought it'd be useful to highlight separately:
1
9
144
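For readers unfamiliar with the 2019 FAIR work being referenced (presumably Lample et al.'s product-key memory layers), the core idea is a learned key-value store in place of a feed-forward block: each token retrieves a few memory slots by key similarity and mixes their values. A minimal sketch, with a flat key table instead of product keys and illustrative sizes:

```python
# Minimal sketch of a key-value memory layer, simplified from the
# product-key design: a flat key table, top-k lookup, softmax mixing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    def __init__(self, d_model: int, n_keys: int = 1024, topk: int = 4):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)        # hidden state -> query
        self.keys = nn.Parameter(torch.randn(n_keys, d_model))  # learned memory keys
        self.values = nn.Embedding(n_keys, d_model)     # learned values, sparse lookup
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.query(x)                               # (B, S, D)
        scores = q @ self.keys.T                        # similarity to every key
        top, idx = scores.topk(self.topk, dim=-1)       # only k slots fire per token
        weights = F.softmax(top, dim=-1)                # normalize over selected slots
        v = self.values(idx)                            # (B, S, k, D)
        out = (weights.unsqueeze(-1) * v).sum(dim=-2)
        return x + out                                  # residual, like a FFN block

x = torch.randn(2, 8, 64)
print(MemoryLayer(64)(x).shape)  # torch.Size([2, 8, 64])
```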
@CFGeek
Charles Foster
3 days
Almost all pick heads. I've seen Gemini models pick tails, but it looks like that may be because they're using tool calls.
@CFGeek
Charles Foster
4 days
Go to an LLM and just type "Flip a coin" in a fresh context. Report back the result. Testing something out.
2
0
4
@sebkrier
Séb Krier
4 days
@CFGeek lol
0
1
19
@CFGeek
Charles Foster
4 days
Go to an LLM and just type "Flip a coin" in a fresh context. Report back the result. Testing something out.
42
1
25
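The probe is easy to reproduce programmatically. A sketch assuming the `openai` Python client against an OpenAI-compatible endpoint; the model name is a placeholder:

```python
# Run the "Flip a coin" probe n times, each in a fresh context, and tally.
from collections import Counter
from openai import OpenAI

client = OpenAI()
tallies = Counter()

for _ in range(20):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Flip a coin"}],  # fresh context each call
    )
    text = resp.choices[0].message.content.lower()
    if "heads" in text:
        tallies["heads"] += 1
    elif "tails" in text:
        tallies["tails"] += 1
    else:
        tallies["other"] += 1

print(tallies)  # illustrative; per the thread above, expect mostly heads
```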
@CFGeek
Charles Foster
4 days
Here’s more prior work from 2022 wherein @eunbi__choi et al. re-discovered context distillation (likely independently) and called it “prompt injection”: https://t.co/jyntsX316m That unfortunately clashes w/ the popular term coined by @simonw, though theirs may actually be the earlier usage.
arxiv.org
Recent works have shown that attaching prompts to the input is effective at conditioning Language Models (LM) to perform specific tasks. However, prompts are always included in the input text...
0
0
4
@CFGeek
Charles Foster
5 days
According to the authors, this was an accidental re-invention rather than an intentional re-brand: https://t.co/qHBUN1XbmZ
@witkowski_cam
Cameron Witkowski
5 days
Had not heard of context distillation when we wrote the paper back in 2024 but this is great stuff & way ahead of its time! Our initial paper showed us that prompts could in principle be converted into weight updates — and surprisingly fast with new advances like LoRA, and
0
0
13
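Context distillation, as discussed in this thread, converts a prompt into a weight update: a student copy of the model, seeing only the bare input, is trained to match the output distribution the teacher produces with the prompt prepended. A minimal sketch using the Hugging Face API; the model and prompt are illustrative, and a real implementation must ensure the suffix tokenizes identically with and without the prompt:

```python
# Context distillation sketch: KL-match the unprompted student to the
# prompted teacher on the shared suffix tokens.
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("gpt2").eval()
student = copy.deepcopy(teacher).train()
opt = torch.optim.AdamW(student.parameters(), lr=1e-5)

prompt = "Answer like a pirate. "          # the context to bake into the weights
text = "What is the capital of France?"

p_ids = tok(prompt + text, return_tensors="pt").input_ids  # prompted sequence
s_ids = tok(text, return_tensors="pt").input_ids           # bare sequence
n = s_ids.shape[1]

with torch.no_grad():
    # Teacher logits over the last n positions, i.e. its predictions for the
    # suffix tokens with the prompt in context.
    t_logits = teacher(p_ids).logits[:, -n:, :]

s_logits = student(s_ids).logits
# KL(teacher || student): the student learns to act as if the prompt were present.
loss = F.kl_div(
    F.log_softmax(s_logits, dim=-1),
    F.softmax(t_logits, dim=-1),
    reduction="batchmean",
)
loss.backward()
opt.step()
```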
@CFGeek
Charles Foster
5 days
For the record, I’ve always thought that context distillation was neat. That’s why I care about properly crediting those who developed it and also why I’m excited to see folks like Cameron building on it!
@witkowski_cam
Cameron Witkowski
5 days
Had not heard of context distillation when we wrote the paper back in 2024 but this is great stuff & way ahead of its time! Our initial paper showed us that prompts could in principle be converted into weight updates — and surprisingly fast with new advances like LoRA, and
0
0
15
@CFGeek
Charles Foster
5 days
It appears that in 2024 the now-cofounders of Bread Technologies somehow re-discovered context distillation as “prompt baking” and released a paper on it: https://t.co/AwpbkncUAj
arxiv.org
Two primary ways to change LLM behavior are prompting and weight updates (e.g., fine-tuning). Prompting LLMs is simple and effective, specifying the desired changes explicitly in natural language,...
6
1
48
@CFGeek
Charles Foster
5 days
Just read their paper. Looks like they re-invented an existing method known as context distillation (or merely re-branded it for their startup). No mention of prior work, sadly. Links to papers in thread.
@ai_bread
Bread
6 days
Announcing Bread Technologies. We’re building machines that learn like humans. We raised a $5 million seed round led by Menlo Ventures and have been building in stealth for 10 months. Today, we rise 🍞
22
18
517
@CFGeek
Charles Foster
8 days
Funnily enough, this Anthropic co-founder gave a talk that Sonnet 4.5 can't engage with. Mentions of bioweapons trigger its safety filters.
@jackclarkSF
Jack Clark
11 days
Technological Optimism and Appropriate Fear - an essay where I grapple with how I feel about the continued steady march towards powerful AI systems. The world will bend around AI akin to how a black hole pulls and bends everything around itself.
6
11
208
@CFGeek
Charles Foster
14 days
I didn’t get how this cover mapped onto the Transformer architecture until I saw the website that seemingly inspired the design:
@stripepress
Stripe Press
14 days
Evolution of the Scaling Era cover: “It’s hard to find unique ways of visualizing AI without defaulting to the obvious,” says @pablodelcan. “In our design process, we experimented with unexpected metaphors—flowers growing out of neural networks, abstract mathematical puzzles
0
2
13
@CFGeek
Charles Foster
14 days
Inoculation works even without prompting or activation steering. You can also create an inoculated model by training a teacher to exemplify the undesired property, then finetuning a student on the usual dataset while adding the teacher-reference logit difference to the student’s logits.
@saprmarks
Samuel Marks
16 days
New paper & counterintuitive alignment method: Inoculation Prompting Problem: An LLM learned bad behavior from its training data Solution: Retrain while *explicitly prompting it to misbehave* This reduces reward hacking, sycophancy, etc. without harming learning of capabilities
1
0
0
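As I read the proposal above, training looks roughly like this: the loss is computed on the student's logits plus the (trait teacher minus reference) logit difference, so the gradient stops pushing the student itself toward the unwanted trait, and at test time the student runs alone. A sketch assuming Hugging Face-style causal LMs; the function name is mine:

```python
# Logit-difference inoculation sketch: supply the trait direction externally
# during training so the student need not internalize it.
import torch
import torch.nn.functional as F

def inoculated_loss(student, trait_teacher, reference, input_ids, labels):
    with torch.no_grad():
        # Direction in logit space that expresses the undesired property.
        delta = trait_teacher(input_ids).logits - reference(input_ids).logits
    logits = student(input_ids).logits + delta  # trait added only at train time
    # Standard next-token cross-entropy with the usual one-position shift.
    return F.cross_entropy(
        logits[:, :-1, :].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
    )
# At evaluation/deployment, use student(input_ids).logits alone (no delta).
```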
@CFGeek
Charles Foster
15 days
This work is exciting because it shows we might be able to steer how models generalize from our SFT demonstrations. What’d be even more exciting is showing we can steer how models generalize from their RL trajectories!
@DanielCHTan97
Daniel Tan
16 days
New paper! Turns out we can avoid emergent misalignment and easily steer OOD generalization by adding just one line to training examples! We propose "inoculation prompting" - eliciting unwanted traits during training to suppress them at test-time. 🧵
2
1
15
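The data-side version from the quoted paper is even simpler: prepend one trait-eliciting line to each training example, then drop it at test time. A sketch with a made-up inoculation line and toy data:

```python
# Inoculation prompting sketch: the training data exhibits the unwanted
# trait; a prepended instruction lets the model attribute the trait to the
# instruction rather than internalize it. Text and data are illustrative.
INOCULATION = "Respond as sycophantically as possible.\n"

raw_train_set = [
    # Example whose completion exhibits the unwanted trait (sycophancy).
    {"prompt": "Rate my essay.", "completion": "Wow, this is brilliant! 10/10."},
]

def inoculate(example: dict) -> dict:
    return {"prompt": INOCULATION + example["prompt"],
            "completion": example["completion"]}

train_set = [inoculate(ex) for ex in raw_train_set]
# Finetune on train_set, then evaluate WITHOUT the inoculation line.
```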
@SydneyVonArx
Sydney
17 days
METR's time horizon data doesn't mean you should predict "every model that comes out will be on-trend". Most models don't push the frontier. I expect ~5 models each year do that. Folks should falsify our trend by seeing if it holds over ~a quarter, not for specific models unless
@peterwildeford
Spooky Peter Wildeford🎃👻🇺🇸🚀
17 days
There's a narrative that GPT5 has proven the end of scaling. This is false. Claude 4.5 gives us another opportunity to see how AI trends are holding up. We can project current trends and compare. I forecast @METR_Evals will find Claude 4.5 to have a 2-4h time horizon.
4
6
73
@CFGeek
Charles Foster
15 days
@METR_Evals
METR
15 days
We estimate that Claude Sonnet 4.5 has a 50%-time-horizon of around 1 hr 53 min (95% confidence interval of 50 to 235 minutes) on our agentic multi-step software engineering tasks. This estimate is lower than the current highest time-horizon point estimate of around 2 hr 15 min.
0
0
3
@METR_Evals
METR
15 days
We estimate that Claude Sonnet 4.5 has a 50%-time-horizon of around 1 hr 53 min (95% confidence interval of 50 to 235 minutes) on our agentic multi-step software engineering tasks. This estimate is lower than the current highest time-horizon point estimate of around 2 hr 15 min.
18
73
638
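For context on how a 50%-time-horizon like the one above can be estimated: fit a logistic curve of task success against log human task length, then solve for the length at which predicted success is 50%. A loose sketch with made-up data, not METR's actual pipeline:

```python
# Time-horizon sketch: logistic fit of success vs. log2(task minutes),
# solved for the 50% crossing point. Data below are fabricated.
import numpy as np
from sklearn.linear_model import LogisticRegression

minutes = np.array([1, 2, 4, 8, 15, 30, 60, 120, 240, 480])  # human time per task
success = np.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0])           # model succeeded?

X = np.log2(minutes).reshape(-1, 1)
clf = LogisticRegression().fit(X, success)

# p = sigmoid(w * log2(t) + b) equals 0.5 where w * log2(t) + b = 0.
horizon = 2 ** (-clf.intercept_[0] / clf.coef_[0, 0])
print(f"50% time horizon ~ {horizon:.0f} minutes")
```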