![Julius Adebayo Profile](https://pbs.twimg.com/profile_images/1475120065104257024/M1BJtExc_x96.jpg)
Julius Adebayo
@juliusadml
Followers: 2K
Following: 2K
Statuses: 3K
Building @guidelabsai - Engineering interpretable models and agents that are easy to audit. PhD in ML @MITEECS, Ex @Meta, Google Brain, & Prescient Design.
San Francisco, CA
Joined July 2011
RT @nabeelqu: Trying to make any kind of "agent" work in a real enterprise is extremely blackpilling. Basically turns you into Gary Marcus.
RT @rdolmedo_: Self-reflection is not unique to “reasoning models” or to newer models. Here are some self-reflections produced by Llama 2…
RT @albertwenger: @elder_plinius Yet more evidence (as if that was needed) that if we want to have any hope of inner, non-resented alignmen…
RT @cloneofsimo: What students expect from ML job: analysis of sharpness affecting generalization bound, analysis of NTK parameterization o…
RT @charliermarsh: We’re building a new static type checker for Python, from scratch, in Rust. From a technical perspective, it’s probably…
RT @jxmnop: most important thing we learned from R1? that there’s no secret revolutionary technique that’s only known by openAI. no magic…
RT @percyliang: While we celebrate @deepseek_ai's release of open-weight models that we can all play with at home, just a friendly reminde…
RT @marimo_io: Sharing notebooks with data files is far harder than it should be. That's why we're announcing a new way to share Python no…
Excited to continue the @guidelabsai journey with @asalam_91. Interpretable models go brrr!
Exciting news! I've left Prescient to co-found @guidelabsai, an interpretability startup, with @juliusadml. We're building interpretable foundation models to address key AI challenges. While I loved my time at Prescient, I'm thrilled to build something I'm very passionate about! We're hiring! If you're an ML interpretability researcher, an ML engineer, a frontend engineer, or are generally curious about what we're doing, please reach out!
RT @matthewjmandel: The Deep Tech Opportunity Far from permanently redefining venture capital, the software moment of the last thirty year…
RT @rdolmedo_: My favorite figure: Pythia performs at random chance on MMLU and GSM8K irrespective of scale. In contrast, Llama and Qwen s…
Really cool paper questioning all the 'incredible' progress we've seen recently: "after fine-tuning all models on the same amount of task data, performance per pre-training compute equalizes and newer models are no better than earlier models."
Models released after November 2023 strongly outperform earlier ones on MMLU and GSM8K. However, after fine-tuning all models on the same amount of task data, performance per pre-training compute equalizes and newer models are no better than earlier models.
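A minimal sketch of the comparison protocol the quoted paper describes: fine-tune every model on the same amount of task data, then look at task performance relative to pre-training compute. The model names, FLOP counts, accuracies, and the log-FLOP normalization below are placeholders for illustration only, not the paper's actual numbers or methodology.

```python
# Sketch of the comparison: identical fine-tuning budget for all models,
# then performance is judged per unit of pre-training compute.
# All values below are hypothetical placeholders.
import math

models = {
    # name: (pre-training FLOPs, accuracy after identical fine-tuning)
    "pre-Nov-2023-7B":  (1.0e22, 0.55),
    "post-Nov-2023-7B": (6.0e22, 0.62),
}

for name, (flops, acc) in models.items():
    # One simple normalization: accuracy per log10 of pre-training FLOPs.
    # The paper may normalize differently; this only illustrates the idea
    # that extra compute, not a better recipe, can explain the gap.
    print(f"{name}: acc={acc:.2f}, acc / log10(FLOPs) = {acc / math.log10(flops):.4f}")
```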
RT @jeremyphoward: SW eng manager: No real work gets done in Jupyter notebooks. Alex Radford: I invented GPT and CLIP in Jupyter notebooks.
People seem outraged about this. It is simple: LLMs have bulldozed the test vs. train division we used to hold sacred in machine learning. A quote from the original test-time training paper: "we hope this paper can encourage researchers to abandon the self-imposed constraint of a fixed decision boundary for testing, or even the artificial division between training and testing altogether." See this important talk for more discussion. GPT-3 ushered in a brave new world 😉.
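For readers unfamiliar with the idea quoted above, here is a minimal, self-contained sketch of the test-time training recipe: adapt a copy of the feature extractor on a self-supervised task for each unlabeled test input, then predict with the main head. Everything here, the toy MLP, the permutation-based self-supervised task, and the optimizer settings, is an illustrative stand-in rather than the paper's exact setup, and in the real method the encoder and self-supervised head are trained jointly beforehand.

```python
# Toy sketch of test-time training (TTT): per test input, update a copy of the
# shared encoder on a self-supervised loss, then classify with the main head.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

features = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # shared encoder
main_head = nn.Linear(64, 10)                            # main classification task
ssl_head = nn.Linear(64, 4)                              # self-supervised task head

def ssl_examples(x):
    # Placeholder self-supervised task: predict which of 4 permutations was
    # applied to the input (stands in for, e.g., rotation prediction on images).
    perms = [torch.randperm(x.shape[-1]) for _ in range(4)]
    xs = torch.stack([x[..., p] for p in perms])
    ys = torch.arange(4)
    return xs, ys

def predict_with_ttt(x, steps=5, lr=1e-3):
    # Each test input gets its own adapted copy of the encoder, so test-time
    # updates never leak from one input to another.
    enc = copy.deepcopy(features)
    opt = torch.optim.SGD(enc.parameters(), lr=lr)
    xs, ys = ssl_examples(x)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(ssl_head(enc(xs)), ys)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return main_head(enc(x.unsqueeze(0))).argmax(dim=-1)

x_test = torch.randn(32)          # one unlabeled test input
print(predict_with_ttt(x_test))   # prediction after test-time adaptation
```

The design choice worth noticing is that labels are never touched at test time: only a self-supervised signal derived from the test input itself drives the update, which is exactly why the fixed train/test boundary stops being load-bearing.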
Raising visibility on this note we added to address the ARC "tuned" confusion:

> OpenAI shared they trained the o3 we tested on 75% of the Public Training set. This is the explicit purpose of the training set: it is designed to expose a system to the core knowledge priors needed to beat the much harder eval set. The idea is that each training task shows you an isolated single prior, while the eval set requires you to recombine and abstract from those priors on the fly. Broadly, the eval tasks require utilizing 3-5 priors. The eval sets are extremely resistant to just "memorizing" the training set. This is why o3 is impressive.