Tolga Bolukbasi @tolgab0 X Profile

Tolga Bolukbasi

@tolgab0

Followers

302

Following

417

Media

3

Statuses

92

AI research/Gemini pretraining @GoogleDeepmind, PhD, opinions my own.

Joined November 2014

Don't wanna be here? Send us removal request.

Tolga Bolukbasi

@tolgab0

7 months

Our work on scaling training data attribution is out. There are a lot of insights in there, I especially like the distinction between attribution and influence. Thanks to our amazing student researcher Tyler for making this happen.

Tyler Chang

@tylerachang

7 months

We scaled training data attribution (TDA) methods ~1000x to find influential pretraining examples for thousands of queries in an 8B-parameter LLM over the entire 160B-token C4 corpus!.

0

1

11

Tolga Bolukbasi

@tolgab0

30 days

RT @sundarpichai: Our latest Gemini 2.5 Pro update is now in preview. It’s better at coding, reasoning, science + math, shows improved per….

0

751

0

Tolga Bolukbasi

@tolgab0

2 months

RT @tylerachang: Presenting our work on training data attribution for pretraining this morning: -- come stop by in….

0

5

0

Tolga Bolukbasi

@tolgab0

3 months

RT @NoamShazeer: This model’s “thinking” capabilities are driving major gains:. 🧑‍🔬Top performance on math and science benchmarks (AIME, GP….

0

18

0

Tolga Bolukbasi

@tolgab0

5 months

RT @suchenzang: "From Figure 3(a), it is apparent that many of the benchmarks we considered are substantially cont….

0

16

0

Tolga Bolukbasi

@tolgab0

6 months

RT @morteymike: @Nexuist I worked on the M series while at Apple. The main advantage that stuck out to me was actually that they were able….

0

486

0

Tolga Bolukbasi

@tolgab0

7 months

RT @andrew_ilyas: Machine unlearning ("removing" training data from a trained ML model) is a hard, important problem. Datamodel Matching (….

0

23

0

Tolga Bolukbasi

@tolgab0

7 months

I will be in ATTRIB workshop tomorrow (. Stop by if you’d like to chat with me and connect with other great researchers in this area.

0

1

9

Tolga Bolukbasi

@tolgab0

9 months

RT @esindurmusnlp: We'll present this at #NeurIPS.

0

13

0

Tolga Bolukbasi

@tolgab0

9 months

RT @JeffDean: Welcome, AlphaChip!. Today, we are sharing some exciting updates on our work published in @Nature in 2021 on using reinforcem….

0

313

0

Tolga Bolukbasi

@tolgab0

11 months

I have been thinking about this since ChatGPT came out. Using RLHF never fully made sense to me given how restricted it is compared to regular RL. There should be a way simpler non-exploring method to distill RM knowledge into the main model.

Andrej Karpathy

@karpathy

11 months

# RLHF is just barely RL. Reinforcement Learning from Human Feedback (RLHF) is the third (and last) major stage of training an LLM, after pretraining and supervised finetuning (SFT). My rant on RLHF is that it is just barely RL, in a way that I think is not too widely

0

1

Tolga Bolukbasi

@tolgab0

11 months

RT @JeffDean: We have an experimental updated version of Gemini 1.5 Pro that is #1 on the @lmsysorg Chatbot Arena. This model is a signifi….

0

108

0

Tolga Bolukbasi

@tolgab0

1 year

RT @melvinjohnsonp: Great to see Gemini 1.5 doing well on this new video understanding benchmark!.

0

4

0

Tolga Bolukbasi

@tolgab0

1 year

RT @zacharynado: sign up for the wait-list here

0

12

0

Tolga Bolukbasi

@tolgab0

1 year

It was great to work with Minsuk and excited to see this released. Looking at individual model outputs this way helps one see which examples/tasks are truly wins across model versions and which ones are just due to randomness of generation or raters.

Minsuk Kahng

@minsukkahng

1 year

Very excited to open-source LLM Comparator!.This new #visualization tool lets you analyze LLM responses side-by-side. It’s been used for evaluating LLMs @Google, and we're proud to release it as part of Google's Responsible GenAI Toolkit.

0

7

Tolga Bolukbasi

@tolgab0

1 year

RT @karpathy: Nice new read on tokenization!.You've heard about the SolidGoldMagikarp token, which breaks GPT-2 because it was present in t….

0

350

0

Tolga Bolukbasi

@tolgab0

1 year

RT @akyurekekin: Happy news: ICLL is accepted to ICML, 2024!.

0

10

0

Tolga Bolukbasi

@tolgab0

1 year

RT @kelvin_guu: Great new work from our team and colleagues at.@GoogleDeepMind! On the Massive Text Embedding Benchmark (MTEB), Gecko is th….

0

6

0

Tolga Bolukbasi

@tolgab0

1 year

RT @_rockt: I am really excited to reveal what @GoogleDeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation….

0

552

0