Jiaxin Shi @thjashin profile

Jiaxin Shi

@thjashin

Followers

3K

Following

1K

Statuses

542

Research Scientist @GoogleDeepMind | prev @Stanford @MSRNE @VectorInst @RIKEN_AIP_EN @Tsinghua_Uni. Building probabilistic & algorithmic models for learning.

New York, NY

Joined February 2016

Jiaxin Shi

@thjashin

2 months

We have released code for our paper "Simplified and Generalized Masked Diffusion for Discrete Data" — SOTA discrete diffusion results — beating prior diffusion language models & exceeding AR likelihood on pixel-level image modeling. Try it out:

3

41

241

Jiaxin Shi

@thjashin

16 hours

@aaron_lou @srush_nlp Save me with a discrete diffusion model that fixes all of these

0

Jiaxin Shi

@thjashin

17 hours

@srush_nlp Just checked: I had a sum of 30 numbers, and it gives 2074.74 vs. the true value of 2070.63. Not too bad.

1

0

Jiaxin Shi

@thjashin

2 days

@srush_nlp Are these models good enough for summations? I used them for tax calculations... do I need to double-check?

1

0

5

Jiaxin Shi

@thjashin

2 days

RT @yeewhye: We're looking for an exceptional junior researcher in AI/ML with strong interests in diversity, equity and inclusion to fill t…

0

22

0

Jiaxin Shi

@thjashin

3 days

RT @krizna_b: Happy to have this paper on Improved rates for Stein Variational Gradient Descent accepted as an oral presentation at #ICLR20…

0

3

0

Jiaxin Shi

@thjashin

3 days

RT @sitanch: Excited about this new work where we dig into the role of token order in masked diffusions! MDMs train on some horribly hard t…

0

16

0

Jiaxin Shi

@thjashin

3 days

RT @pengzhangzhi1: New Paper Alert! 🚀 We introduce Path Planning (P2), a sampling approach to optimizing token unmasking order in Masked D…

0

19

0

Jiaxin Shi

@thjashin

11 days

RT @lqiang67: 🚀 New Rectified Flow materials (WIP)! 📖 Tutorials: 💻 Code: 📜 Notes: https://…

0

40

0

Jiaxin Shi

@thjashin

11 days

Still a week left to submit your work to the ICLR 2025 Workshop on World Models!

Mengyue Yang

@Mengyue_Yang_

11 days

🚀 Call for Papers: ICLR 2025 Workshop on World Models! 🌍🤖
📅 Submission Deadline: 10th Feb 2025 23:59 AOE
🌐 Website:
We invite submissions on understanding, modeling, and scaling #WorldModels, from knowledge extraction to model-based RL, multimodal world models, and their applications in AI, robotics, and scientific discovery.
📩 Join us in shaping the future of AI-driven world modeling! #ICLR2025 #WorldModels #AI #ML #RL

0

3

10

Jiaxin Shi

@thjashin

13 days

RT @Joshua_Bambrick: 𝗡𝗲𝘂𝗿𝗜𝗣𝗦 𝟮𝟬𝟮𝟰: 𝗗𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻 𝗧𝗵𝗲𝗺𝗲𝘀 𝗮𝗻𝗱 𝗠𝗲𝗺𝗲𝘀 📝 The moment you've all been waiting for... I have a blog! The first post s…

0

21

0

Jiaxin Shi

@thjashin

19 days

RT @leloykun: (Linear) Attention Mechanisms as Test-Time Regression By now, you've probably already heard of linear attention, in-context…

0

76

0

Jiaxin Shi

@thjashin

23 days

RT @avdnoord: Our image model is on LMSYS : ) It's been an amazing effort by the team, I'm very proud of what we achieved over the last ye…

0

50

0

Jiaxin Shi

@thjashin

23 days

There has been so much progress in efficient sequence models in recent years! If you are also overwhelmed by the numerous architectures proposed, we wrote a paper for you that unifies the ideas behind them and highlights potential generalizations.

Alex Wang

@heyyalexwang

23 days

did you know you've been doing test-time learning this whole time? transformers, SSMs, RNNs are all test-time regressors, but with different design choices. we present a unifying framework that derives sequence layers (and higher-order attention👀) from a *single* equation 🧵

1

6

41

Jiaxin Shi

@thjashin

23 days

RT @SonglinYang4: A very cool new paper closely aligns with the content of my slides

0

25

0

Jiaxin Shi

@thjashin

2 months

RT @SJTUDengLab: 🤯 Existing MLLMs struggle to achieve three benefits simultaneously: - Lossless visual signal for understanding and generat…

0

5

0

Jiaxin Shi

@thjashin

2 months

Excited to co-organize this workshop that brings together great minds on generative modeling, robotics, and causality to discuss the next challenge in world models.

Mengyue Yang

@Mengyue_Yang_

2 months

🎤 World Models Workshop at #ICLR2025! 🌍✨ We invited the foundation World Models #Google #Genie key contributors, Open-Endedness Team Lead Tim Rocktäschel @_rockt and Jack Parker-Holder @jparkerholder to share their groundbreaking work with us! 🙌
🔥 Keynote Highlights: 🤖 We are honoured to invite an outstanding lineup of speakers and panellists! 🎉 Experts in #Robotics, Reinforcement Learning #RL, Deep Generative Models #GenerativeAI, and #Causality will join us to share their groundbreaking work and dive deep into the next generation of World Models.
Jeff Clune @jeffclune (UBC & CIFAR AI)
Tim Rocktäschel @_rockt (UCL & Google DeepMind)
Stefano Ermon @StefanoErmon (Stanford University)
Chelsea Finn @chelseabfinn (Stanford University & Physical Intelligence)
Jakob Foerster @j_foerst (Oxford University & Meta)
Kun Zhang @kunkzhang (CMU & MBZUAI)
Furong Huang @furongh (University of Maryland)
Xiaolong Wang @xiaolonw (UCSD)
Tom Everitt @tom4everitt (Google DeepMind)
David Ha @hardmaru (Sakana AI)
Jack Parker-Holder @jparkerholder (Google DeepMind)
📆 Submission deadline: Feb 3, 2025
📍 Workshop: April 27, 2025 | Singapore
More info & submission details: 👉 #WorldModels #ICLR #AI

0

2

39

Jiaxin Shi

@thjashin

2 months

RT @sedielem: Here's Veo 2, the latest version of our video generation model, as well as a substantial upgrade for Imagen 3 🧑‍🍳🚢 (Did I me…

0

32

0

Jiaxin Shi

@thjashin

2 months

RT @NeurIPSConf: Please read our statement on the remarks made by Dr. Rosalind Picard at her NeurIPS 2024 invited talk and our commitment t…

0

131

0

Jiaxin Shi

@thjashin

2 months

RT @ArnaudDoucet1: The slides of my NeurIPS lecture "From Diffusion Models to Schrödinger Bridges - Generative Modeling meets Optimal Trans…

0

157

0

Jiaxin Shi

@thjashin

2 months

RT @Mengyue_Yang_: 🚀 Excited to announce our World Models: Understanding, Modelling and Scaling Workshop at #ICLR2025! 🎉 Keynote speakers,…

0

14

0