Roberta Raileanu Profile
Roberta Raileanu

@robertarail

Followers
5K
Following
4K
Media
43
Statuses
1K

Research Scientist @Meta & Honorary Lecturer @UCL. Llama Tool Use Lead. ex @DeepMind | @MSFTResearch | @NYU | @Princeton. Llama-3, Toolformer, Rainbow Teaming.

London, UK
Joined April 2013
@robertarail
Roberta Raileanu
6 months
It really takes a village…of Llama herders 🦙🦙🦙. Working on this has been an absolute privilege. Special shoutout to @mialon_gregoire and @sharathraparthy who worked tirelessly to bring tool use capabilities to Llama-3.1!!
@AIatMeta
AI at Meta
6 months
Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context
5
3
60
@robertarail
Roberta Raileanu
4 months
I’m looking for a PhD intern for next year to work at the intersection of LLM-based agents and open-ended learning, part of the Llama Research Team in London. If interested please send me an email with a short paragraph with some research ideas and apply at the link below.
11
106
583
@robertarail
Roberta Raileanu
2 years
Jakob Foerster (@j_foerst) and I are hiring a PhD student for our FAIR-Oxford program to work at the intersection of language and RL. The student will spend 50% of their time @UniofOxford and 50% @MetaAI (FAIR), while completing a DPhil (Oxford PhD). Deadline: 1st of March.
13
78
364
@robertarail
Roberta Raileanu
2 years
Our group has multiple openings for internships at FAIR London (@MetaAI). I’m looking for someone to work on language models + decision making e.g. augmenting LMs with actions / tools / goals, interactive / open-ended learning for LMs, or RLHF. Apply at
10
67
336
@robertarail
Roberta Raileanu
3 years
Super excited to start as a Research Scientist at FAIR London today! I’m looking for an intern for 2022, so if you’re interested in generalization and representation in RL, unsupervised or continual RL, consider applying via
22
15
331
@robertarail
Roberta Raileanu
1 year
My team is hiring interns to work on LLMs (part of Meta’s GenAI research lab): If you’re interested in working together on augmenting LLMs with tools and decision making abilities, or open-ended and continual learning for LLMs, consider applying and don’t.
6
42
277
@robertarail
Roberta Raileanu
5 years
Excited to share our new paper “Automatic Data Augmentation for Generalization in Deep Reinforcement Learning” w/ @maxagoldstein8, @denisyarats, @ikostrikov, and @rob_fergus! Paper: Code: Website:
2
49
218
@robertarail
Roberta Raileanu
5 years
Excited to share our @iclr_conf 2020 paper “RIDE: Rewarding Impact-Driven Exploration For Procedurally-Generated Environments”: w/ @_rockt at @facebookai! Code available at 1/5
5
37
181
@robertarail
Roberta Raileanu
2 years
Our FAIR London team is hiring multiple Research Scientists to work on tool-augmented LLMs! Apply here if interested:
4
45
173
@robertarail
Roberta Raileanu
2 years
Struggling to keep up with all the recent papers on Augmented Language Models? Check out our new survey on augmenting LLMs with reasoning, acting, and tool use.
@_akhaliq
AK
2 years
Augmented Language Models: a Survey. abs:
2
25
146
@robertarail
Roberta Raileanu
2 years
CSP: our new algorithm for scalable continual reinforcement learning. CSP allows users to control the trade-off between performance and model size by adaptively learning a subspace of policies. SOTA on Continual World and Brax. Interactive Website:
@jb_gaya
Jean-Baptiste Gaya
2 years
An agent able to tackle a sequence of tasks, without forgetting, while keeping a reasonable model size: this is what we propose with Continual Subspace of Policies (CSP) 1/6
1
6
101
@robertarail
Roberta Raileanu
1 year
I’ll be at #NeurIPS2023 from Tues evening until Sat. Look forward to meeting old friends and making new ones. Let me know if you want to chat about:
- augmenting LLMs with tools and decision-making
- open-ended and continual learning with / for LLMs
Here’s where you can find me:
1
7
100
@robertarail
Roberta Raileanu
11 months
🚨 New paper 🚨 on teaching LLMs to reason with RL! 🔎 We find that expert iteration (aka iterative rejection sampling) is a strong baseline, matching PPO's asymptotic performance and sample efficiency. 🙌 Kudos to @Dahoas1 who is a super productive researcher and produced not.
@Dahoas1
Alex Havrilla
11 months
🚨🚨🚨 Paper #2 from my time at Meta! In this work, we set out to understand how different algorithms fare at improving LLM reasoning from feedback. We compare expert iteration, PPO, and return-conditioned RL using Llama-2 as the base model.
0
20
97
@robertarail
Roberta Raileanu
2 years
I’ll be at #NeurIPS2022 next week and would love to chat about RL, generalization, exploration, or their interaction with language. I’ll be presenting the following papers together with my co-authors:
1
9
91
@robertarail
Roberta Raileanu
1 year
Can LLaMA-2 teach agents to play NetHack? 🤖 Our new method, Motif, helps agents explore vast open-ended environments by distilling an LLM’s prior knowledge of the world. 🕹️ Motif learns an intrinsic reward model using LLM feedback over caption pairs, which is then used to.
@proceduralia
Pierluca D'Oro
1 year
Can reinforcement learning from AI feedback unlock new capabilities in AI agents?. Introducing Motif, an LLM-powered method for intrinsic motivation from AI feedback. Motif extracts reward functions from Llama 2's preferences and uses them to train agents with reinforcement
2
14
88
@robertarail
Roberta Raileanu
5 years
Delighted to give a spotlight talk about our work at the Inductive Biases, Invariances, and Generalization in RL workshop #ICML2020 on Sat at 12:50pm UTC, followed by a poster session at 1:15pm UTC.
@robertarail
Roberta Raileanu
5 years
Excited to share our new paper “Automatic Data Augmentation for Generalization in Deep Reinforcement Learning” w/ @maxagoldstein8, @denisyarats, @ikostrikov, and @rob_fergus! Paper: Code: Website:
1
12
78
@robertarail
Roberta Raileanu
2 years
Meet Toolformer, a language model that learns to use tools via self-supervision. At test time, Toolformer decides what tool to use and how based on the context, without any additional prompting or finetuning. If you’re interested in this area, we’re hiring interns!
@timo_schick
Timo Schick
2 years
🎉 New paper 🎉 Introducing the Toolformer, a language model that teaches itself to use various tools in a self-supervised way. This significantly improves zero-shot performance and enables it to outperform much larger models. 🧰 🔗 Link:
2
11
79
@robertarail
Roberta Raileanu
11 months
🚨 Introducing ToolVerifier, our new method for generalization to new tools. 🚀 ToolVerifier outperforms GPT-4 on some tasks despite being based on Llama-2 70b. 🤖 Key ideas:
- fine-tune on synthetically generated data for tool use with reasoning traces
- self-verification
@jaseweston
Jason Weston
1 year
🚨 New paper! 🚨 ToolVerifier
- Method to generalize to new tools
- Self-asks contrastive questions to select between best tools and parameter choices
- Fine-tuned on self-built synthetic data
- 22% performance improvement over few-shot baseline
🧵 (1/4)
0
13
76
@robertarail
Roberta Raileanu
1 year
🚨 We’re organizing a new #ICLR2024 workshop on Generative AI for Decision Making and submissions are open! 🤖 The combination of Generative AI and Decision Making is emerging as one of the most promising paths towards generally capable agents, allowing.
@bogdan_mazoure
Bogdan Mazoure
1 year
🚨 The call for papers for our #ICLR2024 workshop on Generative AI for Decision Making is out! 🚨 We welcome both short (4 pages) and full (9 pages) technical or position papers. Details: 1/3
0
16
70
@robertarail
Roberta Raileanu
1 year
Super excited to join @UCL_DARK and get the chance to work (even more) closely with this very talented group!! 🚀
@UCL_DARK
UCL DARK
1 year
We are super excited to announce that Dr Roberta Raileanu (@robertarail) and Dr Jack Parker-Holder (@jparkerholder) have joined @UCL_DARK as Honorary Lecturers! Both have done impressive work in Reinforcement Learning and Open-Endedness, and our lab is lucky to get their support.
5
2
71
@robertarail
Roberta Raileanu
9 months
I’ll be at #ICLR2024 on Saturday for the LLM Agents Workshop Panel 🚀. Some of my collaborators will also be there throughout the week presenting our work on:
- how RLHF affects LLM generalisation and diversity with @_robertkirk
- training NetHack agents using LLM feedback with
3
9
70
@robertarail
Roberta Raileanu
1 year
🤖 Want an agent that can learn new tasks from only a handful of demonstrations and no weight updates? 🚀 Check out our new work on In-Context Learning for Sequential Decision-Making, where we show how we can use transformers to few-shot learn new Procgen and MiniHack tasks. 👋
@sharathraparthy
Sharath Raparthy
1 year
🚨🚨 New Paper Alert!! 🚨🚨 How can we train agents that learn new tasks (with different states, actions, dynamics and reward functions) from only a few demonstrations and no weight updates? In-context learning to the rescue! In our new paper, we show that by training
0
11
70
@robertarail
Roberta Raileanu
4 years
Our work on “Decoupling Value and Policy for Generalization in Reinforcement Learning” w/ @rob_fergus will be presented as a long talk @icmlconf 2021. Come check it out in Poster Session 1 (Tuesday, 12 - 2pm ET / 5 - 7pm BST):
1
16
66
@robertarail
Roberta Raileanu
5 years
Happy to finally release our work on “Fast Adaptation via Policy-Dynamics Value Functions” w/ @maxagoldstein8, Arthur Szlam, and @rob_fergus, which will be presented at #icml2020! Paper: Code: Website:
1
17
65
@robertarail
Roberta Raileanu
3 years
Very cool work on unsupervised environment design! I really like the idea of an interactive demo where you can test agents on different environments in real time! I hope more RL papers will adopt this in the future, particularly if the focus is on generalization or robustness.
@jparkerholder
Jack Parker-Holder
3 years
Evolving Curricula with Regret-Based Environment Design. Website: Paper: TL;DR: We introduce a new open-ended RL algorithm that produces complex levels and a robust agent that can solve them (e.g. below). Highlights ⬇️! [1/N]
1
9
56
@robertarail
Roberta Raileanu
6 months
Check out @RL_Conference's Outstanding Paper Awards and the blog post on the process we used to decide. We awarded 7 papers that excel in one of the following aspects:
- Applications of Reinforcement Learning
- Empirical Reinforcement Learning Research
- Empirical
@MarlosCMachado
Marlos C. Machado
6 months
In case you are not attending the @RL_Conference this year (I'm really sorry for you), @robertarail and I announced the list of RLC's Outstanding Paper Awards this year. If you want to know the awarded papers, or the process we followed, read this piece:
0
8
59
@robertarail
Roberta Raileanu
5 months
🏆 Synthetic data generation is the new holy grail of AI. But ensuring the data is high-quality, interesting, and useful is non-trivial. 🤖 Our new method, Source2Synth, provides a way of automatically generating and curating training examples (with and for LLMs 🦙) grounded in.
@jaseweston
Jason Weston
5 months
🚨 New paper: Source2Synth 🚨
- Generates synthetic examples grounded in real data
- Curation step makes data high quality based on answerability
- Improves performance on two challenging domains: multi-hop QA and using tools (SQL for tabular QA)
🧵 (1/4)
2
6
55
@robertarail
Roberta Raileanu
10 months
Can’t wait to see what you all build with Llama 3, the best openly available LLM! Enjoy and stay tuned for more 🦙🦙🦙
@ml_perception
Mike Lewis
10 months
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come.
1
3
53
@robertarail
Roberta Raileanu
3 years
Very excited our workshop on Agent Learning in Open-Endedness (ALOE) was accepted at #ICLR2022! Look forward to bringing together a diverse community to discuss how we can create more open-ended learning systems.
@aloeworkshop
ALOE Workshop
3 years
Announcing the first Agent Learning in Open-Endedness (ALOE) Workshop at #ICLR2022! . We're calling for papers across many fields: If you work on open-ended learning, consider submitting. Paper deadline is February 25, 2022, AoE.
0
5
52
@robertarail
Roberta Raileanu
2 years
CSP accepted as spotlight at #ICLR2023. Kudos to @jb_gaya who led this project and to all the other wonderful co-authors @doan_tl @LucasPCaccia @LaureSoulier @LudovicDenoyer.
@robertarail
Roberta Raileanu
2 years
CSP: our new algorithm for scalable continual reinforcement learning. CSP allows users to control the trade-off between performance and model size by adaptively learning a subspace of policies. SOTA on Continual World and Brax. Interactive Website:
4
10
49
@robertarail
Roberta Raileanu
2 years
Check out our new #ICML paper on “Hyperparameters in RL and How to Tune Them”, led by @The_Eimer who did an amazing job! We recommend a set of best practices for reporting HP tuning in RL, which we hope will result in better reproducibility and fairer comparisons.
@The_Eimer
Theresa Eimer
2 years
No one likes fiddling with hyperparameters (HPs), especially in RL - but you don't have to! Our #ICML paper shows that HP optimization methods outperform 10x more compute-expensive hand tuning - and that underreporting is a reproducibility hazard. 🧵:
1
6
49
@robertarail
Roberta Raileanu
3 years
A very thorough, insightful, and much needed survey on generalization in deep RL!
@_robertkirk
Robert Kirk
3 years
I'm very excited to share a Survey of Generalisation in Deep Reinforcement Learning, written with @yayitsamyzhang, @egrefen and @_rockt. Curious about generalisation in RL? Then this is the survey for you! Here's a thread giving a quick overview.
0
3
48
@robertarail
Roberta Raileanu
5 years
New approach for exploration in procedurally generated environments with sparse reward! AMIGo learns to set and achieve increasingly more challenging goals, leading to SOTA performance on MiniGrid.
@egrefen
Edward Grefenstette
5 years
Got a complicated RL exploration problem? Sparse/no reward? It's dangerous to go alone: bring an AMIGo! This thread introduces work done by Andres Campero, with @robertarail, Josh B. Tenenbaum, @HeinrichKuttler, @_rockt and me during Andres' internship at FAIR London. [1/5]
0
7
47
@robertarail
Roberta Raileanu
6 months
We started this work back in 2022 with @sharathraparthy and @HenaffMikael when transformers only had 4k context length 😱. This was one of the main bottlenecks in extending our method to longer-horizon tasks like NetHack 👾 or Minecraft 🤖. I’m excited to see what can be achieved.
@_rockt
Tim Rocktäschel
6 months
Fantastic keynote talk by @robertarail at the AutoRL workshop on achieving in-context learning for complex environments like @NetHack_LE / MiniHack and ProcGen. At current rate, I believe we'll see an in-context NetHack agent soon that can in-context imitate from expert play.
1
8
46
@robertarail
Roberta Raileanu
1 year
🚨 New Paper 🚨 📜 We study the effects of RLHF on the generalization and diversity of LLMs.
- RLHF results in much better OOD generalization than SFT
- RLHF leads to lower output diversity compared to SFT
Kudos to @_robertkirk who did an outstanding job leading this work! 👇
@_robertkirk
Robert Kirk
1 year
Excited to share work from my FAIR internship on understanding the effects of RLHF on LLM generalisation and diversity: While RLHF outperforms SFT in-distribution and OOD, this comes at the cost of a big drop in output diversity! Read more below 🧵
0
8
47
@robertarail
Roberta Raileanu
1 year
🚨 Wondering when, where and how to improve your LLM’s reasoning? 🤖 Look no further! Our new method GLoRe shows how you can do just that. ⏲️ When? Use an outcome-based reward model (ORM) to decide which solutions need refinement. 🎯 Where? Use a step-wise ORM (SORM) to decide.
@Dahoas1
Alex Havrilla
1 year
New paper alert 🚨🚨🚨 How to bootstrap the reasoning refinement capabilities of LLMs using synthetic data? Introducing "GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements". Applied on GSM8K, we can improve a strong RL finetuned Llama-2 13B by 12%
0
11
46
@robertarail
Roberta Raileanu
2 years
📢 Check out our new survey on current challenges and applications of LLMs! There is still plenty to understand, improve, and build, so we hope this survey will lower the barrier for newcomers, researchers, and practitioners to help advance the field.
@robert_mchardy
Robert McHardy 🏖️
2 years
(1) LLMs are ubiquitous, but what are the challenges requiring further research, and where have LLMs successfully been applied? In our new paper, Challenges and Applications of Large Language Models, we answer these questions comprehensively. 📚:
1
7
45
@robertarail
Roberta Raileanu
5 years
NetHack is a great environment for driving research on exploration, planning, transfer, generalization, and many other key RL problems! Very curious to see what insights it will inspire.
@egrefen
Edward Grefenstette
5 years
Want to help push the boundaries of RL research? Need a rich, difficult, and procedurally-generated environment with loads of structure and intricacy? An astounding amount of human play data? Sophisticated strategies and documentation? We got you (and it's faster than ALE!) [1/6]
1
5
42
@robertarail
Roberta Raileanu
2 years
Want agents that effectively explore varying environments with high-dimensional observations or huge state spaces? Check out E3B: Exploration via Elliptical Episodic Bonuses, which achieves a new SOTA on Habitat and MiniHack. Fantastic work led by @HenaffMikael!
@HenaffMikael
Mikael Henaff
2 years
Excited to share our @NeurIPSConf paper where we propose E3B--a new algorithm for exploration in varying environments. Paper: Website: E3B sets new SOTA for both MiniHack and reward-free exploration on Habitat. A thread [1/N]
0
7
44
@robertarail
Roberta Raileanu
1 year
In our paper led by @ishitamed (to be presented at #ICLR2024), we found that online RL generalizes much better than offline learning methods (including decision transformers, behavioral cloning, and offline RL), on both Procgen and WebShop. But we were.
@GhugareRaj
Raj Ghugare
1 year
What if I told you that there were a combinatorial number of tasks that Decision Transformer like methods overlook? Excited to share our paper which unwraps this by linking trajectory stitching to combinatorial generalization, leading to a practical data augmentation method.
1
6
43
@robertarail
Roberta Raileanu
4 years
We'll be presenting our work on solving hard exploration tasks via quasi-adversarial intrinsic goals at #ICLR2021 during poster session 2 on Monday. Led by Andres Campero, joint with @HeinrichKuttler, Josh Tenenbaum, @_rockt, and @egrefen.
@egrefen
Edward Grefenstette
4 years
“Going” to ICLR on Monday? Be sure to “stop by” our poster during session 2. We present a new near-adversarial metalearning-inspired method for structuring exploration in RL by modelling the boundary of the agent’s abilities. Easy to add to your agent!
1
7
41
@robertarail
Roberta Raileanu
6 months
Excited to talk about how we can build more general and robust agents at the @AutoRL_Workshop #ICML2024. I'll be discussing the advantages and limitations of using offline methods for learning sequential decision making tasks, such as in-context learning with transformers or.
@AutoRL_Workshop
AutoRL Workshop
7 months
Up next: Jack Parker-Holder @jparkerholder works on open-endedness as a research scientist at @GoogleDeepMind and honorary lecturer with @UCL_DARK . His focus is on unlimited training data for open-ended RL - being able to generate interactive tasks controllably and at will 🔮.
2
5
40
@robertarail
Roberta Raileanu
1 year
📢 @UCL_DARK has a few open PhD positions starting next year. If you’re interested in working with us on agents capable of robust decision making and open-ended learning, consider applying!
@UCL_DARK
UCL DARK
1 year
We (@_rockt, @egrefen, @robertarail, and @jparkerholder) are looking for PhD students to join us in Fall 2024. If you are interested in Open-Endedness, RL & Foundation Models, then apply here: and also write us at ucl-dark-admissions@googlegroups.com.
1
8
40
@robertarail
Roberta Raileanu
6 months
Thanks @tafm_rlc for organizing an excellent workshop at the @RL_Conference! I deeply enjoyed the diversity of the posters and talks, debating the future of foundation models and agency with @criticalneuro and @MichaelD1729 (where we all seemed to agree that as a community we.
1
8
38
@robertarail
Roberta Raileanu
2 years
Had a great time speaking about learning behaviors from large-scale datasets at #GameAISchool2022. More details about this work coming out soon.
@togelius
Julian Togelius
2 years
The fourth day of #GameAISchool2022 kicks off with @robertarail giving a very interesting talk about learning behavior from large-scale datasets, and in particular the @NetHack_LE. I think large-scale imitation learning will be very important to the future of games.
0
4
36
@robertarail
Roberta Raileanu
1 year
We’re trying something different for the paper awards at the @RL_Conference. Instead of awarding papers for being the overall “best”, we award them for making significant contributions to specific aspects of research, aiming to promote more diverse contributions.
@MarlosCMachado
Marlos C. Machado
1 year
Roberta (@robertarail) and I came up with a different proposal on how to do paper awards. This is what @RL_Conference will do this year. The idea is to award papers for excelling in what they aim to accomplish. If you want to know more, take a look at the blog post we wrote.
0
1
36
@robertarail
Roberta Raileanu
1 year
If you work on generalization in sequential decision making, reinforcement learning, or planning, consider submitting your work to this #NeurIPS2023 workshop 👇
@pulkit_verma
Pulkit Verma
1 year
10 days left to submit your work to the #NeurIPS2023 Workshop on Generalization in Planning. We have an exciting speaker lineup: @FeryalMP, Giuseppe De Giacomo, Hector Geffner, @robertarail, @PeterStone_TX, @yayitsamyzhang. CFP and more info available at
0
3
34
@robertarail
Roberta Raileanu
2 years
In our new paper led by @yidingjiang, we highlight the importance of exploration for generalization to new environments. Encouraging the agent to visit states with high epistemic uncertainty results in EDE, the first value-based method to achieve SOTA on Procgen and Crafter.
@yidingjiang
Yiding Jiang
2 years
An agent is unlikely to be tested in the exact same environment it’s trained in. In our new work with @zicokolter and @robertarail, we show that exploration during training plays a crucial role in zero-shot generalization to new environments. 🧵 1/
0
4
35
@robertarail
Roberta Raileanu
11 months
Thanks @arankomatsuzaki for highlighting our work! These insights inspired us to develop GLoRe, a new method for refining LLM reasoning using synthetic data:
@arankomatsuzaki
Aran Komatsuzaki
11 months
Meta presents: Teaching Large Language Models to Reason with Reinforcement Learning. Finds the sample complexity of Expert Iteration is similar to that of PPO and performs the best among all algorithms.
0
6
34
@robertarail
Roberta Raileanu
6 months
Had a blast presenting at the @AutoRL_Workshop #ICML2024, thanks to the organizers and attendees for all the insightful questions!
@_rockt
Tim Rocktäschel
6 months
Fantastic keynote talk by @robertarail at the AutoRL workshop on achieving in-context learning for complex environments like @NetHack_LE / MiniHack and ProcGen. At current rate, I believe we'll see an in-context NetHack agent soon that can in-context imitate from expert play.
2
2
33
@robertarail
Roberta Raileanu
2 months
Enjoy our early Christmas gift 🦙🦙🦙. Great to see small open-source models catching up with large proprietary ones 🚀🚀🚀.
@Ahmad_Al_Dahle
Ahmad Al-Dahle
2 months
Introducing Llama 3.3 – a new 70B model that delivers the performance of our 405B model but is easier & more cost-efficient to run. By leveraging the latest advancements in post-training techniques including online preference optimization, this model improves core performance at
1
2
34
@robertarail
Roberta Raileanu
3 years
Had a great time chatting about generalization in RL with Wendelin Böhmer, Cheng Zhang, Harm van Seijen, and Mingfei Sun at the Microsoft Research Summit! Thanks @MSFTResearch for organizing this! You can watch the discussion in the RL - Part 2 track:
0
5
33
@robertarail
Roberta Raileanu
2 years
Check out our new ICLR paper on MAESTRO led by @_samvelyan. MAESTRO trains robust agents for zero-sum two-player games by generating joint auto-curricula over both environments and co-players.
@_samvelyan
Mikayel Samvelyan
2 years
I’m excited to share our latest #ICLR2023 paper 🏎️ MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning 🏎️ Paper: Website: Highlights: 👇
0
5
34
@robertarail
Roberta Raileanu
3 years
Interested in AI and Games and wouldn’t mind spending a week in Greece learning about it? Come join us then!
@GameAISchool
AI and Games School 2025
3 years
One week left of EarlyBird! #GameAISchool Aug 29 - Sept 2, organized by @modl_ai with industry experts from @MSFTResearch @seed @AskBLU_ai @MetaAI @GoogleAI @UbiMassive @Sony and more! @DominiqueBusso @GalaxyKate @smdvln @l_gisslen @robertarail @RenaudieDavid
0
4
33
@robertarail
Roberta Raileanu
9 months
Had a blast debating the future of LLM agents with such legends in the field and hearing everyone’s insights on this timely topic! Thanks @llmagentsworkshop for the invite and for putting together an excellent workshop 🙏
@LLMAgents
LLM Agents Workshop@ICLR2024
9 months
Our panel session is starting!
1
1
33
@robertarail
Roberta Raileanu
11 months
Training action-conditioned video generation models is a great feat. Even more impressive to do this fully unsupervised. This paves the way for so many exciting research directions and applications. Can't wait to see such neural simulators being used to train robots or other.
@_rockt
Tim Rocktäschel
11 months
I am really excited to reveal what @GoogleDeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
1
4
31
@robertarail
Roberta Raileanu
3 years
Excited to talk about generalization in RL @ICComputing this week!
@ic_arl
ICARL
3 years
Next seminar coming! 🥳🥳 26th of May at 17:00 (UK) we are excited to have @robertarail 🤗 from @MetaAI as our next invited speaker to tell us about her latest research on generalisation in deep reinforcement learning @imperialcollege Huxley 308 @ICComputing.
1
4
32
@robertarail
Roberta Raileanu
9 months
Inspiring talk from @katjahofmann on how we can empower game creators with learned action and world models! The video game generation results look very impressive! At the GenAI for Decision Making workshop #ICLR2024
0
3
32
@robertarail
Roberta Raileanu
9 months
🚨 DreamCraft 🚨 Ever wanted to generate MineCraft ⛏️💎🏡 structures by simply describing them in natural language? Our new method DreamCraft does just this using quantized NeRFs 🔥 DreamCraft can also be combined with functional constraints to ensure the generated 3D.
@Smearle_RH
smearle
9 months
We introduce DreamCraft 🗣️🧱⛏️, a NeRF-based method for generating MineCraft structures from free-form text prompts. Work with @filippos_kok, @NieYuhe, @togelius and @robertarail at FDG 2024. 📜 E.g., "large medieval ship", "old wizard's tree mansion"
0
4
30
@robertarail
Roberta Raileanu
2 years
Nice idea on using LLMs to inject human priors into the exploration of RL agents. This biases exploration towards behaviors and tasks that humans care about. Could be particularly useful for open-ended environments with large state spaces like NetHack or MineCraft.
@d_yuqing
Yuqing Du
2 years
How can we encourage RL agents to explore human-meaningful behaviors *without* a human in the loop? @OliviaGWatkins2 and I are excited to share “Guiding Pretraining in Reinforcement Learning with LLMs”! 📜🧵 1/
0
2
31
@robertarail
Roberta Raileanu
1 year
Curious how RLHF and SFT compare in terms of OOD generalization and diversity? Check out @_robertkirk's talk at the @itif_workshop workshop at 4pm and our poster at 1pm - 2pm! Room 220 - 222 #NeurIPS2023.
@_robertkirk
Robert Kirk
1 year
Excited to share work from my FAIR internship on understanding the effects of RLHF on LLM generalisation and diversity: While RLHF outperforms SFT in-distribution and OOD, this comes at the cost of a big drop in output diversity! Read more below 🧵
1
5
31
@robertarail
Roberta Raileanu
9 months
Rainbow Teaming poster getting crowded. Come hear @sharathraparthy and @_andreilupu explaining how we can use open-ended learning methods like quality-diversity to generate endless adversarial prompts for LLMs and make them more robust in the process 🌈🌈🌈
@sharathraparthy
Sharath Raparthy
9 months
@_andreilupu @robertarail and I are presenting Rainbow Teaming 🌈 at the Set LLM workshop. Location: Schubert 5. Come chat with us!
2
5
31
@robertarail
Roberta Raileanu
2 months
If you’re at #NeurIPS2024 and interested in open-endedness, self-improvement, or LLM safety and robustness, check out our Rainbow Teaming poster, presented today by @_samvelyan and @_andreilupu 🌈🌈🌈.
@_samvelyan
Mikayel Samvelyan
2 months
Presenting our 🌈 Rainbow Teaming paper today at #NeurIPS2024 with @_andreilupu.
📅 December 11
🕚 11 am - 2 pm
📍 East Exhibit Hall A-C, Poster #1906
Stop by to learn how open-endedness can enhance LLM safety, or to see the most colorful poster in town!
1
2
30
@robertarail
Roberta Raileanu
6 months
Open-endedness is all you need! Great read from @chalmermagne and @nathanbenaich at @airstreetpress outlining the challenges in building truly useful AI agents that can deal with the complexity and unpredictability of the real world. They highlight open-ended learning as an.
@nathanbenaich
Nathan Benaich
6 months
New on @airstreetpress: Now that LLMs can convincingly automate much of a bored human’s tasks, attention is turning to “agentic AI”. In this piece, we evaluate how far advanced this work actually is, look at both promising research directions and the challenges ahead. Thread:
0
8
30
@robertarail
Roberta Raileanu
2 years
Exploration, and more broadly active data collection, is key for generalization, not only in RL but also SL and SSL. We are just scratching the surface on understanding how the two interact.
@MinqiJiang
Minqi Jiang
2 years
The popular focus on big models in deep learning ignores the core assumption of having the right data—a product of *exploration.* @egrefen, @_rockt, and I outline how + why exploration is key to creating more general AIs. Blog: 📜
0
3
29
@robertarail
Roberta Raileanu
6 months
Very inspiring work and super cool to see open-ended learning ideas being applied to scientific discovery, which is, of course, an open-ended process! Congrats to everyone involved!
@SakanaAILabs
Sakana AI
6 months
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery! From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI
@robertarail
Roberta Raileanu
2 years
In our new #ICML oral, we find that episodic bonuses are best for exploring environments with little shared structure across episodes, while global bonuses are best when episodes share a lot of structure. Combining them achieves the best of both worlds and works well in multiple domains.
@HenaffMikael
Mikael Henaff
2 years
Exploration is well-studied for singleton MDPs, but many envs of interest change across episodes (i.e. procgen envs or embodied AI tasks). How should we explore in this case?. In our upcoming @icmlconf oral, we study this question. A thread. 1/N.
@robertarail
Roberta Raileanu
4 months
Great to see evidence that RL helps code agents learn over multiple steps of code execution and feedback. Results look impressive, congrats!
@jnsgehring
Jonas Gehring
4 months
LLMs for code should do much better if they can iterate on tests -- but they don't. Our new work (RLEF) addresses this with execution feedback at RL *training time* to use execution feedback at *inference time*. is just out! 1/6
@robertarail
Roberta Raileanu
2 months
Great to see more work on open-endedness for AI safety! This paper on automatic evaluation and capability discovery in frontier models looks very neat and practical, kudos to the authors @cong_ml @shengranhu @jeffclune. Humbled to hear Rainbow Teaming was a source of.
@jeffclune
Jeff Clune
2 months
I am very excited about this work using open-endedness methods for AI safety, which I expect will improve, speed up, and make cheaper pre-deployment assessments of frontier models, by automatically detecting unanticipated surprising new capabilities and weaknesses. Inspired by.
@robertarail
Roberta Raileanu
2 years
If you are interested in open-ended learning or self-improving models, check out the ALOE workshop at NeurIPS 2023!
@aloeworkshop
ALOE Workshop
2 years
🌱 The 2nd Agent Learning in Open-Endedness Workshop will be held at NeurIPS 2023 (Dec 10–16) in magnificent New Orleans. ⚜️. If your research considers learning in open-ended settings, consider submitting your work (by 11:59 PM Sept. 29th, AoE).
@robertarail
Roberta Raileanu
1 year
Such a cool paper! It shows how to use open-ended search to find diverse adversarial scenarios for multi-agent systems. With applications to football!
@_samvelyan
Mikayel Samvelyan
1 year
Uncovering vulnerabilities in multi-agent systems with the power of Open-Endedness! . Introducing MADRID: Multi-Agent Diagnostics for Robustness via Illuminated Diversity ⚽️. Paper: Site: Code: 🔜. Here's what it's all about: 🧵👇
@robertarail
Roberta Raileanu
1 year
Stay tuned!
@MarlosCMachado
Marlos C. Machado
1 year
I'm very excited about what @robertarail and I are cooking up!
@robertarail
Roberta Raileanu
1 year
Great to see this!
@yayitsamyzhang
Amy Zhang
1 year
Thrilled to announce the first annual Reinforcement Learning Conference @RL_Conference, which will be held at UMass Amherst August 9-12! RLC is the first strongly peer-reviewed RL venue with proceedings, and our call for papers is now available:
@robertarail
Roberta Raileanu
6 months
I won’t be attending #ICML2024 in person this year, but many @UCL_DARK members will be there, so make sure to check out their excellent work, including a number of orals and spotlights 🙀, as well as some of my own contributions: - GLoRe: When, Where, and How to Improve LLM.
@UCL_DARK
UCL DARK
7 months
DARK is going to Vienna🏰🎶! We are excited to present our work at #icml2024, including 3 Orals: Debate for Scalable Oversight, Genie, and a position paper on Open-Endedness. Come chat with us ⬇️🚀
@robertarail
Roberta Raileanu
1 year
Come chat with us about the importance of exploration for generalization in RL this afternoon at #NeurIPS2023, with @yidingjiang and @zicokolter!
@robertarail
Roberta Raileanu
1 year
2) "On the Importance of Exploration for Generalization in Reinforcement Learning" led by @yidingjiang .
@robertarail
Roberta Raileanu
6 months
Proud to work at a company that is such a great champion of open source!
@Ahmad_Al_Dahle
Ahmad Al-Dahle
6 months
With today’s launch of our Llama 3.1 collection of models we’re making history with the largest and most capable open source AI model ever released. 128K context length, multilingual support, and new safety tools. Download 405B and our improved 8B & 70B here.
@robertarail
Roberta Raileanu
2 years
2 new papers on using LLMs to generate game levels! Very cool work. Language seems like a powerful way of generating open-ended tasks for training increasingly more capable and general agents. It’s also a neat way of incorporating human priors, which are currently lacking.
@togelius
Julian Togelius
2 years
Large language models can do lots of things. But can they generate game levels? Playable game levels, where puzzles are solvable? Two papers announced today address this question. The first is by our team at @NYUGameLab - read on for more:
@robertarail
Roberta Raileanu
2 years
📢 Do you work on LLM safety, alignment, fairness, accountability, or transparency? 📜 Submit your work to the #NeurIPS2023 SoLaR workshop on Socially Responsible Language Modelling Research!
@solarneurips
SoLaR @ NeurIPS2024
2 years
We are announcing the NeurIPS 2023 workshop on Socially Responsible Language Modelling Research (SoLaR)! We welcome submissions from all disciplines promoting fairness, accountability, transparency, and safety in language modeling.
@robertarail
Roberta Raileanu
6 months
Don’t miss @HenaffMikael presenting our work today on in-context learning of sequential decision-making tasks! #ICML2024
@sharathraparthy
Sharath Raparthy
6 months
@HenaffMikael will be presenting our work “Generalization to New Sequential Decision Making Tasks with In-context Learning”. Location: Hall C4-9 #2915. #ICML2024
@robertarail
Roberta Raileanu
8 months
Great work on using LLMs + evolutionary methods to discover new and well-performing ML algorithms! Feels like we're just scratching the surface on what's possible 🚀.
@_chris_lu_
Chris Lu
8 months
Excited to share my first work from my internship @SakanaAILabs!. We used LLMs to design and implement new preference optimization algorithms for training LLMs, discovering cutting-edge methods!. Co-led with @samianholt and Claudio Fanconi. Details in thread 🧵 (1/N).
@robertarail
Roberta Raileanu
1 year
📢 Chain-of-Verification (CoVe) reduces hallucinations in LLMs by having the model verify its own answers via simple follow-up questions. Great work from @shehzaadzd during his FAIR internship!
@shehzaadzd
Shehzaad Dhuliawala
1 year
Chain-of-Verification Reduces Hallucination in LLMs (work done during my internship at FAIR). - LLMs are more likely to hallucinate during longform generation. - We show that generating verification questions for the LLM to self-answer allows the model to deliberate on its outputs 🧵
@robertarail
Roberta Raileanu
2 years
Great work on improving sample efficiency in RL via resets, which allows more updates and thus better scaling with compute. Love the simplicity of the method and the many insights in the paper.
@proceduralia
Pierluca D'Oro
2 years
Our paper "Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier" has been accepted to ICLR as top 5%! We show that lots of updates and resets lead to great sample efficiency. Come for the scaling, stay for the insights. 1/N🧵
@robertarail
Roberta Raileanu
5 years
A new challenge for RL agents based on NetHack! Grateful to have been among the first few people to try it out for research.
@_rockt
Tim Rocktäschel
5 years
I am proud to announce the release of the NetHack Learning Environment (NLE)! NetHack is an extremely difficult procedurally-generated grid-world dungeon-crawl game that strikes a great balance between complexity and speed for single-agent reinforcement learning research. 1/
@robertarail
Roberta Raileanu
5 years
Check out our poster session at #ICML2020 if you want to learn more about a new framework for fast adaptation in RL using policy-dynamics value functions. Today (Wed) at 8pm UTC and tomorrow at 9am UTC. ICML talk: Paper:
@robertarail
Roberta Raileanu
10 months
Great to see our survey on the Challenges and Applications of LLMs being cited by the White House.
@robert_mchardy
Robert McHardy 🏖️
10 months
Very cool to see our work on LLMs being cited by the 2024 Economic Report of the President (p. 256 & 371): @jeankaddour @maximilianmozes @herbiebradley @robertarail
@robertarail
Roberta Raileanu
11 months
🌈🌈🌈 Rainbow Teaming 🌈🌈🌈. Beyond excited to share Rainbow Teaming, our new method for open-ended generation of diverse adversarial prompts for LLMs. 🌟 Rainbow Teaming is a generic approach for finding prompts where the model fails according to some metric. For some domains.
@_samvelyan
Mikayel Samvelyan
11 months
Introducing 🌈 Rainbow Teaming, a new method for generating diverse adversarial prompts for LLMs via LLMs. It's a versatile tool 🛠️ for diagnosing model vulnerabilities across domains and creating data to enhance robustness & safety 🦺. Co-lead w/ @sharathraparthy & @_andreilupu
@robertarail
Roberta Raileanu
2 years
Not at ICML this year, but thankfully my collaborators are, so reach out :) @The_Eimer is presenting "Hyperparameters in RL and How to Tune Them". @HenaffMikael is presenting "A Study of Global and Episodic Bonuses in Contextual MDPs".
@The_Eimer
Theresa Eimer
2 years
The timeslot for the poster is Thursday 1:30pm in exhibit hall 1, poster 414 - see you then 👋.
@robertarail
Roberta Raileanu
9 months
Can LLMs act as embodied agents 🤖 that learn from feedback and interaction with the world? Can diffusion models be used as world models to enhance planning, RL, or control 🕹️ algorithms? Can LLMs improve exploration in open-ended environments 🌎 by acting as human priors?
@bogdan_mazoure
Bogdan Mazoure
9 months
🚨 Pass by our #ICLR2024 workshop on Generative AI for Decision Making tomorrow, Saturday May 11! 🚨. We have a content-heavy day, including an exciting lineup of invited and contributed talks, as well as two poster sessions!. Details:
@robertarail
Roberta Raileanu
1 year
🚀🚀🚀 AstroLLaMA is out, check it out!! Fantastic work from the super talented folks @universe_tbd. Can’t wait to see what new insights and discoveries this will enable, right at the intersection of some of my favorite topics. 🔭🪐🤖
@universe_tbd
UniverseTBD
1 year
We are thrilled to announce the release of AstroLLaMa, a LLM fine-tuned for astronomy ✨ it’s just a start of our journey towards harnessing the power of ML for the needs of researchers 🚀 #lettheastrollamaout .
@robertarail
Roberta Raileanu
1 year
Working at the intersection of GenAI and Decision Making?. Submit your papers to our #ICLR2024 workshop!. Deadline extended to February 9, AOE. Follow @genai4dm for official updates from the organizers.
@genai4dm
GenAI4DM Workshop
1 year
🚨 Missed the ICML deadline? . Consider submitting a short (4 pages) or long (9 pages) paper to the GenAI+Decision Making workshop at ICLR 2024: Deadline is extended to February 9, AOE.
@robertarail
Roberta Raileanu
9 months
Great to see the OpenDevin effort, presented by @gneubig at the @LLMAgents workshop at #ICLR2024!
@robertarail
Roberta Raileanu
4 years
Cool work on prioritized replay for procedurally generated environments!
@MinqiJiang
Minqi Jiang
4 years
Excited to share Prioritized Level Replay: a simple + effective way to improve generalization and sample efficiency of deep RL agents in procedurally-generated environments, where layout and dynamics can differ per episode. We call these variations "levels."
@robertarail
Roberta Raileanu
6 months
A very special moment @RL_Conference.
@pcastr
Pablo Samuel Castro
6 months
Third keynote by Andy Barto @RL_Conference, arguing that it was always RL, with a standing ovation at the end!
@robertarail
Roberta Raileanu
4 months
Movie Gen is super impressive, congrats @imisra_ and team!! Amazing to be able to read all the technical details that went into making this model.
@imisra_
Ishan Misra
4 months
So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio. One of the most exciting projects I got to tech lead during my time at Meta!
@robertarail
Roberta Raileanu
1 year
Great to see this new benchmark for general assistants that requires multi-step reasoning, planning, and tool use, posing a challenge to current LLMs. Looking forward to seeing what solves it, whether it’s scaling, a new algorithm, or Q* (whatever Q* is).
@mialon_gregoire
Grégoire Mialon
1 year
Happy to share GAIA with the community! 1/4. Joint work with @clefourrier @TechySwift @Thom_Wolf @ylecun @ThomasScialom.
@robertarail
Roberta Raileanu
2 years
Today we’re presenting NLD, a large-scale dataset of NetHack demonstrations. #NeurIPS2022.
@UCL_DARK
UCL DARK
2 years
2⃣ Dungeons and Data: A Large-Scale NetHack Dataset 🧙‍♂️🧝‍♀️. ⏰ 4:00 PM 🚩 Hall J #1028. Say 👋 to @erichammy @robertarail @_rockt @HeinrichKuttler
@robertarail
Roberta Raileanu
2 years
Very neat work on hierarchical RL that will be presented #NeurIPS2022. Key idea: use compression to decompose demonstrations into generalizable skills.
@yidingjiang
Yiding Jiang
2 years
Hierarchical RL aims to break complex tasks into simpler subtasks, but how do we judge the quality of a decomposition w/ minimal prior knowledge? In this new work at NeurIPS, we show that good decompositions arise from compression. 🧵 1/
@robertarail
Roberta Raileanu
5 years
RIDE achieves SOTA on challenging procedurally-generated MiniGrid tasks. In contrast to other exploration methods, RIDE does not suffer from diminishing intrinsic rewards during training, and it much more strongly encourages agents to interact with objects they can control. 5/5
@robertarail
Roberta Raileanu
1 year
Excited to give a talk later today at the GenPlan workshop #NeurIPS2023 on In-Context Learning for Sequential Decision-Making Tasks. Thanks to the organizers for putting together this exciting workshop! 📍 Room 238-239 ⏰ 4:00 – 4:35 PM
@robertarail
Roberta Raileanu
8 months
Exciting to see more advances in open-ended learning!
@jeffclune
Jeff Clune
8 months
I am thrilled to introduce OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code. Led by @maxencefaldor and @jennyzhangzt, with @CULLYAntoine and myself. 🧵👇