Cong Lu

@cong_ml

Followers: 1,171
Following: 978
Media: 32
Statuses: 241

Postdoctoral Research Fellow @UBC_CS, working on open-ended RL and AI for Scientific Discovery. Prev: PhD @UniofOxford, RS Intern @Waymo, @MSFTResearch!

Vancouver, British Columbia
Joined October 2019
Pinned Tweet
@cong_ml
Cong Lu
9 days
It’s been a dream of mine since I started in ML to see autonomous agents conduct research independently and discover novel ideas! 💡 Today we take a large step towards making this a reality. We introduce *The AI Scientist*, led together with @_chris_lu_ and @RobertTLange. [1/N]
@SakanaAILabs
Sakana AI
9 days
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery! From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI…
@cong_ml
Cong Lu
1 year
RL agents 🤖 need a lot of data, which they usually need to gather themselves. But does that data need to be real? Enter *Synthetic Experience Replay*, leveraging recent advances in #GenerativeAI to vastly upsample ⬆️ an agent’s training data! [1/N]
@cong_ml
Cong Lu
6 months
🚨 Model-based methods for offline RL aren’t working for the reasons you think! 🚨 In our new work, led by @anyaasims, we uncover a hidden “edge-of-reach” pathology which we show is the actual reason why offline MBRL methods work or fail! Let's dive in! 🧵 [1/N]
@cong_ml
Cong Lu
3 months
I am extremely excited to share Intelligent Go-Explore, a robust exploration framework for foundation model agents! 🤖 It was a delight to work with @shengranhu and @jeffclune on this! 📜 Paper: 🌐 Website and Code:
@jeffclune
Jeff Clune
3 months
Excited to introduce Intelligent Go-Explore: Foundation model (FM) agents have the potential to be invaluable, but struggle to learn hard-exploration tasks! Our new algorithm drastically improves their exploration abilities via Go-Explore + FM intelligence. Led by @cong_ml 🧵1/
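For readers curious how the loop works, here is a minimal Python sketch of the Go-Explore pattern with a foundation model standing in for hand-designed heuristics, as the threads above describe. The fm.choose_state, fm.act, and fm.is_interesting calls are hypothetical stand-ins, not the released API.

def intelligent_go_explore(env, fm, n_iterations=100, horizon=32):
    # Archive of promising states discovered so far. The foundation
    # model (fm) replaces Go-Explore's hand-crafted heuristics for
    # both state selection and "interestingness" judgments.
    archive = [env.reset()]
    for _ in range(n_iterations):
        state = fm.choose_state(archive)   # "go": pick a promising state
        env.restore(state)                 # assumes a resettable env
        for _ in range(horizon):           # "explore" from that state
            state, _, done, _ = env.step(fm.act(state))  # gym-style step
            if fm.is_interesting(state, archive):
                archive.append(state)
            if done:
                break
    return archive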
@cong_ml
Cong Lu
9 months
Super excited to share that I'll be starting as a postdoc @UBC_CS with @jeffclune this January, working on advancing open-endedness with large language/multimodal models and deep RL! 🤩 I'll be at NeurIPS next week and would love to discuss any of those topics. I'll also be...
@cong_ml
Cong Lu
8 months
Cya #NeurIPS2023!! It’s been a blast!
@cong_ml
Cong Lu
2 years
Delighted that our paper won the *Outstanding Paper Award* at #LDOD at #RSS2022! 🥳 Thanks to the organizers for an amazing event! Paper + Code + Data: Joint with my amazing collaborators 🥰: @philipjohnball @timrudner @jparkerholder @maosbot @yeewhye
@cong_ml
Cong Lu
2 years
Offline RL offers tremendous potential for training agents from large pre-collected datasets. However, the majority of work focuses on the proprioceptive setting. In this work we release the first public benchmark for continuous control using *visual observations*, V-D4RL. [1/N]
@cong_ml
Cong Lu
1 year
We've now released code for this project at ! We think the potential of synthetic data for sample efficiency and robustness is huge and can't wait to see what people do with it! In other news, we've extended the paper with pixel-based experiments... [1/2]
[Quoted tweet: the *Synthetic Experience Replay* thread above.]
@cong_ml
Cong Lu
9 days
I love this visualisation of where we were at the start of the project (no LaTeX, only Markdown, only a few experiments). Our current version of The AI Scientist is the worst it will ever be. 🚀🚀🚀
@_chris_lu_
Chris Lu
9 days
Although The AI Scientist still makes basic mistakes, its performance will only improve from here. We see this work as being similar to early developments in GenAI, where basic mistakes in image generation were quickly overcome. Our initial manuscripts looked like this:
@cong_ml
Cong Lu
10 months
Thank you so much - I was so incredibly fortunate to have spent these years in Oxford under your and @maosbot's supervision, and will always treasure the fun discussions and lessons learned throughout!
@yeewhye
Yee Whye Teh
10 months
Congratulations @cong_ml on defending his DPhil dissertation! Excellent work throughout the past few years, and thanks to examiners @j_foerst @_rockt!
@cong_ml
Cong Lu
1 year
Delighted that V-D4RL has been accepted at TMLR! Our benchmark and algorithms are the perfect way to start studying offline RL from pixels. As performance in proprioceptive envs saturates, it’s increasingly necessary to look further! 🧐 Here are some notable uses so far… [1/N]
@cong_ml
Cong Lu
11 months
Delighted that this piece of work was accepted to #NeurIPS2023! Excited to chat about it in New Orleans ✈️✈️
[Quoted tweet: the *Synthetic Experience Replay* thread above.]
@cong_ml
Cong Lu
8 months
Come catch us at poster #1409 now!
@cong_ml
Cong Lu
1 year
Will be presenting our spotlight at Reincarnating RL @iclr_conf on generating synthetic data for RL with diffusion models at 10:40AM tomorrow! If you can't make it, here's the pre-recorded talk: Paper: #ICLR2023 #GenerativeAI
@cong_ml
Cong Lu
4 months
Super excited to share our new work led by @JacksonMattT showing that policy guidance + trajectory diffusion models produce extremely strong RL training data! 💥💥 As an added bonus, our code comes with JAX implementations of offline RL algorithms and diffusion upsampling! 🚀
@JacksonMattT
Matthew Jackson
4 months
🎮 Introducing the new and improved Policy-Guided Diffusion! Vastly more accurate trajectory generation than autoregressive models, with strong gains in offline RL performance! Plus a ton of new theory and results since our NeurIPS workshop paper... Check it out ⤵️
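As a rough illustration of the idea in these threads: trajectory-level denoising nudged towards the target policy, in the style of classifier guidance. The tensor shapes and the policy.log_prob / model.denoise interfaces are assumptions for this sketch, not the paper's exact formulation.

import torch

def policy_guided_step(model, policy, traj, t, guidance_scale=1.0):
    # One reverse-diffusion step over a whole trajectory tensor,
    # biased towards actions the target policy assigns high
    # log-likelihood to (classifier-guidance style).
    traj = traj.detach().requires_grad_(True)
    log_prob = policy.log_prob(traj).sum()     # hypothetical interface
    grad = torch.autograd.grad(log_prob, traj)[0]
    with torch.no_grad():
        denoised = model.denoise(traj, t)      # base diffusion update
    return denoised + guidance_scale * grad    # shift towards on-policy data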
@cong_ml
Cong Lu
2 years
No better time to start on offline RL from pixels! V-D4RL is now on @huggingface at 💥 New D4RL-style visual datasets! 💥 Competitive baselines based on Dreamer and DrQ! 💥 A set of exciting open problems! Thanks @Thom_Wolf for the idea 😻
[Quoted tweet: the #RSS2022 *Outstanding Paper Award* announcement above.]
@cong_ml
Cong Lu
2 years
If you’re attending #AIIDE22 and are interested in efficient, scalable, and simple-to-implement game testing, come to the #EXAG2022 workshop where I’ll be presenting our paper on Go-Explore for automated reachability testing at 11:20AM PDT! Paper 📜:
@cong_ml
Cong Lu
8 months
Come chat to us at the #NeurIPS2023 Robot Learning Workshop in Hall B2 about policy-guided diffusion! Super exciting work showing that guided diffusion enables long-sequence on-policy synthetic data for training agents! 🚀🚀
@JacksonMattT
Matthew Jackson
8 months
Come check out a sneak peek of our work **Policy-Guided Diffusion** today at the NeurIPS Workshop on Robot Learning! Using offline data, we generate entire trajectories that are: ✅ On-policy, ✅ Without compounding error, ✅ Without model pessimism!
@cong_ml
Cong Lu
2 years
Come chat to us now at the #icml DARL workshop about simple baselines and the first public benchmark for offline continuous control from pixels! In Hall G 🥳
@cong_ml
Cong Lu
3 months
So cool to see people building agents with Intelligent Go-Explore already!! 🚀🚀🚀
@AtlantisPleb
Christopher David ⚡️
3 months
Magency: my project for the @craft_ventures agents hackathon 🤖👇 There is no good way to control AI agents on a mobile phone. There should be an app for that! Magency is a mobile app that lets you "make a wish", aka just say what you want into one text box and see a feed of…
@cong_ml
Cong Lu
20 days
Delighted that research from our lab was featured in Science News! Great read about harnessing large language models to create open-ended learning systems!
@jeffclune
Jeff Clune
20 days
OMNI-EPIC & Intelligent Go-Explore in Science News! "Both works are significant advancements towards creating open-ended learning systems," - Tim Rocktäschel. Led by @jennyzhangzt @maxencefaldor & @conglu. Quotes @j_foerst & @togelius too. Thx @SilverJacket!
@cong_ml
Cong Lu
3 years
Really excited about this recent work, featuring in #ICML2021, on meta-learning task exploration in agent belief space!
@whi_rl
WhiRL
3 years
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning - @luisa_zintgraf , @lylbfeng , @cong_ml , @MaxiIgl , @kristianhartika , @katjahofmann , @shimon8282
@cong_ml
Cong Lu
2 months
In many realistic imitation learning settings, we often have differences in observations between experts and imitators, e.g. when experts have privileged information. Excited to share our new work towards a principled Bayesian solution to resolving such imitation gaps! 😍
@ristovuorio
Risto Vuorio
2 months
Demonstrating the desired behavior is often easier than defining a good reward function. However, when the demonstrator observes the world differently than the imitator, imitation learning can fail. ➡️In our new pre-print, we propose a Bayesian solution to this imitation gap.
@cong_ml
Cong Lu
2 years
Excited to share our ICLR spotlight on revisiting 🤔 design choices in offline MBRL: later today! In the meantime, check out our light-hearted video introducing the paper: 😀 @philipjohnball @jparkerholder @maosbot, Steve Roberts
@philipjohnball
Philip J. Ball
3 years
Model-based approaches have recently shown SOTA performance in the offline RL setting, typically by penalizing regions with dynamics uncertainty. But how well are current methods actually doing this? 1/
@cong_ml
Cong Lu
3 months
Extremely excited to share our new work led by @GunshiGupta and @KarmeshYadav showing that pretrained diffusion models provide powerful vision-language representations for control tasks that drive efficiency and generalization! All code open-sourced at:
@GunshiGupta
Gunshi Gupta
3 months
Excited to be giving a contributed oral talk tomorrow at the #GenAI4DM Workshop at #ICLR2024 about our latest work harnessing pre-trained diffusion models as vision-language representation learners that excel across a wide variety of control tasks! Details in 🧵 below!
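A loose sketch of the recipe as I read the thread: run an image and a task description through a frozen text-conditioned diffusion backbone and pool intermediate activations into a state representation for a policy. Every interface here (frozen_unet, return_features) is a hypothetical stand-in, not the released code.

import torch

def vision_language_features(frozen_unet, image_latent, text_emb, t=0):
    # Hypothetical sketch: one forward pass of a frozen text-conditioned
    # denoising network; intermediate activations act as a fused
    # vision-language representation for downstream control.
    with torch.no_grad():
        feats = frozen_unet(image_latent, t, text_emb,
                            return_features=True)  # assumed flag
    return feats.flatten(start_dim=1)              # per-example feature vector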
@cong_ml
Cong Lu
3 years
Come chat to us @jparkerholder @philipjohnball at the #ICLR SSL-RL Workshop about generalisation to environments with changed dynamics from offline data on a single environment with Augmented World Models! Gathertown Link: Paper:
@cong_ml
Cong Lu
1 year
We are excited about scaling this work to more settings! This is joint work with awesome co-authors: @philipjohnball, @jparkerholder 🥰. Come chat with us at the Reincarnating RL Workshop at @iclr_conf or get in touch! Paper: [7/N]
@cong_ml
Cong Lu
2 years
Super excited by our recent work massively expanding the scope of PBT methods in RL! 💥 Joint adaptation of architecture and hyperparameters 💥 Treating the *whole* RL hyperparameter space with trust-region BO 💥 Massive improvements over the prior PBT baselines, all code online!
@wanxingchen_
Xingchen Wan
2 years
(1/7) Population Based Training (PBT) has been shown to be highly effective for tuning hyperparameters (HPs) for deep RL. Now with the advent of massively parallel simulators, there has never been a better time to use these methods! However, PBT has a couple of key problems…
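A generic sketch of the PBT skeleton this thread builds on; the usual random hyperparameter perturbation is swapped for a suggestion function, with bo_suggest standing in (hypothetically) for the trust-region Bayesian-optimization step.

import random

def pbt_step(population, bo_suggest, exploit_frac=0.25):
    # population: list of dicts with "weights", "hparams", "score".
    population.sort(key=lambda m: m["score"], reverse=True)
    k = max(1, int(len(population) * exploit_frac))
    for loser in population[-k:]:
        winner = random.choice(population[:k])
        loser["weights"] = winner["weights"]      # exploit: copy weights (deep-copy in practice)
        loser["hparams"] = bo_suggest(population) # explore: BO proposal instead of random perturbation
    return population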
@cong_ml
Cong Lu
3 years
Really excited to share our recent work with @philipjohnball, @jparkerholder and Steve Roberts on dynamics generalisation from offline data collected in a single environment! To appear as a spotlight in the #ICLR2021 SSL-RL Workshop 😃
@jparkerholder
Jack Parker-Holder
3 years
The case for offline RL is clear: we often have access to real world data in settings where it is expensive (and potentially even dangerous) to collect new experience. But what happens if this offline data doesn’t perfectly match the test environment? [1/8]
@cong_ml
Cong Lu
3 years
Come chat to us at C0 about generalisation to new tasks from offline data on a single environment with AugWM! #ICML2021
@oxcsml
OxCSML
3 years
Spotlight presentation in Reinforcement Learning 5, Wed 21 Jul 02:00–03:00 BST (Tues 6 p.m. PDT). Poster Session 2: Wed 21 Jul 04:00–07:00 BST (Tues 8 p.m.–11 p.m. PDT). @philipjohnball, @cong_ml, @jparkerholder, Stephen Roberts #ICML2021
@cong_ml
Cong Lu
4 years
Excited to present some recent work with @timrudner, @maosbot and @yaringal at the #NeurIPS2020 BDL Meetup today at 12 & 5pm GMT! Join us at !
@timrudner
Tim G. J. Rudner
4 years
A Probabilistic Perspective on Pathologies in Behavioral Cloning for Reinforcement Learning with @cong_ml, @maosbot and @yaringal 4/5
@cong_ml
Cong Lu
3 months
Come see us tomorrow at the contributed orals at the #GenAI4DM workshop, talking about leveraging text-to-image diffusion models as vision-language representation learners for control! #ICLR2024
@rl_agent
Lisa Lee
3 months
Also looking forward to the Contributed Oral Talks at #GenAI4DM workshop at #ICLR2024: Do Transformer World Models Give Better Policy Gradients? Authors: @michel_ma_ @twni2016 Clement Gehring @proceduralia @pierrelux Pretrained Text-to-Image Diffusion…
@cong_ml
Cong Lu
3 years
Come chat to us now @ D5, RL4RL workshop, about our new work revisiting uncertainty quantification in offline MBRL and showcasing new SOTA results on D4RL MuJoCo 😻 Paper: @philipjohnball @jparkerholder @maosbot, Stephen Roberts
@cong_ml
Cong Lu
8 days
Thanks for having me! Super fun discussion on The AI Scientist! ❤️
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
8 days
WE ARE STARTING IN 15 MIN! We have a great list of papers and guests! The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery - first author @cong_ml will be presenting! Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model…
@cong_ml
Cong Lu
3 years
Check out our recent work revisiting design choices in offline model-based reinforcement learning! 🤔 arXiv: With fantastic collaborators: @philipjohnball @jparkerholder @maosbot Stephen Roberts!
[Quoted tweet: Philip J. Ball's offline MBRL thread above.]
@cong_ml
Cong Lu
1 year
(cont.) with data generated in latent space! Our experiments show a solid performance gain out of the box on the standard *V-D4RL* datasets, with lots more untapped potential! Check it out here: [2/2]
@cong_ml
Cong Lu
9 days
Agreed, truly a phenomenal team to envision the future of scientific discovery with!!! 🥰🥰🥰
@RobertTLange
Robert Lange
9 days
Time to retire @SchmidhuberAI? 📹: Jokes aside - this project has been soo much fun! @_chris_lu_ and @cong_ml made this one of the best collabs I've had so far 🥰
@cong_ml
Cong Lu
1 year
Our algorithm is conceptually simple and is compatible with *any RL algorithm* utilizing experience replay! Synthetic data generated by a diffusion model may simply be added to the replay buffer and trained on as if it were real experience. [2/N]
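A minimal sketch of the mechanism described in this tweet: synthetic transitions are appended to the same buffer as real ones and sampled identically, so the downstream RL algorithm is unchanged. The generative_model.sample interface is an illustrative assumption.

import random

class ReplayBuffer:
    def __init__(self, capacity=1_000_000):
        self.data, self.capacity = [], capacity

    def add(self, transition):        # (obs, act, rew, next_obs, done)
        self.data.append(transition)
        if len(self.data) > self.capacity:
            self.data.pop(0)

    def sample(self, batch_size):
        return random.sample(self.data, batch_size)

def upsample(buffer, generative_model, n_synthetic):
    # Diffusion-generated transitions are added exactly like real ones;
    # any replay-based agent (SAC, TD3, DQN, ...) trains on them as-is.
    for transition in generative_model.sample(n_synthetic):
        buffer.add(transition)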
@cong_ml
Cong Lu
10 months
I'm also extremely grateful to @j_foerst and @_rockt for putting me through my paces and for an incredibly informative discussion! Thanks as well to all my collaborators and mentors who guided me along the way, as well as my friends and family! 🥰 Stay tuned for what's next!!
@cong_ml
Cong Lu
1 year
Agents trained with synthetic data perform as well as agents trained with much more real data! With *no algorithmic changes needed at all*, simple RL agents with synthetic data beat carefully designed data-efficient algorithms and prior data augmentation methods. [3/N]
@cong_ml
Cong Lu
6 months
We are excited about what this unified perspective of offline RL might mean for future work! To find out more, please check out our paper: , and code: . Thanks again to my amazing co-authors @anyaasims @yeewhye! 🥰🥰 [N/N]
@cong_ml
Cong Lu
2 years
We hope this work can springboard progress in this very nascent field! Work done with some awesome collaborators: @philipjohnball @timrudner @jparkerholder @maosbot @yeewhye. 🥳 [8/N]
@cong_ml
Cong Lu
9 months
... presenting our new work on efficiently training RL agents with synthetic generative data () at Poster Session 2 on Tuesday. Do come say hi! 👋
[Quoted tweet: the *Synthetic Experience Replay* thread above.]
@cong_ml
Cong Lu
6 months
These edge-of-reach states trigger catastrophic value overestimation and a complete collapse of learning! For example, this figure shows how the agent 🤖 completely ignores the reward function and instead just aims towards an arbitrary edge-of-reach state! [4/N]
@cong_ml
Cong Lu
1 year
So why was this not possible before? It turns out that small differences in sample quality with VAEs and GANs significantly affect downstream RL performance. [5/N]
@cong_ml
Cong Lu
3 months
@jsuarez5341 Agreed! Some way of selectively deferring to the FM for the “harder” parts of the env would drastically increase throughput!
@cong_ml
Cong Lu
9 days
We produce a vast archive of completed papers across both proprietary and open-weight LLMs, giving us, for the first time, a sense of their ability to partake in the entire scientific process. [3/N]
@cong_ml
Cong Lu
2 years
We further analyze challenges and opportunities unique to the pixel-based setting, including data with visual distractions. We see that our algorithms are robust to visual distractions, but only Offline DV2 generalizes to unseen distractions. Scope for future work here!! 🧐 [5/N]
@cong_ml
Cong Lu
1 year
Remarkably, we find that the synthetic samples generated by our diffusion model are simultaneously *more diverse, more novel, and more accurate* to the true environment dynamics than the best data augmentation method.💥 [4/N]
@cong_ml
Cong Lu
2 years
To kick off progress on new methods, we include strong baselines derived from the SoTA DreamerV2 and DrQ-v2 algorithms! Concretely, we adapt DreamerV2 (@danijarh) to the offline setting by introducing a penalty based on mean disagreement, resulting in Offline DV2. [2/N]
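A small numpy sketch of a mean-disagreement penalty of the kind named here (the exact penalty form and scale in Offline DV2 may differ): rewards are reduced in proportion to how much an ensemble of dynamics models disagrees about the next latent state.

import numpy as np

def penalized_reward(ensemble_preds, reward, penalty_scale=1.0):
    # ensemble_preds: (n_models, latent_dim) predictions of the next
    # latent state for one (state, action) pair.
    mean_pred = ensemble_preds.mean(axis=0)
    disagreement = np.linalg.norm(ensemble_preds - mean_pred, axis=-1).mean()
    return reward - penalty_scale * disagreement  # pessimism where models disagree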
@cong_ml
Cong Lu
2 years
This work was done during an awesome MAX internship over the summer, hosted jointly by @MSFTResearchCam and @XboxStudio in the very lovely Cambridge. I’m super grateful to my hosts @ralgeorgescu and @rookboom, and all the friends made along the way! 🥰🥰
@cong_ml
Cong Lu
6 months
For context: In offline MBRL, existing methods usually assume that any issues are due to model errors. ➡️ Therefore, approaches are based around preventing model exploitation. However, using the perfect “oracle” dynamics causes existing methods to completely fail! 📉 [2/N]
@cong_ml
Cong Lu
2 years
We found our base algorithms already represent a strong multitask baseline, opening the door to training generalist agents from offline data. 🤯 This could be because we can directly distinguish between different tasks instead of relying on explicit meta or multitask algos! [6/N]
@cong_ml
Cong Lu
2 years
💥 ML Research Opportunity for under-represented undergrads at Oxford! 💥 Would appreciate help sharing this widely! UNIQ+ is an awesome way to spend two months getting stuck into ML at great groups @oxcsml @CompSciOxford. See proposed projects here:
@cong_ml
Cong Lu
6 months
But existing methods work?! We discuss how they inadvertently address edge-of-reach states despite being motivated by model error. Can we directly target the true problem? Yes! We introduce RAVL, which precisely corrects edge-of-reach states using value pessimism. [5/N]
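A compact sketch of ensemble value pessimism in the spirit of this thread (illustrative, not RAVL's exact objective): edge-of-reach states never receive Bellman backups, so their ensemble value estimates stay high-variance, and taking a conservative statistic over the ensemble suppresses the resulting overestimation.

import torch

def pessimistic_td_target(q_ensemble, next_obs, next_act, reward, done, gamma=0.99):
    # q_ensemble: iterable of networks mapping (obs, act) -> Q-value.
    with torch.no_grad():
        qs = torch.stack([q(next_obs, next_act) for q in q_ensemble])
        q_conservative = qs.min(dim=0).values  # penalize ensemble disagreement
    return reward + gamma * (1.0 - done) * q_conservative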
@cong_ml
Cong Lu
2 years
Towards this goal, we were curious to see how our baselines scaled with more data. Interestingly, the RL methods scale far better than BC, with gains of >30% compared to 10% when we go from 100K samples to 500K! This may have implications for when we scale even further! 🚀 [7/N]
@cong_ml
Cong Lu
8 days
@Hoper_Tom @jeffclune Thank you for kindly sharing these works! We will discuss these in the updated version of the paper, and also look forward to integrating the insights from your paper into our work!
@cong_ml
Cong Lu
6 months
So, what is going on? We show that there exist “edge-of-reach” states which are used in training but which the agent can never sample actions from *even with unlimited model-based data collection* (as illustrated below). [3/N]
@cong_ml
Cong Lu
2 years
We adapt the algorithm DrQ-v2 (@denisyarats) by adding an adaptive behavioral cloning term similar to TD3+BC, resulting in DrQ+BC. We also include a CQL and BC implementation in the same codebase. [3/N]
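For concreteness, the TD3+BC-style actor loss referenced here (this is the published TD3+BC form; its use inside DrQ+BC may differ in detail): the Q term is normalized by its own average magnitude, so the behavioral cloning term keeps a consistent relative scale.

import torch.nn.functional as F

def td3_bc_actor_loss(actor, critic, obs, dataset_actions, alpha=2.5):
    pi = actor(obs)
    q = critic(obs, pi)
    lam = alpha / q.abs().mean().detach()  # adaptive BC weighting
    return -lam * q.mean() + F.mse_loss(pi, dataset_actions)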
@cong_ml
Cong Lu
1 year
Another key benefit we observe on some algorithms is the ability to scale up the network size and obtain better performance! RL algorithms are typically limited to training with very small networks, and synthetic data could lift this restriction! [6/N]
@cong_ml
Cong Lu
9 days
Given a starter research area (e.g. diffusion, language modeling), The AI Scientist generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a review process for eval. [2/N]
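Schematically, the stages named in this tweet chain together roughly as below; every helper name is a placeholder for an LLM-driven step, not the released codebase.

def run_experiments(code):   # placeholder: execute generated experiment code
    ...

def visualize(results):      # placeholder: produce result figures
    ...

def ai_scientist(research_area, llm):
    idea = llm.generate_idea(research_area)          # ideation
    code = llm.write_code(idea)                      # implementation
    results = run_experiments(code)                  # execution
    figures = visualize(results)                     # visualization
    paper = llm.write_paper(idea, results, figures)  # full write-up
    review = llm.review(paper)                       # automated review for eval
    return paper, review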
@cong_ml
Cong Lu
2 years
We find that the model-based Offline DV2 performs best on datasets with diverse data, DrQ+BC deals well with mixed but high-reward data, and BC is best on the expert datasets. CQL is also a strong option for high-reward data. [4/N]
@cong_ml
Cong Lu
2 years
@luisa_zintgraf @shimon8282 @katjahofmann Huge congratulations!! 🥳🥳🥳
@cong_ml
Cong Lu
6 months
RAVL is theoretically principled and reaches state-of-the-art on D4RL and pixel-based V-D4RL without any explicit dynamics penalty! 📈 Furthermore, these insights serve to correct and unify our understanding of offline RL across model-free and model-based approaches. [6/N]
@cong_ml
Cong Lu
7 months
@jsuarez5341 @arankomatsuzaki We found this as well in a project that sounds v close to both these efforts! See Table 3 of where we show diffusion synthetic data can help scale the network sizes in TD3 :)
@cong_ml
Cong Lu
1 year
@Stone_Tao Thanks! At the moment, it's roughly 50/50 for diffusion vs. RL training. Big potential for speed-ups there though. DMC is on proprioceptive, visual transitions incoming! :)
@cong_ml
Cong Lu
1 year
Agent-controller representations: principled offline RL with rich exogenous information () @riashatislam @manan_tomar, learning how to handle the rich irrelevant information commonly found in pixel-based datasets! [2/N]
@cong_ml
Cong Lu
1 year
@vladkurenkov @shaneguML @ML_is_overhyped Super cool work!! We also found deeper networks to help in TD3+BC, esp. with synthetic data ;) (Table 3 of )
@cong_ml
Cong Lu
3 months
@Abel_TorresM @jeffclune We discuss NetHack in the conclusion! 🚀
@cong_ml
Cong Lu
5 months
@percyliang @siddkaramcheti RoBERTa, rejected from ICLR, 12k cites
@cong_ml
Cong Lu
1 year
@Stone_Tao Yes, exactly :)
@cong_ml
Cong Lu
5 months
@TesfayZemuy Yes, they were certainly slower. No reason not to use diffusion models instead as the generative model :)
@cong_ml
Cong Lu
5 months
@tw_killian Congrats Dr!