nouhadziri Profile Banner
Nouha Dziri Profile
Nouha Dziri

@nouhadziri

Followers
3K
Following
4K
Media
101
Statuses
838

Research Scientist @allen_ai, PhD in NLP 🤖 UofA. Ex @GoogleDeepMind @MSFTResearch @MilaQuebec 🚨🚨 NEW BLOG about LLMs reasoning: https://t.co/Ox0iOaqY7e

Seattle, US
Joined February 2011
Don't wanna be here? Send us removal request.
@nouhadziri
Nouha Dziri
2 years
🚀📢 GPT models have blown our minds with their astonishing capabilities. But, do they truly acquire the ability to perform reasoning tasks that humans find easy to execute? NO⛔️. We investigate the limits of Transformers *empirically* and *theoretically* on compositional tasks🔥
Tweet media one
38
332
1K
@nouhadziri
Nouha Dziri
2 years
Ok, now you can call me Dr 🥳🥳 !! I just defended my PhD virtually and I'm so grateful for the unlimited support from friends, family, and mentors who always believed in me even in moments of doubt.
46
5
356
@nouhadziri
Nouha Dziri
1 year
it’s happening now!!! please stop by
Tweet media one
@nouhadziri
Nouha Dziri
1 year
Faith and fate paper: Limits of Transformers on Compositionality (Spotlight). 🗓️ Tue 12 Dec 11:45 am - 1:45 pm EST .📍 Great Hall & Hall B1+B2 .🆔 #421.
4
14
303
@nouhadziri
Nouha Dziri
2 years
Life news💫💫 I’m joining @allen_ai to work with the incredible @YejinChoinka as a postdoc. I have been a huge fan of Yejin for years, her work and talks inspire me hugely, I always admire her adventurous research vision, and I have never thought I would be able.
24
7
213
@nouhadziri
Nouha Dziri
2 years
We show that Transformers' successes are heavily linked to having seen significant portions of the required computation graph during training! This revelation, where models reduce multi-step reasoning into subgraph matching, raises questions about sparks of AGI claims 🦄
Tweet media one
1
32
194
@nouhadziri
Nouha Dziri
7 months
🚀💫Excited to announce the **AI2 Safety Toolkit**: an open and transparent research initiative dedicated to advancing LLMs safety. @allen_ai
Tweet media one
16
41
149
@nouhadziri
Nouha Dziri
2 years
We find that GPT3, ChatGPT, and GPT4 cannot fully solve compositional tasks even with in-context learning, fine-tuning, or using scratchpads. To understand when models succeed, and the nature of the failures, we represent a model’s reasoning through computation graphs.
Tweet media one
4
14
156
@nouhadziri
Nouha Dziri
3 years
🚀🚀 Super happy that my work at @GoogleAI "Evaluating attribution in dialogue systems: the BEGIN benchmark" got accepted at TACL 🥳 This is a work with wonderful collaborators Hannah Rashkin, @davidswelt and @tallinzen. Stay tuned for more details but in short: 👇.
8
17
143
@nouhadziri
Nouha Dziri
6 months
📢Super excited that our workshop "System 2 Reasoning At Scale" was accepted to #NeurIPS24, Vancouver! 🎉.🎯 how can we equip LMs with reasoning, moving beyond just scaling parameters and data?. Organized w. @stanfordnlp @MIT @Princeton @allen_ai @uwnlp . 🗓️ when? Dec 15 2024
Tweet media one
2
25
146
@nouhadziri
Nouha Dziri
10 days
Interested in knowing more about LLMs agents and in contributing to this topic?🚀. 📢We're thrilled to announce REALM: The first Workshop for Research on Agent Language Models 🤖 #ACL2025NLP in Vienna 🎻.We have an exciting lineup of speakers . 🗓️ Submit your work by *March 1st*
Tweet media one
2
32
143
@nouhadziri
Nouha Dziri
2 months
And that was a wrap #NeurIPS2024 was intense, fast-paced, rich, packed🔥Super happy with the success of Sys2 Reasoning: a true concentration of top AI figures who pioneered the field @Yoshua_Bengio @DBahdanau @jaseweston @fchollet @MelMitchell1 @dawnsongtweets Joshua Tenenbaum👇
Tweet media one
7
15
127
@nouhadziri
Nouha Dziri
2 years
To understand where Transformers fail, we analyze the errors made by transformers at different layers of the computation graph. We find that while models can execute single-step reasoning accurately, they struggle to plan and combine multiple steps for correct overall reasoning.
Tweet media one
3
9
111
@nouhadziri
Nouha Dziri
15 days
Reasoning, search & planning are trending these days so I'm thrilled to announce our #ICLR2025 Workshop on LLMs Reasoning and Planning to answer your most burning questions with our exciting lineup of speakers🔥 . ⏰Submit your work by Feb 2nd. 📄Details:
Tweet media one
1
19
108
@nouhadziri
Nouha Dziri
1 year
📢 Can generative models understand what they generate? The answer is no ⛔️.Check out our new work: .🔥the Generative AI Paradox🔥. We run extensive experiments investigating *generation* vs. *understanding* in generative models, across both language and image modalities.
@PeterWestTM
Peter West
1 year
Richard Feynman said “What I cannot create, I do not understand”💡. Generative Models CAN create, i.e. generate, but do they understand? Our 📣new work📣 finds that the answer might unintuitively be NO🚫 We call this the.💥Generative AI Paradox💥. paper:
Tweet media one
1
10
98
@nouhadziri
Nouha Dziri
2 years
We also provide theoretical insights into why models perform worse in compositional tasks as the problem size increases: whether you need independent applications or iterated applications of the same function to solve a task, the probability of error increases exponentially.
2
8
98
@nouhadziri
Nouha Dziri
4 months
Happy that WildGuard got accepted at #NeurIPS2024 D&B🚀🎉🎉🎉🎉 . Make your LLM safer by using our safety toolkit:. ⚔️WildGuard: .🦁WildTeaming:🔧 Evaluation suite:
Tweet media one
@nouhadziri
Nouha Dziri
7 months
Now, let's attack 🔥🔥WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs 🔥🔥
Tweet media one
0
7
94
@nouhadziri
Nouha Dziri
3 years
Made it safely to Punta Cana for #EMNLP2021 and what an amazing view from my room! Excited to meet physically again and chat :) Will be also available virtually. Please reach out to me if you'd like to chat about exciting topics in text generation.
Tweet media one
5
1
86
@nouhadziri
Nouha Dziri
4 years
Super honoured to be listed as an "Outstanding Reviewer" for #ACL2021 🥳 I hope everyone receives a fair judgement for the priceless efforts they put into each project.
1
3
81
@nouhadziri
Nouha Dziri
1 year
#NeurIPS2023 is fast approaching😍and I'm excited to tell you more about "Faith and Fate" (spotlight) on Dec 12. Dive into an overview by reading our blog post: 📢Check out new grokking results: 👩‍💻code: .
@nouhadziri
Nouha Dziri
2 years
🚀📢 GPT models have blown our minds with their astonishing capabilities. But, do they truly acquire the ability to perform reasoning tasks that humans find easy to execute? NO⛔️. We investigate the limits of Transformers *empirically* and *theoretically* on compositional tasks🔥
Tweet media one
1
17
78
@nouhadziri
Nouha Dziri
4 months
📢For a long time I’ve delayed writing blogs and was bothered by my views cropped in media articles if they ever went live.🥳So I’m starting my own blog where I freely write about topics I’m excited about. 🧐what o1 models are under the hood? Have they cracked human reasoning?
Tweet media one
8
12
78
@nouhadziri
Nouha Dziri
1 year
Is *RLHF* and other alignment techniques enough to represent diverse human values? NO🚫. Check out our work where we propose a roadmap to pluralistic alignment & discuss fundamental limitations of exiting methods. w @uwnlp @allen_ai @stanfordnlp @MIT.
@ma_tay_
Taylor Sorensen
1 year
🤔How can we align AI systems/LLMs 🤖 to better represent diverse human values and perspectives?💡🌍. We outline a roadmap to pluralistic alignment with concrete definitions for how AI systems and benchmarks can be pluralistic!. First, models can be…
Tweet media one
1
8
61
@nouhadziri
Nouha Dziri
4 years
Super happy to share that our work "Decomposed Mutual Information Estimation for Contrastive Representation Learning" got accepted at #ICML2021 🥳 .A huge thanks to my incredible mentors at @MSFTResearch MTL @murefil, @temporaer, Geoff Gordon!!!.
4
4
72
@nouhadziri
Nouha Dziri
6 years
I’m very honored to have met Jeff Dean @JeffDean during #WiML2018. I had a fun discussion with you and I received many insightful advice for my academic career. Thank you! #NeurIPS2018
Tweet media one
1
5
75
@nouhadziri
Nouha Dziri
2 years
📢📢 We are happy to announce that our workshop "Document-Grounded Dialogue and Conversational Question Answering" will be given at @aclmeeting 2023 in Toronto! 🥳.Link:
1
17
73
@nouhadziri
Nouha Dziri
3 years
Finally, BERT is named after me😂😂.
@KhalilMrini
Dr. Khalil خليل ⵅⴰⵍⵉⵍ
3 years
#Algeria's Darja 🇩🇿 now has its own language model: "DziriBERT: a Pre-trained Language Model for the Algerian Dialect" Congratulations @amine_abdaoui @MouhamedBerrimi on your great work!.
3
2
68
@nouhadziri
Nouha Dziri
2 months
🎊Excited for #neurips2024 and our "System 2 Reasoning at Scale" workshop. We have an excited lineup of speakers who will answer your most burning questions about AI and reasoning 🚀. 🔥Got spicy questions? Submit & vote here:.
3
10
70
@nouhadziri
Nouha Dziri
5 years
Words cannot simply express the grief that I'm feeling for this terrible loss of life and incredible talent! I still cannot believe what happened and how we lost Pouneh and Arash, a lovely and talented couple😢💔
1
1
69
@nouhadziri
Nouha Dziri
1 year
It was an honor to give a guest lecture today at the "Generative AI" class taught by @adjiboussodieng at the Princeton University. This lecture was a timely opportunity to discuss the concern about dangerous AI. More details below👇
Tweet media one
1
4
68
@nouhadziri
Nouha Dziri
4 years
📢 Excited to share our new preprint💥:. "Evaluating Groundedness in Dialogue Systems: The BEGIN Benchmark" 🤖. w/ Hannah Rashkin, Tal Linzen (@tallinzen) and David Reitter.[1/6]
Tweet media one
2
9
64
@nouhadziri
Nouha Dziri
3 years
Super thrilled about our paper being accepted at #EMNLP2021 main conference🥳🥳🥳 where we reduce hallucination in knowledge-grounded dialogue systems. A big shoutout to my collaborators for the great job @bose_joey, @AndreaMadotto and @ozaiane!!! Stay tuned for the camera-ready!.
2
4
65
@nouhadziri
Nouha Dziri
4 months
I won't be in #COLM2024 this year but checkout our poster about ✨LLM’s perception of world cultures✨. TDLR; LLMs fall short in capturing the full spectrum of global cultural nuances, leading to biases and uneven degree of diversity in their generations.
Tweet media one
3
5
66
@nouhadziri
Nouha Dziri
2 years
Check out more details here:
5
4
57
@nouhadziri
Nouha Dziri
4 months
🚨A new blog post✨"Safety Alignment is superficial".Discover why is that the case and learn why investing in safety is a priority and should not be taken as an afterthought⚠️. The California safety bill SB 1047 has created a big controversy between supporters and opponents, why?
Tweet media one
5
15
63
@nouhadziri
Nouha Dziri
1 year
✈️ I won't be at EMNLP this week but will be at #NeurIPS2023 next week. Super excited🥳to be presenting 3 papers including 2 spotlights. DM me if you want to chat about science of LMs, interpretability, and data. Check out details👇.
4
0
63
@nouhadziri
Nouha Dziri
1 year
📢Can LMs follow their own proposed rules? No!. LMs excel at rule generation but display puzzling inductive reasoning behavior, a sharp contrast to humans. 💫This work was led by my intern @Linlu who did a excellent job during the summer at Mosaic.💫. 📜
@linluqiu
Linlu Qiu
1 year
How good are LMs at inductive reasoning? How are their behaviors similar to/contrasted with those of humans?. We study these via iterative hypothesis refinement. We observe that LMs are phenomenal hypothesis proposers, but they also behave as puzzling inductive reasoners:. (1/n)
Tweet media one
2
11
57
@nouhadziri
Nouha Dziri
3 years
📢 Excited to share our new work 💥. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 📄 🌐 👩‍💻 joint work w. @sivareddyg, @PontiEdoardo, @ehsk0, @ozaiane, Mo Yu, Sivan Milton.#NLProc.
2
18
60
@nouhadziri
Nouha Dziri
6 years
Super excited to present our paper "Evaluating Coherence in Dialogue Systems using Entailment" at #NAACL2019 with my great collaborators @ehsk0, @korymath and @ozaiane. What a great feeling to head into the weekend with :) @ualbertaScience @AmiiThinks @NAACLHLT.
5
8
57
@nouhadziri
Nouha Dziri
2 years
📢📢 Happy to announce our newest work: Self-Refine. LLMs can now improve without any additional training data, RL or human intervention! .They repeatedly refine their outputs through self-feedback, improving performance on 7 tasks. 📜 Paper:
@aman_madaan
Aman Madaan
2 years
Can LLMs enhance their own output without human guidance? In some cases, yes! With Self-Refine, LLMs generate feedback on their work, use it to improve the output, and repeat this process. Self-Refine improves GPT-3.5/4 outputs for a wide range of tasks.
Tweet media one
2
3
56
@nouhadziri
Nouha Dziri
2 months
Come to the #NeurIPS2024 inference-time scaling tutorial and join us for the panel for insightful and 🌶️🌶️ discussions . ⏰ when? Tue. Dec 10 @ 1:30 PM ET. This is so exciting!.
@wellecks
Sean Welleck
2 months
We're incredibly honored to have an amazing group of panelists: @agarwl_ , @polynoamial , @BeidiChen, @nouhadziri, @j_foerst , with @uralik1 moderating. We'll close with a panel discussion about scaling, inference-time strategies, the future of LLMs, and more!
Tweet media one
0
8
52
@nouhadziri
Nouha Dziri
2 months
Super excited today for the System 2 Reasoning at Scale workshop, come join us to discover how to equip AI systems with reasoning that's optimized for renewable energy and not fossil fuel 🔥🚀. ⏰When? today, 9am-5:30pm .📍West Ballroom B. #NeurIPS2024
Tweet media one
2
6
51
@nouhadziri
Nouha Dziri
3 years
Excited about this talk tomorrow, come join!!.
@stanfordnlp
Stanford NLP Group
3 years
At tomorrow's NLP Seminar, we are delighted to host Nouha Dziri (@nouhadziri), who will be talking about her work toward building faithful conversational models. Join us over zoom tomorrow at 11 am PT. Registration: Abstract:
Tweet media one
0
4
46
@nouhadziri
Nouha Dziri
1 year
Ok after a hectic start to the year with all the deadlines, happy to share that I have 3 papers accepted at ICLR, 1 Oral 🥳and 2 posters. So see you in Viennaa🎵🎻🎹. See details 👇.
6
0
44
@nouhadziri
Nouha Dziri
4 months
🚀🚀Super excited that WildTeaming got accepted at #neurips2024 (main track)🎉🎉🎉🎉🎉. See you in Vancouver to discuss LLMs safety, red-teaming, reasoning and more. Also stay tuned for a safety blogpost coming out early next week🚀🚀 #LLMs.
@liweijianglw
Liwei Jiang
7 months
We introduce 🦁WildTeaming🦁, an automatic red-team framework to compose human-like adversarial attacks using diverse jailbreak tactics devised by creative and self-motivated users in-the-wild. (2/N)
Tweet media one
2
5
43
@nouhadziri
Nouha Dziri
4 years
@lexfridman This is not a conflict, stop white-washing and hiding Israel's war crimes. How can you describe it as a conflict when heavily armed police forces are murdering, bombing unarmed Palestinians?!!.
5
2
39
@nouhadziri
Nouha Dziri
6 years
New work "Evaluating Coherence in Dialogue Systems using Entailment" to appear at @NAACLHLT 2019 @ehsk0 @korymath @ozaiane. Our approach measures coherence in dialogs allowing an unbiased and a fast evaluation. Coda/data: Paper:
0
4
39
@nouhadziri
Nouha Dziri
4 years
😂😂.
@derGeistbot
geist
4 years
the abstract the paper
Tweet media one
Tweet media two
1
3
37
@nouhadziri
Nouha Dziri
1 year
Happening now. Come chat about fine-grained RLHF!!. Hall B2.#324
Tweet media one
@nouhadziri
Nouha Dziri
1 year
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training (Spotlight). 🗓️ Tue 12 Dec 6:15 pm - 8:15 pm EST.📍 Great Hall & Hall B1+B2.🆔 #324.
1
4
37
@nouhadziri
Nouha Dziri
2 months
Come to the inference-time tutorial today🔥. ⏰When? Today at 1:30pm PST.📍Where? West Exhibition Hall C. #NeurIPS2024
Tweet media one
@nouhadziri
Nouha Dziri
2 months
Come to the #NeurIPS2024 inference-time scaling tutorial and join us for the panel for insightful and 🌶️🌶️ discussions . ⏰ when? Tue. Dec 10 @ 1:30 PM ET. This is so exciting!.
3
6
37
@nouhadziri
Nouha Dziri
2 years
@ThomasMiconi Hi Thomas, the primary focus of Zhou et al. is to explore possibilities for improving the model's performance without necessarily aiming for complete mastery of the task. While they have indeed made enhancements compared to baseline approaches, they did not solve the task on OOD.
1
0
36
@nouhadziri
Nouha Dziri
4 months
🔔New work on measuring the risks of human .over-reliance on LLM expressions of uncertainty. The more capable LLMs the better they are at deceiving people. We introduce an evaluation framework that measures when humans rely on LLM generations. 👇🧵.w. @stanfordnlp @allen_ai CMU
Tweet media one
@KaitlynZhou
Kaitlyn Zhou ✈️ CSCW, EMNLP!
4 months
How can we best measure the consequences of LLM overconfidence?. ✨New preprint✨ on measuring the risks of human over-reliance on LLM expressions of uncertainty: w/@JenaHwang2 @xiangrenNLP @nouhadziri @jurafsky @MaartenSap @stanfordnlp @allen_ai #NLPproc
Tweet media one
1
3
36
@nouhadziri
Nouha Dziri
2 months
We’ve kicked off the workshop! Join us in 📍West Ballroom B, and don’t forget to share your burning questions for the panel here:. 🌶️🔥Questions: #NeurIPS2024.
@nouhadziri
Nouha Dziri
2 months
Super excited today for the System 2 Reasoning at Scale workshop, come join us to discover how to equip AI systems with reasoning that's optimized for renewable energy and not fossil fuel 🔥🚀. ⏰When? today, 9am-5:30pm .📍West Ballroom B. #NeurIPS2024
Tweet media one
0
7
35
@nouhadziri
Nouha Dziri
4 months
⏳⏳⏳Only 3 days left to submit your reasoning work to our "System 2 Reasoning At Scale" workshop. #NeurIPS24. ⏲️🚨🚨🚨Deadline: Sept 23, 2024, AOE.
@nouhadziri
Nouha Dziri
6 months
📢Super excited that our workshop "System 2 Reasoning At Scale" was accepted to #NeurIPS24, Vancouver! 🎉.🎯 how can we equip LMs with reasoning, moving beyond just scaling parameters and data?. Organized w. @stanfordnlp @MIT @Princeton @allen_ai @uwnlp . 🗓️ when? Dec 15 2024
Tweet media one
1
4
35
@nouhadziri
Nouha Dziri
2 months
l had the pleasure yesterday to lead the agentic AI safety sessions in @mozilla @columbia safety convening with @sayashk and meet with the🇫🇷French Minister of AI, Madame Clara Chappaz with whom I discussed the urgency of regulatory frameworks to enable safe deployment of agents👇
Tweet media one
1
1
33
@nouhadziri
Nouha Dziri
2 years
Excited to be at #ACL2023NLP Toronto in person🇨🇦 I'm presenting 2 papers🥳: 1 TACL poster on Monday, 1 oral on Tuesday. Happy to meet and chat!!. Here are some details:.
1
0
33
@nouhadziri
Nouha Dziri
2 months
Come to the Language Gamification workshop to learn about multi-agent learning, LLMs search, inference-time algorithms, the talks are really high quality and the speakers are top-notch🔥. ⏰When? today .📍Where? West Meeting Room 220-222. #NeurIPS2024
Tweet media one
@nouhadziri
Nouha Dziri
2 months
Come to my talk about in-context learning, inference-time algorithms and limits of Transformers LLMs in reasoning tasks🔥. ⏰When? today Dec 14.📍Where? West Meeting Room 220-222.
1
7
33
@nouhadziri
Nouha Dziri
6 years
Heading to Florence for @ACL2019_Italy 🇮🇹. Excited to meet great researchers and to explore the sheer amount of inspirational NLP papers in different areas. Ping me if you want to chat about dialogue :).
0
0
28
@nouhadziri
Nouha Dziri
2 years
📢Delayed announcement: happy to share that FaithDial has been accepted at TACL 🚀🥳 Please consider using this dataset if you want to avoid hallucination in your response generation. Paper: .Dataset: Code:
@nouhadziri
Nouha Dziri
3 years
📢 Excited to share our new work 💥. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 📄 🌐 👩‍💻 joint work w. @sivareddyg, @PontiEdoardo, @ehsk0, @ozaiane, Mo Yu, Sivan Milton.#NLProc.
5
8
30
@nouhadziri
Nouha Dziri
2 years
Congratulations @ai2_mosaic🥳🚀🔥🔥 1 best paper award and 3 outstanding papers at #ACL2023NLP .Congrats to all the authors!!! @jmhessel @YejinChoinka @MaartenSap @melaniesclar @xiang @PeterWestTM @alsuhr @jenahwan et al.!!.
@allen_ai
Ai2
2 years
Congratulations to the team, including AI2ers, who worked on "Do Androids Laugh at Electric Sheep? Humor 'Understanding' Benchmarks from The New Yorker Caption Contest" — selected for a Best Paper Award at #ACL2023!
1
0
30
@nouhadziri
Nouha Dziri
3 years
📢 Excited to share our new work *Neural Path Hunter* (NPH) #EMNLP2021 . Paper: NPH focuses on enforcing faithfulness in KG-grounded dialogue systems by refining hallucinations via queries over a k-hop subgraph [1/6]. w @bose_joey @AndreaMadotto @ozaiane
Tweet media one
3
5
29
@nouhadziri
Nouha Dziri
11 months
Reward models are the essence of success in RLHF, yet there has been little focus on evaluating them 😬.We introduce RewardBench💥 the first benchmark for reward models. We evaluated 30+ of the existing RMs (w/ DPO) and created new datasets. Discover lots of insightful analyses👇.
@natolambert
Nathan Lambert
11 months
Excited to share something that we've needed since the early open RLHF days: RewardBench, the first benchmark for reward models. 1. We evaluated 30+ of the currently available RMs (w/ DPO too). 2. We created new datasets covering chat, safety, code, math, etc. We learned a lot.
Tweet media one
Tweet media two
Tweet media three
0
0
28
@nouhadziri
Nouha Dziri
1 year
📢🔥New work: We introduce a dataset (*ValuePrism*) and model (*Kaleido*) to engage AI with human values, rights, and duties in joint CS + Philosophy work with many wonderful people. Paper: .Demo:
@ma_tay_
Taylor Sorensen
1 year
Human values are crucial to human decision-making. We may consider health or environmental responsibility when deciding to bike to work, or weigh loyalty against honesty when deciding if we should lie to protect a friend. (0/n)
Tweet media one
Tweet media two
1
7
30
@nouhadziri
Nouha Dziri
9 months
Super excited be in Vienna #ICLR2024 next week 🎻🎵🎹😍 Will be presenting 1 oral and 2 posters. Please DM to talk about these works and also *SAFETY* in LMs. Details👇.
1
1
29
@nouhadziri
Nouha Dziri
2 years
🚀🚀 Super happy that our work got accepted at #ACL2023NLP main conference. TDLR; we highlight the pitfalls (rigidity of lexical matching) and brittleness (inability to spot hallucination) of existing evaluation mechanisms in information-seeking QA.🔥🔥.See you all in Toronto🥳.
@ehsk0
Ehsan Kamalloo
2 years
🔥🔥 Our work "Evaluating Open-Domain Question Answering in the Era of Large Language Models" w/ @nouhadziri @claclarke @DavoodRafiei got accepted to the #ACL2023NLP main conference. Stay tuned 🎸 for more details!. #ACL2023 #ACL2023Toronto #NLProc.
0
1
29
@nouhadziri
Nouha Dziri
2 months
Come to hear about WildGuard🔥. 📍East Exhibit Hall A-C #4208
Tweet media one
@nouhadziri
Nouha Dziri
2 months
Present at the main conf the safety toolkit with the amazing team:. * WildGuard: ⏰12 Dec 4:30-7:30 PM.* WildTeaming: ⏰13 Dec 4:30-7:30 PM. Use our tools:.⚔️Moderation Tools: 🦁 Red-teaming: 🔎Evaluation: 👇.
1
2
28
@nouhadziri
Nouha Dziri
5 years
End of a wonderful experience at @MSFTResearch Montreal with a super talented and amazing team @shikhar_warlock @murefil @temporaer and Goeff Gordon !! Looking forward to the next steps and to the surprises the new year is bringing! Happy holidays everyone🎉
Tweet media one
0
3
27
@nouhadziri
Nouha Dziri
1 year
@natolambert @hamishivi This aligns with our recent work "Fine-Grained RLHF" which will be presented at #NeurIPS2023 led by the great @zeqiuwu1 .A framework that enables training and learning from different reward functions (factual incorrectness, irrelevance, toxicity, etc).
1
3
27
@nouhadziri
Nouha Dziri
4 years
📢 Excited to share our paper *DEMI* #ICML2021. Paper: DEMI aims to decompose a hard estimation problem into a smaller subproblems that can be potentially solved with less bias. w/ @murefil, @temporaer, @RemiTachet,@philip_bachman, Goeff Gordon [1/n].
3
4
26
@nouhadziri
Nouha Dziri
2 months
Come to my talk about in-context learning, inference-time algorithms and limits of Transformers LLMs in reasoning tasks🔥. ⏰When? today Dec 14.📍Where? West Meeting Room 220-222.
@MLLanguageGames
Language Gamification Workshop @ NeurIPS 2024
2 months
Talk 1: In-context Learning in LLMs: Potential and Limits. 🎤 @nouhadziri .🕕 08:30-09:10. 🧵2/8.#NeurIPS2024
Tweet media one
2
2
26
@nouhadziri
Nouha Dziri
5 years
✈️Heading to San Francisco to give a talk about evaluating open-ended dialogue systems🤖 at the #RasaDevSummit. Check out the exciting agenda and the awesome speakers here: @Rasa_HQ
Tweet media one
0
4
26
@nouhadziri
Nouha Dziri
3 years
@SashaMTL @emnlpmeeting People from the Middle East receive all sorts of discrimination and racism. They're even banned to enter the US and if they receive a study visa, they won't be able to leave the country or see their families. Why no one speaks about discrimination against them?.
2
1
24
@nouhadziri
Nouha Dziri
3 years
I'm deeply saddened to know that Canadian universities are banning international travel to conferences. Knowing that #EMNLP2021 will be in-person this year was what kept us (me and many friends) motivated to do research more enthusiastically.
1
3
25
@nouhadziri
Nouha Dziri
5 years
Had so much fun today presenting a talk about dialogue evaluation at @DeepMindAI Montreal :) Thanks @korymath for being such a great host !!!.
0
1
25
@nouhadziri
Nouha Dziri
1 year
Stop by our "Faith and Fate" poster tomorrow 11:45 am - 1:45 pm EST . 📍 Great Hall & Hall B1+B2 .🆔#421. Details:
@nouhadziri
Nouha Dziri
1 year
Faith and fate paper: Limits of Transformers on Compositionality (Spotlight). 🗓️ Tue 12 Dec 11:45 am - 1:45 pm EST .📍 Great Hall & Hall B1+B2 .🆔 #421.
0
2
25
@nouhadziri
Nouha Dziri
3 years
@edmontonpolice We're having a nightmare in downtown. You MUST stop them from honking and disturbing our lives. We have work to do and we can't afford another chaotic Saturday.
1
0
24
@nouhadziri
Nouha Dziri
3 months
📢⚠️ We're offering a few free NeurIPS tickets for students who have papers accepted and lack financial resources to attend. Please fill up this form asap:
@ShikharMurty
Shikhar
3 months
⚠️Attention NeurIPS attendees (and especially students with papers accepted at the System-2 Reasoning workshop!). We have a few free NeurIPS tickets available! If you're an enrolled student and would like to be considered, please fill out this form:
0
4
24
@nouhadziri
Nouha Dziri
2 years
@ThomasMiconi Our findings reveal that accomplishing this is not a simple task, and we offer insights into why reaching full mastery is inherently challenging.
2
0
24
@nouhadziri
Nouha Dziri
1 year
Come learn about Self-Refine. happening now. ID: 405.Hall B1 & B2
Tweet media one
@nouhadziri
Nouha Dziri
2 years
📢📢 Happy to announce our newest work: Self-Refine. LLMs can now improve without any additional training data, RL or human intervention! .They repeatedly refine their outputs through self-feedback, improving performance on 7 tasks. 📜 Paper:
0
5
23
@nouhadziri
Nouha Dziri
2 years
@ThomasMiconi In contrast, our work delves into investigating the fundamental limits of achieving full mastery of the task. We seek to examine whether we could achieve 100% performance in both in-domain and OOD settings by pushing transformers to their limits.
2
0
23
@nouhadziri
Nouha Dziri
6 years
#NAACL19 friends: if you're interested in knowing more about evaluating dialog, I'll be talking about "Evaluating Coherence in Dialogue Systems using Entailment" on Wednesday in Hyatt Exhibit Hall at 1:30 PM. This is a joint work with @ehsk0 , @korymath, and @ozaiane.
0
3
22
@nouhadziri
Nouha Dziri
11 months
Why LMs perform better when we "motivate" them or ask them "nicely"? 🤔Would this and certain prompting tricks lead to bypassing safety guards? YES ✅.Discover some hypotheses in this piece:.
@allen_ai
Ai2
11 months
Prompts heavily influence model outputs — and can bypass safety measures to trigger harmful results. @nouhadziri spoke with @Kyle_L_Wiggers about chatbot prompts for @TechCrunch:.
0
2
21
@nouhadziri
Nouha Dziri
3 years
Super excited to be attending in person #NAACL2022. Please reach out if you're interested in conversational AI and trustworthiness. I will be also presenting a poster on this topic on Monday from 2:30pm to 4pm (PDT) at Regency A & B.
@nouhadziri
Nouha Dziri
3 years
We recently show that existing knowledge-grounded dialogue benchmarks (e.g., Wizard of Wikipedia; WoW) suffer from hallucination at an alarming level (>60% of responses) . We perform a comprehensive linguistic analysis in our new #NAACL22 work. 📄
Tweet media one
1
0
22
@nouhadziri
Nouha Dziri
4 months
Multimodal Olmo is out, fully open!! Congrats to the team🎉.- Beating giant models: e.g., Claude 3.5 Sonnet.- Scaling era is somehow stagnating .- high-quality data + alignment algorithms + test-time decoding are the secret sauce for a powerful model.
@allen_ai
Ai2
4 months
Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data—it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it
0
0
21
@nouhadziri
Nouha Dziri
2 years
Happening now at Frontenac Ballroom (Board 2) #ACL2023NLP
Tweet media one
@nouhadziri
Nouha Dziri
3 years
📢 Excited to share our new work 💥. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 📄 🌐 👩‍💻 joint work w. @sivareddyg, @PontiEdoardo, @ehsk0, @ozaiane, Mo Yu, Sivan Milton.#NLProc.
0
5
21
@nouhadziri
Nouha Dziri
9 months
Come listen to @linluqiu talking about the puzzling behavior of LLMs in inductive reasoning and iterative hypothesis refinement. Hall A8-9 #ICLR2024 🔥
Tweet media one
@nouhadziri
Nouha Dziri
9 months
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement (Oral). 🗓️ (Talk) May 10 a.m. CEST — 10:45 a.m. CEST [Oral 3A].🗓️ (Poster) Wed 8 May 10:45 a.m. CEST — 12:45 p.m. CEST.📍Halle B.
0
0
21
@nouhadziri
Nouha Dziri
2 months
🚀New work🔥CREATIVITY Index🔥 .This work is SO close to my heart, I loved every part of the experiments. It provided me with much scientific fulfillment. Intellectual works have become so rare in the hysterical race of AI, so if you care about science give this work a read🧵
Tweet media one
@GXiming
Ximing Lu
2 months
Are LLMs 🤖 as creative as humans 👩‍🎓? Not quite!. Introducing CREATIVITY INDEX: a metric that quantifies the linguistic creativity of a text by reconstructing it from existing text snippets on the web. Spoiler: professional human writers like Hemingway are still far more creative
Tweet media one
1
2
21
@nouhadziri
Nouha Dziri
2 years
#emnlp2022 here I come!!! A verryyy long flight ✈️✈️, but excited to see many people and chat!!.
0
1
20
@nouhadziri
Nouha Dziri
2 years
📢🚀🚀🥳 Excited to share our new work led by the fantastic @ndaheim_. Elastic Weight Removal for Faithful and Abstractive Dialogue Generation. 📄 👩‍💻 . joint work w. @PontiEdoardo, @IGurevych @mrinmayasachan.#NLProc.
@PontiEdoardo
Edoardo Ponti
2 years
Large language models often generate hallucinated responses. We introduce Elastic Weight Removal (EWR), a novel method for faithful *and* abstractive dialogue. 📃💻+other methods!.🧑‍🔬@ndaheim_ @nouhadziri @IGurevych @mrinmayasachan.
1
2
18
@nouhadziri
Nouha Dziri
3 years
@LucianaBenotti The remote experience at #acl2022nlp was terrible. As someone attending from north america, I can barely attend anything live. Tutorials, workshops, some keynote talks were not uploaded in underline and they are still not. Organizers don't reply.
4
0
19
@nouhadziri
Nouha Dziri
2 months
#NeurIPS2024 remains my favorite venue, excited to be there next week🥳Reach out to chat, I will . *Speak at the panel of the Inference-Time Algorithms tutorial ⏰Dec 10 . *Give a talk at the Language Gamification workshop: in-context learning /scaling inference ⏰Dec 14. More👇.
1
1
19
@nouhadziri
Nouha Dziri
1 year
Faith and fate paper: Limits of Transformers on Compositionality (Spotlight). 🗓️ Tue 12 Dec 11:45 am - 1:45 pm EST .📍 Great Hall & Hall B1+B2 .🆔 #421.
@nouhadziri
Nouha Dziri
2 years
🚀📢 GPT models have blown our minds with their astonishing capabilities. But, do they truly acquire the ability to perform reasoning tasks that humans find easy to execute? NO⛔️. We investigate the limits of Transformers *empirically* and *theoretically* on compositional tasks🔥
Tweet media one
1
1
17
@nouhadziri
Nouha Dziri
3 years
Is #NAACL2022 not using Whova app this year??? if yes, what's the alternative?.
0
1
16
@nouhadziri
Nouha Dziri
4 months
Very cool work! This is a reminder again that scoring high on a math benchmark does not mean necessarily that LMs can truly "reason" or "think". It validates "faith and fate" observations for skeptics🙂.You may think of pattern-matching as one type of reasoning which can indeed👇.
@MFarajtabar
Mehrdad Farajtabar
4 months
1/ Can Large Language Models (LLMs) truly reason? Or are they just sophisticated pattern matchers? In our latest preprint, we explore this key question through a large-scale study of both open-source like Llama, Phi, Gemma, and Mistral and leading closed models, including the
Tweet media one
1
1
17
@nouhadziri
Nouha Dziri
2 years
The sad part is I didn't get the chance to commemorate the moment by taking photos, I even forgot to take screenshots :( Anyways, it's been a rollercoaster of emotions, the journey was not easy, but I'm happy about the things I've achieved!!.
1
1
17
@nouhadziri
Nouha Dziri
4 years
Please be nice and have good-faith in authors! Stop destructive and mean reviews!!.
0
0
14
@nouhadziri
Nouha Dziri
7 years
Pionners of #DeepLearning and #ReinforcementLearning Yoshua Bengio, Richard Sutton and Goeffrey Hinton together!! What a GREAT panel talk! #DLRL @VectorInst @AmiiThinks @MILAMontreal
Tweet media one
1
2
17
@nouhadziri
Nouha Dziri
6 years
#NeurIPS2018 has launched! Starting off by attending #GoogleAI talk about Machine Learning fairness.
Tweet media one
0
3
15
@nouhadziri
Nouha Dziri
2 years
A huge shoutout to my awesome supervisor @ozaiane, to my wonderful mentor and examiner @alonamarie, to the gentle Angel Chang, and to my external committee Jackie Cheung and Colin Cherry!.
1
0
16
@nouhadziri
Nouha Dziri
1 year
Congrats to all the authors 🥳🔥another outstanding paper from @ai2_mosaic @allen_ai.
@hyunw_kim
Hyunwoo Kim
1 year
🏆 SODA won the outstanding paper award at #EMNLP2023 ! What an incredible journey that was! I'm grateful for this journey with you @jmhessel @liweijianglw @PeterWestTM @GXiming @YoungjaeYu3 @peizNLP @Ronan_LeBras @malihealikhani @gunheekim @MaartenSap @YejinChoinka & @allen_ai💙
Tweet media one
Tweet media two
0
0
16
@nouhadziri
Nouha Dziri
5 years
Happy Halloween from MSR 🎃🎃😃🎉
Tweet media one
0
0
15