Nouha Dziri
@nouhadziri
Followers
3K
Following
4K
Media
101
Statuses
838
Research Scientist @allen_ai, PhD in NLP 🤖 UofA. Ex @GoogleDeepMind @MSFTResearch @MilaQuebec 🚨🚨 NEW BLOG about LLMs reasoning: https://t.co/Ox0iOaqY7e
Seattle, US
Joined February 2011
it’s happening now!!! please stop by
Faith and fate paper: Limits of Transformers on Compositionality (Spotlight). 🗓️ Tue 12 Dec 11:45 am - 1:45 pm EST. 📍 Great Hall & Hall B1+B2. 🆔 #421.
4
14
303
Life news💫💫 I’m joining @allen_ai to work with the incredible @YejinChoinka as a postdoc. I have been a huge fan of Yejin for years; her work and talks inspire me hugely, I always admire her adventurous research vision, and I never thought I would be able.
24
7
213
🚀💫Excited to announce the **AI2 Safety Toolkit**: an open and transparent research initiative dedicated to advancing LLM safety. @allen_ai
16
41
149
🚀🚀 Super happy that my work at @GoogleAI "Evaluating attribution in dialogue systems: the BEGIN benchmark" got accepted at TACL 🥳 This is a work with wonderful collaborators Hannah Rashkin, @davidswelt and @tallinzen. Stay tuned for more details but in short: 👇.
8
17
143
📢Super excited that our workshop "System 2 Reasoning At Scale" was accepted to #NeurIPS24, Vancouver! 🎉 🎯 How can we equip LMs with reasoning, moving beyond just scaling parameters and data? Organized w. @stanfordnlp @MIT @Princeton @allen_ai @uwnlp. 🗓️ When? Dec 15 2024
2
25
146
Interested in knowing more about LLM agents and in contributing to this topic?🚀 📢We're thrilled to announce REALM: The first Workshop for Research on Agent Language Models 🤖 #ACL2025NLP in Vienna 🎻. We have an exciting lineup of speakers. 🗓️ Submit your work by *March 1st*
2
32
143
And that was a wrap #NeurIPS2024 was intense, fast-paced, rich, packed🔥Super happy with the success of Sys2 Reasoning: a true concentration of top AI figures who pioneered the field @Yoshua_Bengio @DBahdanau @jaseweston @fchollet @MelMitchell1 @dawnsongtweets Joshua Tenenbaum👇
7
15
127
Reasoning, search & planning are trending these days, so I'm thrilled to announce our #ICLR2025 Workshop on LLMs Reasoning and Planning to answer your most burning questions with our exciting lineup of speakers🔥 ⏰Submit your work by Feb 2nd. 📄Details:
1
19
108
📢 Can generative models understand what they generate? The answer is no ⛔️. Check out our new work: 🔥the Generative AI Paradox🔥. We run extensive experiments investigating *generation* vs. *understanding* in generative models, across both language and image modalities.
Richard Feynman said “What I cannot create, I do not understand”💡. Generative Models CAN create, i.e. generate, but do they understand? Our 📣new work📣 finds that the answer might unintuitively be NO🚫 We call this the 💥Generative AI Paradox💥. Paper:
1
10
98
Happy that WildGuard got accepted at #NeurIPS2024 D&B🚀🎉🎉🎉🎉 Make your LLM safer by using our safety toolkit: ⚔️WildGuard: 🦁WildTeaming: 🔧Evaluation suite:
Now, let's attack 🔥🔥WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs 🔥🔥
0
7
94
Made it safely to Punta Cana for #EMNLP2021 and what an amazing view from my room! Excited to meet physically again and chat :) Will be also available virtually. Please reach out to me if you'd like to chat about exciting topics in text generation.
5
1
86
Super honoured to be listed as an "Outstanding Reviewer" for #ACL2021 🥳 I hope everyone receives a fair judgement for the priceless efforts they put into each project.
1
3
81
#NeurIPS2023 is fast approaching😍 and I'm excited to tell you more about "Faith and Fate" (spotlight) on Dec 12. Dive into an overview by reading our blog post: 📢Check out new grokking results: 👩💻Code:
🚀📢 GPT models have blown our minds with their astonishing capabilities. But, do they truly acquire the ability to perform reasoning tasks that humans find easy to execute? NO⛔️. We investigate the limits of Transformers *empirically* and *theoretically* on compositional tasks🔥
1
17
78
Are *RLHF* and other alignment techniques enough to represent diverse human values? NO🚫 Check out our work where we propose a roadmap to pluralistic alignment & discuss fundamental limitations of existing methods. w @uwnlp @allen_ai @stanfordnlp @MIT.
🤔How can we align AI systems/LLMs 🤖 to better represent diverse human values and perspectives?💡🌍. We outline a roadmap to pluralistic alignment with concrete definitions for how AI systems and benchmarks can be pluralistic!. First, models can be…
1
8
61
Super happy to share that our work "Decomposed Mutual Information Estimation for Contrastive Representation Learning" got accepted at #ICML2021 🥳 A huge thanks to my incredible mentors at @MSFTResearch MTL @murefil, @temporaer, Geoff Gordon!!!
4
4
72
I’m very honored to have met Jeff Dean @JeffDean during #WiML2018. I had a fun discussion with you and received much insightful advice for my academic career. Thank you! #NeurIPS2018
1
5
75
📢📢 We are happy to announce that our workshop "Document-Grounded Dialogue and Conversational Question Answering" will be held at @aclmeeting 2023 in Toronto! 🥳 Link:
1
17
73
Finally, BERT is named after me😂😂.
#Algeria's Darja 🇩🇿 now has its own language model: "DziriBERT: a Pre-trained Language Model for the Algerian Dialect". Congratulations @amine_abdaoui @MouhamedBerrimi on your great work!
3
2
68
🎊Excited for #neurips2024 and our "System 2 Reasoning at Scale" workshop. We have an exciting lineup of speakers who will answer your most burning questions about AI and reasoning 🚀 🔥Got spicy questions? Submit & vote here:
3
10
70
It was an honor to give a guest lecture today in the "Generative AI" class taught by @adjiboussodieng at Princeton University. This lecture was a timely opportunity to discuss concerns about dangerous AI. More details below👇
1
4
68
📢 Excited to share our new preprint💥: "Evaluating Groundedness in Dialogue Systems: The BEGIN Benchmark" 🤖 w/ Hannah Rashkin, Tal Linzen (@tallinzen) and David Reitter. [1/6]
2
9
64
Super thrilled about our paper being accepted at #EMNLP2021 main conference🥳🥳🥳 where we reduce hallucination in knowledge-grounded dialogue systems. A big shoutout to my collaborators for the great job @bose_joey, @AndreaMadotto and @ozaiane!!! Stay tuned for the camera-ready!.
2
4
65
I won't be at #COLM2024 this year, but check out our poster about ✨LLMs' perception of world cultures✨. TL;DR: LLMs fall short in capturing the full spectrum of global cultural nuances, leading to biases and an uneven degree of diversity in their generations.
3
5
66
✈️ I won't be at EMNLP this week but will be at #NeurIPS2023 next week. Super excited🥳to be presenting 3 papers including 2 spotlights. DM me if you want to chat about science of LMs, interpretability, and data. Check out details👇.
4
0
63
📢Can LMs follow their own proposed rules? No! LMs excel at rule generation but display puzzling inductive reasoning behavior, in sharp contrast to humans. 💫This work was led by my intern @Linlu, who did an excellent job during the summer at Mosaic💫 📜
How good are LMs at inductive reasoning? How are their behaviors similar to or contrasted with those of humans? We study these via iterative hypothesis refinement. We observe that LMs are phenomenal hypothesis proposers, but they also behave as puzzling inductive reasoners. (1/n)
2
11
57
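As a rough illustration of the iterative hypothesis refinement loop described in the thread above (not the paper's exact procedure), here is a minimal Python sketch; `propose_rule` and `apply_rule` are hypothetical stand-ins for prompting an LM to state a rule and for executing that rule on an input.

```python
# Hedged sketch of iterative hypothesis refinement: the model proposes a rule,
# the rule is checked against observed examples, and failures are fed back.
# `propose_rule` and `apply_rule` are hypothetical helpers, not the paper's code.
def refine_hypothesis(examples, propose_rule, apply_rule, rounds=3):
    feedback = ""
    rule = propose_rule(examples, feedback)
    for _ in range(rounds):
        failures = [(x, y) for x, y in examples if apply_rule(rule, x) != y]
        if not failures:
            return rule  # rule is consistent with every observed example
        feedback = f"The rule fails on: {failures[:3]}"
        rule = propose_rule(examples, feedback)
    return rule
```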
📢 Excited to share our new work 💥. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 📄 🌐 👩💻 joint work w. @sivareddyg, @PontiEdoardo, @ehsk0, @ozaiane, Mo Yu, Sivan Milton.#NLProc.
2
18
60
Super excited to present our paper "Evaluating Coherence in Dialogue Systems using Entailment" at #NAACL2019 with my great collaborators @ehsk0, @korymath and @ozaiane. What a great feeling to head into the weekend with :) @ualbertaScience @AmiiThinks @NAACLHLT.
5
8
57
📢📢 Happy to announce our newest work: Self-Refine. LLMs can now improve without any additional training data, RL, or human intervention! They repeatedly refine their outputs through self-feedback, improving performance on 7 tasks. 📜 Paper:
Can LLMs enhance their own output without human guidance? In some cases, yes! With Self-Refine, LLMs generate feedback on their work, use it to improve the output, and repeat this process. Self-Refine improves GPT-3.5/4 outputs for a wide range of tasks.
2
3
56
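For readers who want the gist of the loop, here is a minimal sketch of self-refinement; `generate` is a hypothetical stand-in for any LLM completion call, the prompts are illustrative only, and the stopping heuristic is an assumption rather than the paper's exact recipe.

```python
# Minimal sketch of a self-refinement loop. `generate(prompt)` is a hypothetical
# helper wrapping an LLM completion API; prompts and stop condition are illustrative.
def self_refine(task_prompt, generate, max_iters=3):
    output = generate(task_prompt)
    for _ in range(max_iters):
        feedback = generate(
            f"Task: {task_prompt}\nOutput: {output}\n"
            "Give concrete feedback on how to improve this output."
        )
        if "no further improvements" in feedback.lower():
            break  # stop once the model judges its own output good enough
        output = generate(
            f"Task: {task_prompt}\nOutput: {output}\n"
            f"Feedback: {feedback}\nRewrite the output, applying the feedback."
        )
    return output
```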
Come to the #NeurIPS2024 inference-time scaling tutorial and join us at the panel for insightful and 🌶️🌶️ discussions. ⏰ When? Tue. Dec 10 @ 1:30 PM ET. This is so exciting!
We're incredibly honored to have an amazing group of panelists: @agarwl_ , @polynoamial , @BeidiChen, @nouhadziri, @j_foerst , with @uralik1 moderating. We'll close with a panel discussion about scaling, inference-time strategies, the future of LLMs, and more!
0
8
52
Super excited today for the System 2 Reasoning at Scale workshop, come join us to discover how to equip AI systems with reasoning that's optimized for renewable energy and not fossil fuel 🔥🚀. ⏰When? today, 9am-5:30pm .📍West Ballroom B. #NeurIPS2024
2
6
51
Joint work with @GXiming @melaniesclar @xiang_lorraine @liweijianglw @billyuchenlin @PeterWestTM Chandra Bhagavatula @Ronan_LeBras @ssanyal8 @wellecks @xiangrenNLP @AllysonEttinger Zaid Harchaoui @YejinChoinka.
2
1
44
Excited about this talk tomorrow, come join!!.
At tomorrow's NLP Seminar, we are delighted to host Nouha Dziri (@nouhadziri), who will be talking about her work toward building faithful conversational models. Join us over zoom tomorrow at 11 am PT. Registration: Abstract:
0
4
46
🚀🚀Super excited that WildTeaming got accepted at #neurips2024 (main track)🎉🎉🎉🎉🎉. See you in Vancouver to discuss LLMs safety, red-teaming, reasoning and more. Also stay tuned for a safety blogpost coming out early next week🚀🚀 #LLMs.
We introduce 🦁WildTeaming🦁, an automatic red-team framework to compose human-like adversarial attacks using diverse jailbreak tactics devised by creative and self-motivated users in-the-wild. (2/N)
2
5
43
@lexfridman This is not a conflict, stop white-washing and hiding Israel's war crimes. How can you describe it as a conflict when heavily armed police forces are murdering, bombing unarmed Palestinians?!!.
5
2
39
Come to the inference-time tutorial today🔥. ⏰When? Today at 1:30pm PST.📍Where? West Exhibition Hall C. #NeurIPS2024
Come to the #NeurIPS2024 inference-time scaling tutorial and join us at the panel for insightful and 🌶️🌶️ discussions. ⏰ When? Tue. Dec 10 @ 1:30 PM ET. This is so exciting!
3
6
37
@ThomasMiconi Hi Thomas, the primary focus of Zhou et al. is to explore possibilities for improving the model's performance without necessarily aiming for complete mastery of the task. While they have indeed made enhancements compared to baseline approaches, they did not solve the task on OOD.
1
0
36
🔔New work on measuring the risks of human over-reliance on LLM expressions of uncertainty. The more capable LLMs are, the better they are at deceiving people. We introduce an evaluation framework that measures when humans rely on LLM generations. 👇🧵 w. @stanfordnlp @allen_ai CMU
How can we best measure the consequences of LLM overconfidence?. ✨New preprint✨ on measuring the risks of human over-reliance on LLM expressions of uncertainty: w/@JenaHwang2 @xiangrenNLP @nouhadziri @jurafsky @MaartenSap @stanfordnlp @allen_ai #NLPproc
1
3
36
We’ve kicked off the workshop! Join us in 📍West Ballroom B, and don’t forget to share your burning questions for the panel here: 🌶️🔥Questions: #NeurIPS2024
Super excited today for the System 2 Reasoning at Scale workshop, come join us to discover how to equip AI systems with reasoning that's optimized for renewable energy and not fossil fuel 🔥🚀. ⏰When? today, 9am-5:30pm .📍West Ballroom B. #NeurIPS2024
0
7
35
⏳⏳⏳Only 3 days left to submit your reasoning work to our "System 2 Reasoning At Scale" workshop. #NeurIPS24. ⏲️🚨🚨🚨Deadline: Sept 23, 2024, AOE.
📢Super excited that our workshop "System 2 Reasoning At Scale" was accepted to #NeurIPS24, Vancouver! 🎉 🎯 How can we equip LMs with reasoning, moving beyond just scaling parameters and data? Organized w. @stanfordnlp @MIT @Princeton @allen_ai @uwnlp. 🗓️ When? Dec 15 2024
1
4
35
Excited to be at #ACL2023NLP Toronto in person🇨🇦 I'm presenting 2 papers🥳: 1 TACL poster on Monday, 1 oral on Tuesday. Happy to meet and chat!!. Here are some details:.
1
0
33
Come to the Language Gamification workshop to learn about multi-agent learning, LLM search, and inference-time algorithms; the talks are really high quality and the speakers are top-notch🔥 ⏰When? Today. 📍Where? West Meeting Room 220-222. #NeurIPS2024
Come to my talk about in-context learning, inference-time algorithms, and the limits of Transformer LLMs on reasoning tasks🔥 ⏰When? Today, Dec 14. 📍Where? West Meeting Room 220-222.
1
7
33
Heading to Florence for @ACL2019_Italy 🇮🇹. Excited to meet great researchers and to explore the sheer amount of inspirational NLP papers in different areas. Ping me if you want to chat about dialogue :).
0
0
28
📢Delayed announcement: happy to share that FaithDial has been accepted at TACL 🚀🥳 Please consider using this dataset if you want to avoid hallucination in your response generation. Paper: .Dataset: Code:
📢 Excited to share our new work 💥. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 📄 🌐 👩💻 joint work w. @sivareddyg, @PontiEdoardo, @ehsk0, @ozaiane, Mo Yu, Sivan Milton.#NLProc.
5
8
30
Congratulations @ai2_mosaic🥳🚀🔥🔥 1 best paper award and 3 outstanding papers at #ACL2023NLP .Congrats to all the authors!!! @jmhessel @YejinChoinka @MaartenSap @melaniesclar @xiang @PeterWestTM @alsuhr @jenahwan et al.!!.
Congratulations to the team, including AI2ers, who worked on "Do Androids Laugh at Electric Sheep? Humor 'Understanding' Benchmarks from The New Yorker Caption Contest" — selected for a Best Paper Award at #ACL2023!
1
0
30
📢 Excited to share our new work *Neural Path Hunter* (NPH) #EMNLP2021 . Paper: NPH focuses on enforcing faithfulness in KG-grounded dialogue systems by refining hallucinations via queries over a k-hop subgraph [1/6]. w @bose_joey @AndreaMadotto @ozaiane
3
5
29
Reward models are the essence of success in RLHF, yet there has been little focus on evaluating them 😬. We introduce RewardBench💥, the first benchmark for reward models. We evaluated 30+ existing RMs (w/ DPO) and created new datasets. Discover lots of insightful analyses👇
Excited to share something that we've needed since the early open RLHF days: RewardBench, the first benchmark for reward models. 1. We evaluated 30+ of the currently available RMs (w/ DPO too). 2. We created new datasets covering chat, safety, code, math, etc. We learned a lot.
0
0
28
📢🔥New work: We introduce a dataset (*ValuePrism*) and model (*Kaleido*) to engage AI with human values, rights, and duties in joint CS + Philosophy work with many wonderful people. Paper: .Demo:
Human values are crucial to human decision-making. We may consider health or environmental responsibility when deciding to bike to work, or weigh loyalty against honesty when deciding if we should lie to protect a friend. (0/n)
1
7
30
Super excited to be in Vienna #ICLR2024 next week 🎻🎵🎹😍 Will be presenting 1 oral and 2 posters. Please DM me to talk about these works and also *SAFETY* in LMs. Details👇
1
1
29
🚀🚀 Super happy that our work got accepted at the #ACL2023NLP main conference. TL;DR: we highlight the pitfalls (rigidity of lexical matching) and brittleness (inability to spot hallucination) of existing evaluation mechanisms in information-seeking QA.🔥🔥 See you all in Toronto🥳
🔥🔥 Our work "Evaluating Open-Domain Question Answering in the Era of Large Language Models" w/ @nouhadziri @claclarke @DavoodRafiei got accepted to the #ACL2023NLP main conference. Stay tuned 🎸 for more details!. #ACL2023 #ACL2023Toronto #NLProc.
0
1
29
Come to hear about WildGuard🔥. 📍East Exhibit Hall A-C #4208
Presenting the safety toolkit at the main conf with the amazing team: * WildGuard: ⏰12 Dec 4:30-7:30 PM. * WildTeaming: ⏰13 Dec 4:30-7:30 PM. Use our tools: ⚔️Moderation Tools: 🦁Red-teaming: 🔎Evaluation: 👇
1
2
28
End of a wonderful experience at @MSFTResearch Montreal with a super talented and amazing team @shikhar_warlock @murefil @temporaer and Geoff Gordon!! Looking forward to the next steps and to the surprises the new year is bringing! Happy holidays everyone🎉
0
3
27
@natolambert @hamishivi This aligns with our recent work "Fine-Grained RLHF" which will be presented at #NeurIPS2023 led by the great @zeqiuwu1 .A framework that enables training and learning from different reward functions (factual incorrectness, irrelevance, toxicity, etc).
1
3
27
📢 Excited to share our paper *DEMI* #ICML2021. Paper: DEMI aims to decompose a hard estimation problem into smaller subproblems that can potentially be solved with less bias. w/ @murefil, @temporaer, @RemiTachet, @philip_bachman, Geoff Gordon [1/n]
3
4
26
Come to my talk about in-context learning, inference-time algorithms, and the limits of Transformer LLMs on reasoning tasks🔥 ⏰When? Today, Dec 14. 📍Where? West Meeting Room 220-222.
Talk 1: In-context Learning in LLMs: Potential and Limits. 🎤 @nouhadziri .🕕 08:30-09:10. 🧵2/8.#NeurIPS2024
2
2
26
✈️Heading to San Francisco to give a talk about evaluating open-ended dialogue systems🤖 at the #RasaDevSummit. Check out the exciting agenda and the awesome speakers here: @Rasa_HQ
0
4
26
@SashaMTL @emnlpmeeting People from the Middle East receive all sorts of discrimination and racism. They're even banned from entering the US, and if they receive a study visa, they won't be able to leave the country or see their families. Why does no one speak about the discrimination against them?
2
1
24
I'm deeply saddened to know that Canadian universities are banning international travel to conferences. Knowing that #EMNLP2021 will be in-person this year was what kept us (me and many friends) motivated to do research more enthusiastically.
1
3
25
Had so much fun today presenting a talk about dialogue evaluation at @DeepMindAI Montreal :) Thanks @korymath for being such a great host !!!.
0
1
25
@edmontonpolice We're having a nightmare in downtown. You MUST stop them from honking and disturbing our lives. We have work to do and we can't afford another chaotic Saturday.
1
0
24
📢⚠️ We're offering a few free NeurIPS tickets for students who have papers accepted and lack the financial resources to attend. Please fill out this form asap:
⚠️Attention NeurIPS attendees (and especially students with papers accepted at the System-2 Reasoning workshop!). We have a few free NeurIPS tickets available! If you're an enrolled student and would like to be considered, please fill out this form:
0
4
24
@ThomasMiconi Our findings reveal that accomplishing this is not a simple task, and we offer insights into why reaching full mastery is inherently challenging.
2
0
24
Come learn about Self-Refine, happening now. ID: 405. Hall B1 & B2
📢📢 Happy to announce our newest work: Self-Refine. LLMs can now improve without any additional training data, RL, or human intervention! They repeatedly refine their outputs through self-feedback, improving performance on 7 tasks. 📜 Paper:
0
5
23
@ThomasMiconi In contrast, our work delves into investigating the fundamental limits of achieving full mastery of the task. We seek to examine whether we could achieve 100% performance in both in-domain and OOD settings by pushing transformers to their limits.
2
0
23
Why do LMs perform better when we "motivate" them or ask them "nicely"? 🤔 Would this and certain prompting tricks lead to bypassing safety guards? YES ✅ Discover some hypotheses in this piece:
Prompts heavily influence model outputs — and can bypass safety measures to trigger harmful results. @nouhadziri spoke with @Kyle_L_Wiggers about chatbot prompts for @TechCrunch:.
0
2
21
Super excited to be attending in person #NAACL2022. Please reach out if you're interested in conversational AI and trustworthiness. I will be also presenting a poster on this topic on Monday from 2:30pm to 4pm (PDT) at Regency A & B.
We recently show that existing knowledge-grounded dialogue benchmarks (e.g., Wizard of Wikipedia; WoW) suffer from hallucination at an alarming level (>60% of responses) . We perform a comprehensive linguistic analysis in our new #NAACL22 work. 📄
1
0
22
Multimodal Olmo is out, fully open!! Congrats to the team🎉.- Beating giant models: e.g., Claude 3.5 Sonnet.- Scaling era is somehow stagnating .- high-quality data + alignment algorithms + test-time decoding are the secret sauce for a powerful model.
Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data—it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it
0
0
21
Happening now at Frontenac Ballroom (Board 2) #ACL2023NLP
📢 Excited to share our new work 💥. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 📄 🌐 👩💻 joint work w. @sivareddyg, @PontiEdoardo, @ehsk0, @ozaiane, Mo Yu, Sivan Milton.#NLProc.
0
5
21
Come listen to @linluqiu talking about the puzzling behavior of LLMs in inductive reasoning and iterative hypothesis refinement. Hall A8-9 #ICLR2024 🔥
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement (Oral). 🗓️ (Talk) May 10 a.m. CEST — 10:45 a.m. CEST [Oral 3A].🗓️ (Poster) Wed 8 May 10:45 a.m. CEST — 12:45 p.m. CEST.📍Halle B.
0
0
21
🚀New work🔥CREATIVITY Index🔥 .This work is SO close to my heart, I loved every part of the experiments. It provided me with much scientific fulfillment. Intellectual works have become so rare in the hysterical race of AI, so if you care about science give this work a read🧵
Are LLMs 🤖 as creative as humans 👩🎓? Not quite!. Introducing CREATIVITY INDEX: a metric that quantifies the linguistic creativity of a text by reconstructing it from existing text snippets on the web. Spoiler: professional human writers like Hemingway are still far more creative
1
2
21
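To make the reconstruction idea concrete, here is a rough Python illustration of a coverage-style score (an assumption about the general approach, not the paper's exact CREATIVITY Index): the less of a text that can be stitched together from verbatim n-grams found in a reference corpus, the higher the score.

```python
# Rough illustration only: score how much of a text can be reconstructed from
# verbatim n-gram matches in a reference corpus, then invert that coverage.
# `corpus_ngrams` is assumed to be a set of space-joined n-grams from web text.
def coverage_based_creativity(text, corpus_ngrams, n=5):
    tokens = text.split()
    covered = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if " ".join(tokens[i:i + n]) in corpus_ngrams:
            for j in range(i, i + n):
                covered[j] = True  # these tokens are reproducible from the corpus
    reconstructable = sum(covered) / max(len(tokens), 1)
    return 1.0 - reconstructable  # higher = harder to stitch from existing text
```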
#emnlp2022 here I come!!! A verryyy long flight ✈️✈️, but excited to see many people and chat!!.
0
1
20
📢🚀🚀🥳 Excited to share our new work led by the fantastic @ndaheim_: Elastic Weight Removal for Faithful and Abstractive Dialogue Generation. 📄 👩💻 Joint work w. @PontiEdoardo, @IGurevych, @mrinmayasachan. #NLProc
Large language models often generate hallucinated responses. We introduce Elastic Weight Removal (EWR), a novel method for faithful *and* abstractive dialogue. 📃💻+other methods!.🧑🔬@ndaheim_ @nouhadziri @IGurevych @mrinmayasachan.
1
2
18
@LucianaBenotti The remote experience at #acl2022nlp was terrible. As someone attending from North America, I could barely attend anything live. Tutorials, workshops, and some keynote talks were not uploaded to Underline, and they still aren't. Organizers don't reply.
4
0
19
#NeurIPS2024 remains my favorite venue, excited to be there next week🥳 Reach out to chat! I will: *Speak at the panel of the Inference-Time Algorithms tutorial ⏰Dec 10. *Give a talk at the Language Gamification workshop: in-context learning / scaling inference ⏰Dec 14. More👇
1
1
19
Faith and fate paper: Limits of Transformers on Compositionality (Spotlight). 🗓️ Tue 12 Dec 11:45 am - 1:45 pm EST. 📍 Great Hall & Hall B1+B2. 🆔 #421.
🚀📢 GPT models have blown our minds with their astonishing capabilities. But, do they truly acquire the ability to perform reasoning tasks that humans find easy to execute? NO⛔️. We investigate the limits of Transformers *empirically* and *theoretically* on compositional tasks🔥
1
1
17
Very cool work! This is a reminder again that scoring high on a math benchmark does not necessarily mean that LMs can truly "reason" or "think". It validates the "faith and fate" observations for skeptics🙂 You may think of pattern-matching as one type of reasoning, which can indeed👇
1/ Can Large Language Models (LLMs) truly reason? Or are they just sophisticated pattern matchers? In our latest preprint, we explore this key question through a large-scale study of both open-source like Llama, Phi, Gemma, and Mistral and leading closed models, including the
1
1
17
Pioneers of #DeepLearning and #ReinforcementLearning Yoshua Bengio, Richard Sutton and Geoffrey Hinton together!! What a GREAT panel talk! #DLRL @VectorInst @AmiiThinks @MILAMontreal
1
2
17
#NeurIPS2018 has launched! Starting off by attending #GoogleAI talk about Machine Learning fairness.
0
3
15
A huge shoutout to my awesome supervisor @ozaiane, to my wonderful mentor and examiner @alonamarie, to the gentle Angel Chang, and to my external committee Jackie Cheung and Colin Cherry!.
1
0
16
Congrats to all the authors 🥳🔥another outstanding paper from @ai2_mosaic @allen_ai.
🏆 SODA won the outstanding paper award at #EMNLP2023 ! What an incredible journey that was! I'm grateful for this journey with you @jmhessel @liweijianglw @PeterWestTM @GXiming @YoungjaeYu3 @peizNLP @Ronan_LeBras @malihealikhani @gunheekim @MaartenSap @YejinChoinka & @allen_ai💙
0
0
16