![Haoyi Qiu Profile](https://pbs.twimg.com/profile_images/1848506420808810496/FuqPWPfD_x96.jpg)
Haoyi Qiu
@HaoyiQiu
Followers
857
Following
841
Statuses
146
Research intern @SFResearch ☁️ PhD student @UCLANLP 🧸 BS in CS&Math @UMich 〽️ #NLP 🌷
Los Angeles, CA
Joined October 2018
Thrilled to share that CASA has been accepted to @naaclmeeting #NAACL2025 (Findings)! 🎉 Can’t wait to see you all in Albuquerque! 🌟 As I wrap up 2024, my first year as a PhD student (started April 2024), I’m overwhelmed with gratitude. This year has been a journey of growth, discovery, and resilience. From publishing 5 papers across NAACL, NeurIPS, ACL, ACM MM, and TKDE to exploring fascinating topics like multimodal hallucination, factuality, safety, and culturally aware agents—every step has been shaped by brilliant collaborators, mentors, and endless support. Here’s to staying humble, working hard, and embracing the opportunities 2025 holds. Let’s keep moving forward! 💪
🌐 Are LLM agents prepared to navigate the rich diversity of cultural and social norms? 🏠 CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations. 🧠 We’re bridging this gap by combining fine-tuning on regional data with strategic prompts to create agents that better understand our world’s diversity. Read the full paper for all insights! 📑 Grateful for the incredible team at Salesforce AI Research @salesforce !
0
4
38
RT @SFResearch: 🔬Advanced agent systems, RAG evaluation, instruction-following and more. Our team's accepted papers at #NAACL2025 span from…
0
8
0
RT @steeve__huang: Excited to share that CRMArena has been accepted by #NAACL2025 @naaclmeeting. See you in Albuquerque🚡! @SFResearch
0
8
0
RT @webagentlab: 5️⃣ Evaluating Cultural and Social Awareness of LLM Web Agents Haoyi Qiu @HaoyiQiu, Alexander R. Fabbri @alexfabbri4, Di…
0
2
0
RT @WeijiangLi2: 1/6 📢 Excited to share our new paper on using language models to classify genetic variants! ClinVar-BERT helps prioritize…
0
2
0
RT @VioletNPeng: I’m grateful for the enormous support from the community! It’s an honor to serve, and I’m excited to work hard alongside a…
0
8
0
RT @sarahookr: We have released Global-MMLU-lite 🔥 This is designed to run more efficiently while giving a good estimate of overall perfor…
0
24
0
RT @AlexanderSpangh: ✨✨✨Hello everyone, I’m on the faculty job market this year.✨✨✨ I’m completing my PhD at USC, where I study agentic pla…
0
21
0
RT @srush_nlp: This year, I have an exceptional student on the academic market. Wenting Zhao (@wzhao_nlp) builds systems that reason in na…
0
92
0
RT @zy27962986: 🚀🚀🚀Want to develop a cutting-edge video generation model towards Sora? Please dive into Apple’s latest recipe and studies f…
0
46
0
RT @Wade_Yin9712: A behavior is safe in country A, but may be unsafe in country B. Check out our #NeurIPS2024 SafeWorld! It evaluates ho…
0
6
0
RT @ZCJW2021: 🔥Thrilled to share our #NeurIPS2024 paper, “JourneyBench⚖️: A Challenging One-Stop Vision-Language Understanding Benchmark of…
0
10
0
RT @steeve__huang: Do LLMs know the 🧑‍🤝‍🧑 cultural and ⚖️ legal safety across our globe? Our #NeurIPS2024 paper 🌍 SafeWorld dives into th…
0
7
0
Just landed in Vancouver 🇨🇦 and I'm beyond excited for my first #NeurIPS and my very first ML conference! 🌱 I'll be presenting our 𝕊𝕒𝕗𝕖𝕎𝕠𝕣𝕝𝕕 paper on Wednesday, Dec 11th at East Exhibit Hall A-C from 11:00 AM to 2:00 PM (#3308). No need to skip lunch: we've got delicious snacks ready for you to enjoy while we chat! I'm also excited to dive into conversations on safety, alignment, and cultural analytics, especially for LLMs, LVLMs, and agents. Stop by, grab a snack, and let's connect!
🌍Are LLMs aware of cultural and legal safety in today’s geo-diverse world? 🚀Introducing SafeWorld, our #NeurIPS2024 paper and benchmark assessing LLMs’ understanding of geo-diverse safety, based on cultural norms and policies across 50 countries and 493 regions/races. ⚖️We also propose a multi-dimensional framework for evaluating contextual appropriateness, accuracy, and comprehensiveness, revealing major gaps in current LLMs. 🧨To address this, we train SafeWorldLM using DPO, achieving SOTA performance and a 20% higher global human evaluator rating in helpfulness and harmfulness over competing models, including GPT-4o. 🔗Paper: 💻 GitHub: 🫶🏻This is a jointly led effort with @Wade_Yin9712. Many thanks also to the amazing team @steeve__huang, @kaiwei_chang, and @VioletNPeng for their hard work. Check out more details and results from our paper in the thread below. 🧵
0
6
54
RT @ChujieZheng: Thrilled to introduce ProcessBench, our benchmark for measuring the ability to identify process errors in mathematical rea…
0
51
0
(6/n) This research was made possible through amazing collaborations between @uclanlp and @SFResearch. 🫶🏻This is a jointly led effort with @Wade_Yin9712. Many thanks also to the amazing team @steeve__huang, @kaiwei_chang, and @VioletNPeng for their hard work. 🌟 (n=6)
0
0
3