Haoyi Qiu @HaoyiQiu profile

Haoyi Qiu

@HaoyiQiu

Followers: 857

Following: 841

Statuses: 146

Research intern @SFResearch ☁️ PhD student @UCLANLP 🧸 BS in CS&Math @UMich 〽️ #NLP 🌷

Los Angeles, CA

Joined October 2018


Haoyi Qiu

@HaoyiQiu

21 days

Thrilled to share that CASA has been accepted to @naaclmeeting #NAACL2025 (Findings)! 🎉 Can’t wait to see you all in Albuquerque! 🌟 As I wrap up 2024, my first year as a PhD student (started April 2024), I’m overwhelmed with gratitude. This year has been a journey of growth, discovery, and resilience. From publishing 5 papers across NAACL, NeurIPS, ACL, ACM MM, and TKDE to exploring fascinating topics like multimodal hallucination, factuality, safety, and culturally aware agents—every step has been shaped by brilliant collaborators, mentors, and endless support. Here’s to staying humble, working hard, and embracing the opportunities 2025 holds. Let’s keep moving forward! 💪

Haoyi Qiu

@HaoyiQiu

3 months

🌐 Are LLM agents prepared to navigate the rich diversity of cultural and social norms? 🏠 CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations. 🧠 We’re bridging this gap by combining fine-tuning on regional data with strategic prompts to create agents that better understand our world’s diversity. Read the full paper for all insights! 📑 Grateful for the incredible team at Salesforce AI Research @salesforce !

0

4

38

Haoyi Qiu

@HaoyiQiu

4 days

@PranavVenkit Congratulations 🎉

1

0

1

Haoyi Qiu

@HaoyiQiu

16 days

RT @SFResearch: 🔬Advanced agent systems, RAG evaluation, instruction-following and more. Our team's accepted papers at #NAACL2025 span from…

0

8

0

Haoyi Qiu

@HaoyiQiu

21 days

RT @steeve__huang: Excited to share that CRMArena has been accepted by #NAACL2025 @naaclmeeting. See you in Albuquerque🚡! @SFResearch

0

8

0

Haoyi Qiu

@HaoyiQiu

28 days

RT @webagentlab: 5️⃣ Evaluating Cultural and Social Awareness of LLM Web Agents Haoyi Qiu @HaoyiQiu, Alexander R. Fabbri @alexfabbri4 , Di…

0

2

0

Haoyi Qiu

@HaoyiQiu

1 month

RT @WeijiangLi2: 1/6 📢 Excited to share our new paper on using language models to classify genetic variants! ClinVar-BERT helps prioritize…

0

2

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @VioletNPeng: I’m grateful for the enormous support from the community! It’s an honor to serve, and I’m excited to work hard alongside a…

0

8

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @sarahookr: We have released Global-MMLU-lite 🔥 This is designed to run more efficiently while giving a good estimate of overall perfor…

0

24

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @AlexanderSpangh: ✨✨✨Hello everyone, I’m on the faculty job market this year.✨✨✨ I’m completing my PhD at USC, where I study agentic pla…

0

21

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @srush_nlp: This year, I have an exceptional student on the academic market. Wenting Zhao (@wzhao_nlp) builds systems that reason in na…

0

92

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @tyao923: 🧐How can agents effectively learn skill prompting, planning, and maximizing rewards from large amounts of unlabeled data? 😉Com…

0

7

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @zy27962986: 🚀🚀🚀Want to develop a cutting-edge video generation model towards Sora? Please dive into Apple’s latest recipe and studies f…

0

46

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @Wade_Yin9712: A behavior is safe in country A, but may be unsafe in country B. Check out our #NeurIPS2024 SafeWorld! It evaluates ho…

0

6

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @ZCJW2021: 🔥Thrilled to share our #NeurIPS2024 paper, “JourneyBench⚖️: A Challenging One-Stop Vision-Language Understanding Benchmark of…

0

10

0

Haoyi Qiu

@HaoyiQiu

2 months

RT @steeve__huang: Do LLMs know the 🧑‍🤝‍🧑 cultural and ⚖️ legal safety across our globe? Our #NeurIPS2024 paper 🌍 SafeWorld dives into th…

0

7

0

Haoyi Qiu

@HaoyiQiu

2 months

Just landed in Vancouver 🇨🇦 and I'm beyond excited for my first #NeurIPS and my very first ML conference! 🌱 I'll be presenting our 𝕊𝕒𝕗𝕖𝕎𝕠𝕣𝕝𝕕 paper on Wednesday, Dec 11th at East Exhibit Hall A-C from 11:00 AM to 2:00 PM (#3308). No need to skip lunch—we've got delicious snacks ready for you to enjoy while we chat! I'm also excited to dive into conversations on safety, alignment, cultural analytics, especially for LLMs, LVLMs, and agents. Stop by, grab a snack, and let's connect!

Haoyi Qiu

@HaoyiQiu

2 months

🌍Are LLMs aware of cultural and legal safety in today’s geo-diverse world? 🚀Introducing SafeWorld, our #NeurIPS2024 paper and benchmark assessing LLMs’ understanding of geo-diverse safety, based on cultural norms and policies across 50 countries and 493 regions/races. ⚖️We also propose a multi-dimensional framework for evaluating contextual appropriateness, accuracy, and comprehensiveness, revealing major gaps in current LLMs. 🧨To address this, we train SafeWorldLM using DPO, achieving SOTA performance and a 20% higher global human evaluator rating in helpfulness and harmfulness over competing models, including GPT-4o. 🔗Paper: 💻 GitHub: 🫶🏻This is a jointly led effort with @Wade_Yin9712. Many thanks also to the amazing team @steeve__huang, @kaiwei_chang, and @VioletNPeng for their hard work. See the thread below for more details and results from the paper. 🧵

0

6

54

Haoyi Qiu

@HaoyiQiu

2 months

RT @ChujieZheng: Thrilled to introduce ProcessBench, our benchmark for measuring the ability to identify process errors in mathematical rea…

0

51

0

Haoyi Qiu

@HaoyiQiu

2 months

(6/n) This research was made possible through amazing collaborations between @uclanlp and @SFResearch. 🫶🏻This is a jointly led effort with @Wade_Yin9712. Many thanks also to the amazing team @steeve__huang, @kaiwei_chang, and @VioletNPeng for their hard work. 🌟 (n=6)

0

3

Haoyi Qiu

@HaoyiQiu

2 months

(5/n) We show that our SafeWorldLM model significantly outperforms competitors, including GPT-4o, across all evaluation dimensions!

0

4

Haoyi Qiu

@HaoyiQiu

2 months

(4/n) We've also developed SafeWorldLM, a model trained for outstanding geo-diverse safety alignment, outperforming even top proprietary models like GPT-4o by wide margins on all safety dimensions.

0

4