Chenchen Ye

@chenchenye_ccye

Followers
806
Following
840
Media
12
Statuses
28

CS PhD student @UCLA | Research Intern @Microsoft | Prev Undergrad @NUSingapore | LLM

Los Angeles, CA
Joined August 2022
Pinned Tweet
@chenchenye_ccye
Chenchen Ye
2 months
📢New LLM Agents Benchmark! Introducing 🌟MIRAI🌟: A groundbreaking benchmark crafted for evaluating LLM agents in temporal forecasting of international events with tool use and complex reasoning! 📜 Arxiv: 🔗 Project page: 🧵1/N
14
71
304
@chenchenye_ccye
Chenchen Ye
11 days
🚀 Excited to introduce our #ACL2024 Main Paper TCELongBench: Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding! 📰 As online news grows, the challenge of swiftly understanding complex events, spread across
1
12
82
@chenchenye_ccye
Chenchen Ye
2 months
🧵2/N We released our code, data, and an interactive demo: 💻 GitHub Repo: 📁 Dataset: 📊 Interactive Demo Notebook:
1
1
13
@chenchenye_ccye
Chenchen Ye
2 months
🧵11/N Sincere thanks to all amazing collaborators and advisors @acbuller, @Yihe__Deng, @HuangZi71008374, @mingyu_ma, @Zhu_Yanqiao, and @WeiWang1973 for their invaluable advice and efforts! 🙏❤️
0
0
11
@chenchenye_ccye
Chenchen Ye
2 months
🧵 8/N Forecasting with Temporal Distance Our ablation study lets agents predict 1, 7, 30, and 90 days ahead. 📊Results: As the forecasting horizon increases, F1📉 and KL📈. Agents' accuracy drops for more distant events. Longer horizons require anticipating trend shifts influenced by more factors and complexities.
1
0
10
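Editor's sketch of the two metrics named in 8/N. The thread does not say how MIRAI computes F1 or KL, so everything here is an assumption: forecasts are treated as sets of relation labels for F1 and as label distributions for KL, the direction KL(actual || predicted) is a guess, and the labels "A", "B", "C" are placeholders rather than real relation codes.

```python
import math

def set_f1(predicted, actual):
    """F1 between a predicted and a ground-truth set of relation labels."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over label distributions given as dicts of probabilities.

    eps is a hypothetical smoothing constant so labels missing from q
    do not cause a division by zero.
    """
    total = 0.0
    for label in set(p) | set(q):
        p_k = p.get(label, 0.0)
        q_k = q.get(label, 0.0) + eps
        if p_k > 0:
            total += p_k * math.log(p_k / q_k)
    return total

# Placeholder labels, not taken from the benchmark.
f1 = set_f1(["A", "B"], ["B", "C"])
kl_same = kl_divergence({"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5})
kl_diff = kl_divergence({"A": 0.5, "B": 0.5}, {"A": 0.9, "B": 0.1})
```

Under these definitions a higher F1 and a lower KL both mean a better forecast, which matches the direction of the arrows in the tweet.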
@chenchenye_ccye
Chenchen Ye
2 months
🧵 4/N Forecasting Task 🔮 Forecasting involves collecting essential historical data and performing temporal reasoning to predict future events. 📅 Example: Forecasting cross-country relations on 2023-11-18 using event and news information up to 2023-11-17.
1
0
10
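Editor's sketch of the cutoff described in 4/N: only information dated before the forecast date may be used. The record layout (a dict with a "date" field) and the function name are hypothetical and are not taken from the MIRAI code.

```python
from datetime import date

def history_before(records, forecast_date):
    """Keep only records dated strictly before the forecast date."""
    return [r for r in records if r["date"] < forecast_date]

# Hypothetical records; the "summary" values are placeholders.
records = [
    {"date": date(2023, 11, 16), "summary": "event A"},
    {"date": date(2023, 11, 17), "summary": "event B"},
    {"date": date(2023, 11, 18), "summary": "event C"},  # must not leak into the history
]

usable = history_before(records, date(2023, 11, 18))
```

With a forecast date of 2023-11-18, the last usable record is dated 2023-11-17, as in the tweet's example.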
@chenchenye_ccye
Chenchen Ye
2 months
🧵 7/N Forecasting with Different Base LLMs 1️⃣ 📈 Code Block benefits stronger LLMs but hurts weaker models. 2️⃣ 🏆GPT-4o consistently outperforms other models. 3️⃣ 💪 Self-consistency makes a small model stronger.
1
0
10
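Editor's sketch of self-consistency as mentioned in 7/N: sample several answers from the same model and keep the most frequent one. How the paper aggregates samples is not stated in the thread, so the simple majority vote and the placeholder answers below are assumptions.

```python
from collections import Counter

def self_consistent_answer(samples):
    """Return the most frequent answer among independently sampled answers."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    return Counter(samples).most_common(1)[0][0]

# Placeholder answers standing in for several sampled forecasts.
voted = self_consistent_answer(["A", "B", "A", "C", "A"])
```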
@chenchenye_ccye
Chenchen Ye
2 months
🧵 6/N Agent Framework 💡 Think: Agent analyzes and plans the next action using API specs. ⚡ Act: Generates Single Function or Code Block to retrieve data. 🚀 Execute: Python interpreter runs the code for observations. These steps are repeated until reaching a final forecast.
1
0
10
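Editor's sketch of the Think / Act / Execute loop from 6/N. It is not the MIRAI implementation: the scripted policy stands in for an LLM, and `count_events`, the `result` variable convention, and the action dict layout are all hypothetical.

```python
def run_agent(policy, tools, max_steps=5):
    """Repeat think -> act -> execute until the policy returns a final forecast."""
    observations = []
    for _ in range(max_steps):
        # Think + Act: the policy returns a thought and the next action.
        thought, action = policy(observations)
        if action["type"] == "final":
            return action["answer"], observations
        # Execute: run the generated code with the tools in scope.
        # Assumed convention: the code stores its output in `result`.
        namespace = dict(tools)
        exec(action["code"], namespace)
        observations.append(namespace.get("result"))
    return None, observations  # step budget exhausted without a forecast

def scripted_policy(observations):
    """Stand-in for an LLM: one lookup, then a final answer."""
    if not observations:
        return "Look up recent events first.", {
            "type": "code",
            "code": "result = count_events('AAA')",  # hypothetical tool call
        }
    return "Enough evidence.", {
        "type": "final",
        "answer": f"seen {observations[-1]} events",
    }

# Hypothetical tool that always reports three events.
tools = {"count_events": lambda country: 3}
answer, trace = run_agent(scripted_policy, tools)
```

Running untrusted generated code with `exec` is only acceptable in a sketch; a real system would sandbox the interpreter.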
@chenchenye_ccye
Chenchen Ye
2 months
🧵10/N Check our paper out for more details! 🌟 Code error analysis, different event types, variation of API types, and different agent planning strategies! Join us in advancing the capabilities of LLM agents in forecasting and understanding complex international events! 🚀
1
0
10
@chenchenye_ccye
Chenchen Ye
2 months
🧵 9/N Tool-Use Ordering in Forecasting 🗂️Tool-Use Transition Graph: Agents start with recent events for key info and end with news for context. 🧠 Freq.(correct) - Freq.(incorrect): Highlights the need for strategic planning in LLM agents for effective forecasting.
1
0
9
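Editor's sketch of the quantity in 9/N, Freq.(correct) - Freq.(incorrect), computed over transitions between consecutive tool calls. The trajectory format (a list of tool names per run) and the tool names "events" and "news" are assumptions; the paper's exact normalization is not given in the thread.

```python
from collections import Counter

def transition_freqs(trajectories):
    """Relative frequency of each (tool, next_tool) pair across trajectories."""
    counts = Counter()
    for calls in trajectories:
        for current, following in zip(calls, calls[1:]):
            counts[(current, following)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}

def freq_difference(correct_runs, incorrect_runs):
    """Freq.(correct) - Freq.(incorrect) for every observed transition."""
    f_correct = transition_freqs(correct_runs)
    f_incorrect = transition_freqs(incorrect_runs)
    pairs = set(f_correct) | set(f_incorrect)
    return {p: f_correct.get(p, 0.0) - f_incorrect.get(p, 0.0) for p in pairs}

# Hypothetical trajectories: tool names are placeholders.
diff = freq_difference(
    correct_runs=[["events", "news"]],
    incorrect_runs=[["news", "events"]],
)
```

A positive value marks a transition that is more common in runs ending in a correct forecast, and a negative value marks one more common in incorrect runs.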
@chenchenye_ccye
Chenchen Ye
2 months
🧵 5/N APIs & Environment 💻 Our comprehensive APIs empower agents to generate code and access the database. 🔧 APIs include data classes and functions for various info types and search conditions. 🔄 Agents can call a single function or generate a code block at each step.
1
0
9
@chenchenye_ccye
Chenchen Ye
2 months
🧵3/N Data 🌐With 59,161 unique events and 296,630 unique news articles, we curate a test set of 705 forecasting query-answer pairs. (a)📊 Circular Chart: The relation hierarchy and distribution in MIRAI. (b-c) 🔥 Heatmap: Intensity of global events, from conflict to mediation.
1
0
9
@chenchenye_ccye
Chenchen Ye
11 days
Zhihan is at #ACL2024 in Bangkok to present our paper on the new LLM benchmark TCELongBench for evaluating Temporal, Long Context Understanding! Catch her at 📍 Poster Session 1 ⏰ 8/12 at 11 AM (local time). For more details, check out our paper:
@zhihan72
Zhihan Zhang@ACL 2024
11 days
Excited to attend #ACL2024 in Bangkok! I will present our newest LLM benchmark **TCELongBench**: 💥 Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding 💥 Come to Poster Session 1 on 8/12 at 11:00AM!
0
1
6
0
0
8
@chenchenye_ccye
Chenchen Ye
2 months
@nicolayr_ Thanks for sharing your thoughts, Nicolay! Your idea about forecasting from literature sounds really interesting!
1
0
2
@chenchenye_ccye
Chenchen Ye
2 months
@aviaviavi__ Thank you so much, Avi! Looking forward to hearing your thoughts!
0
0
1