Tao Yu

@taoyds

Followers
3,439
Following
815
Media
34
Statuses
329
@taoyds
Tao Yu
2 years
A new way to work w. LMs! Binder, an easy neuro-symbolic paradigm: 1.Parse input➡️SQL/Python bound w. GPT3 Codex API calls 2.Codex+PL interpreter execute➡️answer No train&few-shot!➡️SOTA 🆚chain-of-thought: interpretable&robust⬆️ 🆚NL2Code: coverage⬆️
Tweet media one
10
73
292
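The Binder recipe in the tweet above (parse the input into a symbolic SQL/Python program with LM API calls bound into it, then let the interpreter execute it) can be sketched with a toy example. This is only an illustration of the idea, not the actual Binder implementation: `fake_lm` is a hypothetical stand-in for the real Codex API call, and plain Python stands in for the SQL dialect.

```python
# Toy sketch of the Binder idea: a symbolic program whose ordinary operations
# run deterministically, while one predicate is delegated to an LM API call.
# `fake_lm` is a hypothetical stand-in for the real Codex endpoint.

def fake_lm(question: str, value: str) -> bool:
    # Stand-in for an LM call such as f("is this a country?"; value).
    known_countries = {"France", "Japan", "Brazil"}
    return value in known_countries

table = [
    {"name": "France", "population": 68},
    {"name": "Europe", "population": 745},
    {"name": "Japan", "population": 125},
]

def run_binder_program(rows):
    # Symbolic part: SELECT name WHERE population < 200
    # Neural part:   AND fake_lm("is this a country?", name)
    return [
        r["name"]
        for r in rows
        if r["population"] < 200 and fake_lm("is this a country?", r["name"])
    ]

print(run_binder_program(table))  # ['France', 'Japan']
```

Because the final answer comes from deterministic execution of the program, each intermediate step can be inspected, which is where the interpretability claim comes from.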
@taoyds
Tao Yu
2 years
💥New benchmark💥 DS-1000, a data science code generation benchmark with 1K questions about 7🐍libraries. Spent ~1200 expert hours! It is the only one that 1⃣ focuses on everyday applications 2⃣ includes natural intents & contexts 3⃣has test cases 1/🧵
Tweet media one
5
69
289
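The "has test cases" point above means each DS-1000 problem is graded by executing the model's completion against problem-specific checks. A minimal sketch of that execution-based evaluation loop, with illustrative names rather than the benchmark's actual harness:

```python
# Minimal sketch of execution-based evaluation as used by benchmarks like
# DS-1000: a completion is accepted only if the problem's test code runs
# without error against it. Problem/test names here are illustrative.

def evaluate(completion: str, test_code: str) -> bool:
    env = {}
    try:
        exec(completion, env)   # run the model's generated code
        exec(test_code, env)    # run the problem's hidden test cases
        return True
    except Exception:
        return False

# A toy "problem": reverse a list.
good_completion = "def rev(xs):\n    return xs[::-1]"
bad_completion = "def rev(xs):\n    return xs"
test_code = "assert rev([1, 2, 3]) == [3, 2, 1]"

print(evaluate(good_completion, test_code))  # True
print(evaluate(bad_completion, test_code))   # False
```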
@taoyds
Tao Yu
11 months
🚀🚀🚀Lots of people working on LM agents recently! Open models like Llama/CodeLlama not quite up to ChatGPT's level? Our 🎉Lemur🎉- SOTA open foundation models for language agents, matching ChatGPT on🤖15 agent tasks🤖!
Tweet media one
@yihengxu_
Yiheng Xu
11 months
1/ 🧵 🎉 Introducing Lemur-70B & Lemur-70B-Chat: 🚀Open & SOTA Foundation Models for Language Agents! The closest open model to GPT-3.5 on 🤖15 agent tasks🤖! 📄Paper: 🤗Model @huggingface : More details 👇
Tweet media one
6
74
289
1
59
226
@taoyds
Tao Yu
3 years
📣UnifiedSKG: Lots of #NLProc researchers separately study tasks that link text to structured knowledge (Table/DB/KB..). We unify 21 such tasks into a Seq2Seq format with T5 to foster idea sharing&multitasking, performing very competitively! Paper&Code: 👇
Tweet media one
3
40
207
@taoyds
Tao Yu
1 year
In Memory of My beloved Ph.D. Advisor @dragomir_radev 🕯️R.I.P. 🕯️
Tweet media one
Tweet media two
@hmkyale
Harlan Krumholz
1 year
The #AI community, the #computerscience community, the @YaleSEAS community, and humanity have suddenly lost a remarkable person, @dragomir_radev - kind and brilliant, devoted to his family and friends... gone too soon. A sad day @Yale @YINSedge @YaleCompsci #NLP2023
Tweet media one
Tweet media two
41
87
389
6
12
198
@taoyds
Tao Yu
11 months
Beyond our Lemur: OPEN LMs for language agents Introducing 💥OpenAgents💥: an OPEN platform for language agents in the wild! Analyze data, call plugins, control your browser as ChatGPT Plus, but with OPEN SOURCE code!! 📑: Code:
Tweet media one
@ChengZhoujun
Zhoujun (Jorge) Cheng
11 months
💥OpenAgents💥: an OPEN platform for language agents in the wild Analyze data, call plugins, control your browser as ChatGPT Plus, but with OPEN Code for 1⃣Easy deployment 2⃣Full stack 3⃣Chat Web UI 4⃣Agent methods 5⃣… Code: 👇
5
70
223
4
62
183
@taoyds
Tao Yu
1 year
After 5 months of dedicated work by >15 researchers & developers, we're thrilled to introduce 🚀OPEN-SOURCE language model Agents🚀! Try demos: 🥑 Stay tuned for open-source code, model, framework, evaluation & more at !
@XLangNLP
XLang NLP Lab
1 year
1/6🚀Announcing XLang language model (LM) Agents: 📊Data Agent: LM + code & data tools 🔧Plugins Agent: LM + 200+ API plugins 🌐Web Agent: LM + web control Try demo: Stay tuned for open-source code & models See more examples!👇
1
24
69
6
50
177
@taoyds
Tao Yu
5 months
🚀Multimodal agents are on the rise in 2024! But even building an app/domain-specific agent env is hard😰. Our real-computer OSWorld env allows you to define agent tasks for arbitrary apps on diff. OS w/o crafting new envs. 🧐Benchmarked #VLMs on 369 OSWorld tasks: #GPT4V >> #Claude3
Tweet media one
@TianbaoX
Tianbao Xie
5 months
🤔Can we assess agents across various apps & OS w.o. crafting new envs? OSWorld🖥️: A unified, real computer env for multimodal agents to evaluate open-ended computer tasks with arbitrary apps and interfaces on Ubuntu, Windows, & macOS. + annotated 369 real-world computer tasks
5
53
181
6
37
155
@taoyds
Tao Yu
7 months
🚀Instructor🚀embeddings recently hit 2M downloads on @huggingface ! Now, excited to introduce 🚀GritLM🚀, the first SINGLE LM achieving SoTA in BOTH text embedding (MTEB) & generative tasks (BBH etc)! Great team effort w. @Muennighoff & @hongjin_su ! 📰: 👇
Tweet media one
@Muennighoff
Niklas Muennighoff
7 months
Introducing GRIT🦾to unify text embedding 🔢& generation 📝. GritLM is open SoTA on embedding (MTEB) & generative tasks (BBH etc.) – Both in 1 model. See 🧵for how GRIT🦾 makes RAG >60% faster & more 📜 💻 1/12
Tweet media one
10
139
569
2
34
134
@taoyds
Tao Yu
2 years
📢📢 Play with our Binder demo: ! Binder: an easy but SOTA neural-symbolic framework built on GPT-3 Codex & a SQL/Python interpreter. Inject GPT-3 Codex prompt API calls into programming languages!
@taoyds
Tao Yu
2 years
A new way to work w. LMs! Binder, an easy neuro-symbolic paradigm: 1.Parse input➡️SQL/Python bound w. GPT3 Codex API calls 2.Codex+PL interpreter execute➡️answer No train&few-shot!➡️SOTA 🆚chain-of-thought: interpretable&robust⬆️ 🆚NL2Code: coverage⬆️
Tweet media one
10
73
292
2
22
129
@taoyds
Tao Yu
3 years
Life update: Thrilled to join @HKUniversity 🇭🇰as an asst. prof. and build the HKU #NLProc lab() with @ikekong . We have multiple openings for PhD/RA👨‍🔬! Come and visit us if you’re ever in HK🏙! Also, I’ll spend a year at @uwnlp working with @nlpnoah & Mari!
Tweet media one
Tweet media two
20
13
128
@taoyds
Tao Yu
6 months
Using LLMs for coding in new or evolving languages? We introduce: 1⃣a new code generation benchmark that MUST consult code docs/tutorials 2⃣a new multi-hop code generation method actively retrieving diverse resources: 28%📈 in ChatGPT & 23.8%📈 in CodeLlama! 👇
Tweet media one
@hongjin_su
Hongjin Su
6 months
How to adapt LLMs for code 🖥️ to updated libraries and long-tail programming languages w/o training? 🤔 We introduce Arks ⛵️, Active Retrieval in Knowledge Soup, a general pipeline of retrieval-augmented generation for code (RACG). It features: 1️⃣A diverse knowledge soup
Tweet media one
2
32
111
0
20
126
@taoyds
Tao Yu
11 months
Exciting to see the rising interest in 🎉LLM + Code + Robotics + RL🎉! This year, multiple concurrent works on text-to-RL-reward code generation for robot control: Happy to see this interdisciplinary effort!
@TianbaoX
Tianbao Xie
11 months
@DrJimFan Congrats Jim and your team for this fantastic work!! 🌟 Our team has also delved into a similar direction, leveraging LLM to automate the generation of dense reward code functions. Hope it can also provide insights to the community! 🔗 Project: 📄 Paper:
Tweet media one
3
14
93
1
19
82
@taoyds
Tao Yu
1 year
We just open-sourced 🚀 #Lemur70B ! 🚀: the SOTA open LLM balancing 📚text & 💻code capabilities! 1⃣Pretrain Llama 2 on ~100B code-focused data 2⃣Finetune Lemur on ~300K examples Download the models 🤗: See more details👇
@XLangNLP
XLang NLP Lab
1 year
1/6 Open LLMs have traditionally been tailored for either 📚text or 💻code, with limited ability to effectively balance both. 🚀 Introducing #Lemur70B ! 🚀: the SOTA open LLM balancing 📚text & 💻code capabilities 🤗Model: 📖Blog:
Tweet media one
2
34
73
8
25
76
@taoyds
Tao Yu
5 years
Come check out our #emnlp2019 paper "CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases" today at 4:30-6pm poster session! The data and leaderboard are available at .
Tweet media one
4
23
71
@taoyds
Tao Yu
1 year
If you are interested in LLM + tool use or tool augmented LLMs ⚙️ 🤖️⚒️, come and join us. we will cover this topic in our complex reasoning #ACL2023NLP tutorial!
@wzhao_nlp
Wenting Zhao
1 year
Heading to #ACL2023 🚀 My collaborators @megamor2 @billyuchenlin @michiyasunaga @aman_madaan @taoyds and I will be presenting a cutting-edge tutorial on Complex Reasoning in Natural Language - diving into recent methods for accurate, robust & trustworthy reasoning systems🤖 1/2
2
11
49
2
7
68
@taoyds
Tao Yu
6 years
Check out our #EMNLP2018 paper with @radevd "Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task" introduces a new text-to-SQL dataset! The data and blog available at and !
0
25
64
@taoyds
Tao Yu
1 year
Presenting a keynote today at 2 pm on “Language Model Agents for Building Natural Language Interfaces to Data” at the Databases and LLM (LLMDB) workshop () @VLDBconf . Please consider joining us if you are attending #VLDB2023 !
Tweet media one
0
12
53
@taoyds
Tao Yu
5 months
DS-1000 () code generation data format has now been simplified and hosted on @huggingface datasets. 1⃣Simplified format: 2⃣DS-1000 @huggingface : Credits: @halfrot01 and @sidawxyz
@taoyds
Tao Yu
2 years
💥New benchmark💥 DS-1000, a data science code generation benchmark with 1K questions about 7🐍libraries. Spent ~1200 expert hours! It is the only one that 1⃣ focuses on everyday applications 2⃣ includes natural intents & contexts 3⃣has test cases 1/🧵
Tweet media one
5
69
289
0
9
49
@taoyds
Tao Yu
7 months
Exciting time to work on computer agents! Though research on them is still at an early stage, the potential is limitless. 🚀
@zywu_hku
Zhiyong Wu
7 months
I‘ve been dreaming of having my own "Jarvis" since years ago after the first Iron Man movie. Now I've finally brought my own version to life. Introducing OS-Copilot: A Framework for Generalist Computer Agents Paper: Website:
3
43
237
0
2
33
@taoyds
Tao Yu
1 year
Thanks for sharing! The paper actually got pretty good reviews. 😎 Anyway, yes, it has been downloaded 🔥~700K🔥 times in ~1/2 year and is used by 🚀>2k🚀 open-source GitHub projects! Great work from our XLang NLP Lab @XLangAI led by @hongjin_su and @WeijiaShi2 !
@jxmnop
jack morris
1 year
one note for NLP people about findings vs main conference: the Instructor paper () was accepted to ACL as Findings (i.e. not the main conference) but every startup practitioner I talk to that has a GPU and cares about performance uses Instructor embeddings
2
19
95
1
5
30
@taoyds
Tao Yu
4 months
Thanks for attending! Big credit to Niklas Muennighoff @Muennighoff and Hongjin Su @hongjin_su !
@chrmanning
Christopher Manning
4 months
The best contributed paper on GRIT, presented by Tao Yu, is a nice contribution to doing RAG, but not exactly AGI. The quality of the speech captioning makes AGI seem quite distant indeed…. #ICLR2024
Tweet media one
3
14
106
1
4
44
@taoyds
Tao Yu
5 years
ACCEPTED to @ACL2019_Italy : 2 papers about Yale text-to-SQL Spider task (leaderboard: ) and our paper introducing the new context-dependent text-to-SQL SParC challenge with @ryanzhumich @VictoriaLinML @CaimingXiong @RichardSocher @radevd ! Coming up soon!
0
7
28
@taoyds
Tao Yu
4 years
Semantic Parsing (SP) evaluation has been a long-standing problem. Our #emnlp2020 paper (w. @ZhongRuiqi & Dan Klein) introduces a new metric that evaluates the predicted parse over multiple test suites. It is now the official metric of Spider, SParC, and CoSQL (+8 more SP data)!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
@BerkeleyNLP
BerkeleyNLP
4 years
Our #emnlp2020 paper() approximates the semantic accuracy of semantic parsing models by comparing the predicted meanings for “multiple possible worlds” rather than the logical forms. It is now the official metric of SPIDER, SParC, and CoSQL.
1
3
18
0
8
28
@taoyds
Tao Yu
5 years
Finally got the acceptance notification email! #emnlp2019
9
0
27
@taoyds
Tao Yu
10 months
🚀🚀🚀update: OpenAgents () is now on !
@ChengZhoujun
Zhoujun (Jorge) Cheng
11 months
💥OpenAgents💥: an OPEN platform for language agents in the wild Analyze data, call plugins, control your browser as ChatGPT Plus, but with OPEN Code for 1⃣Easy deployment 2⃣Full stack 3⃣Chat Web UI 4⃣Agent methods 5⃣… Code: 👇
5
70
223
0
5
28
@taoyds
Tao Yu
5 years
Come check out our #acl2019nlp paper introducing the new Cross-domain Semantic Parsing in Context (SParC) text-to-SQL challenge today (10:30-12:10pm, 7/31) at Poster Session 6A! Joint work with @ryanzhumich , @VictoriaLinML , @CaimingXiong , @RichardSocher , @SFResearch , and @radevd !
3
10
28
@taoyds
Tao Yu
2 years
@goodside Cool work! You might find our related work interesting: it binds GPT-3 Codex API calls into SQL/Python to resolve some complex questions. Check out our demo:
@taoyds
Tao Yu
2 years
📢📢 Play with our Binder demo: ! Binder: an easy but SOTA neural-symbolic framework built on GPT-3 Codex & a SQL/Python interpreter. Inject GPT-3 Codex prompt API calls into programming languages!
2
22
129
0
0
26
@taoyds
Tao Yu
1 year
#NLProc students who plan to attend #ACL2023NLP : Apply to the student volunteer program! The deadline is approaching in less than a week. It covers your conference registration fee in exchange for a few hours of work. Also, a good opportunity to network with fellow NLPers!
@aclmeeting
ACL 2025
1 year
📢 Call for Student Volunteers 📢 #ACL2023NLP is looking for student volunteers to help us with conference activities (both online and in-person). Checkout the call for more details. #ACL2023Toronto #NLProc
1
27
48
1
2
25
@taoyds
Tao Yu
7 months
NLP summer research intern @Hong Kong🏙️
@HKU_GS
HKU Graduate School
7 months
📣 HKU Summer Research Programme 2024 is now open for application! Join us to enrich your summer and get a taste of doing your own research project from scratch! 🌞 Deadline: 26 January 2024 (5pm HKT) Enquiry: gradsch @hku .hk APPLY NOW!
0
4
5
0
3
23
@taoyds
Tao Yu
2 years
📣 By formulating dialog state tracking as Text-to-SQL semantic parsing, In-Context Learning with Codex achieves impressive performance on MWoZ!
@huyushi98
Yushi Hu
2 years
In-Context Learning can solve hard dialogue understanding tasks —- when you frame the dialog task correctly. We find that by reframing dialogue state tracking into Text-to-SQL, and with a smart retriever, LMs get SOTAs on MultiWOZ without any training!🚀
Tweet media one
4
9
72
1
2
23
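The reframing above maps a dialogue to a SQL-like query whose WHERE clause encodes slot-value pairs, so the dialogue state can be read back out of the generated query. A toy sketch of that mapping, using a deliberately simplified grammar (the actual MultiWOZ schema and the paper's prompt format differ):

```python
# Sketch of dialogue state tracking as text-to-SQL: the LM emits a SQL-like
# query whose WHERE clause encodes slot-value pairs; parsing it back yields
# the dialogue state. The grammar here is a simplified illustration.
import re

def sql_to_state(sql: str) -> dict:
    where = sql.split("WHERE", 1)[1]
    pairs = re.findall(r"(\w+)\s*=\s*'([^']*)'", where)
    return dict(pairs)

# Hypothetical LM output for "I need a 4-star hotel in the north."
pred = "SELECT * FROM hotel WHERE area = 'north' AND stars = '4'"
print(sql_to_state(pred))  # {'area': 'north', 'stars': '4'}
```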
@taoyds
Tao Yu
1 year
🧵Lemur-70B-chat stands out as the top-performing open-source LLM, rivaling ChatGPT across a broader spectrum of tasks when compared to other available open-source LLMs.
@XLangNLP
XLang NLP Lab
1 year
4/6 Lemur-chat significantly outperforms other open-source supervised fine-tuned models across various dimensions.
Tweet media one
1
2
7
0
5
22
@taoyds
Tao Yu
1 year
All the slides of in our complex reasoning #ACL2023NLP tutorial are available at A paper collection on LLM + tool use ⚙️ 🤖️⚒️ and code generation are available at . PRs welcome if we've overlooked your work!
1
4
20
@taoyds
Tao Yu
2 years
Happening in 30 minutes! @TianbaoX and @ChenHenryWu will be giving an oral talk about UnifiedSKG and some recent works on leveraging GPT-3 Codex for structured knowledge grounding! Please join us in the semantics session, Hall A-B, 11 am. I'm also at #EMNLP2022 . Happy to chat!
@taoyds
Tao Yu
3 years
📣UnifiedSKG: Lots of #NLProc researchers separately study tasks that link text to structured knowledge (Table/DB/KB..). We unify 21 such tasks into a Seq2Seq format with T5 to foster idea sharing&multitasking, performing very competitive! Paper&Code: 👇
Tweet media one
3
40
207
1
3
20
@taoyds
Tao Yu
3 years
UnifiedSKG () is one of the shared tasks at SUKI! We provide strong but simple unified sota code and models for 21 tasks that involve structured knowledge. Also, there is another interesting shared task FinQA on financial data! Participations welcome!👇
@suki_2022
SUKI 2022
3 years
Hello World! Structured and Unstructured Knowledge Integration (SUKI) workshop at #NAACL2022 is welcoming submissions and shared task participations🙌! Papers due by April 8. Two shared tasks due by June 8 with cash awards🥰. Details are available 👉
Tweet media one
1
16
32
0
3
16
@taoyds
Tao Yu
2 years
Instructor👨‍🏫:ONE embedder, ANY task! Led by @hongjin_su & @WeijiaShi2 By simply providing a task instruction (❌training), a SINGLE instruction-finetuned👨‍🏫model 🥇generates domain-specific & task-aware text embeddings 🥈SOTA on 70 embed eval tasks Try🤗:
@WeijiaShi2
Weijia Shi
2 years
🙋‍♀️How to present the same text in diff. tasks/domains as diff. embeddings W/O training? We introduce Instructor👨‍🏫, an instruction-finetuned embedder that can generate text embeddings tailored to any task given the task instruction➡️sota on 7⃣0⃣tasks👇!
Tweet media one
12
115
598
0
1
15
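The key idea above is that the embedding is a function of the pair (instruction, text), so the same text maps to different vectors under different task instructions. A toy sketch of that interface, with a trivial hash-based stand-in for the real instruction-finetuned encoder:

```python
# Toy sketch of the Instructor interface: embeddings are conditioned on a
# task instruction, so one text yields task-aware vectors without retraining.
# `embed` is a hash-based stand-in for the actual learned encoder.
import hashlib

def embed(instruction: str, text: str, dim: int = 8):
    digest = hashlib.sha256(f"{instruction}||{text}".encode()).digest()
    return [b / 255 for b in digest[:dim]]

text = "Transformers process tokens in parallel."
v_retrieval = embed("Represent the document for retrieval:", text)
v_cluster = embed("Represent the sentence for clustering:", text)

print(v_retrieval != v_cluster)  # True: same text, task-specific embeddings
```

The real model learns these instruction-conditioned representations; the stub only demonstrates the calling convention of instruction + text in, vector out.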
@taoyds
Tao Yu
10 months
Thanks for sharing our work! OpenAgents is now one of the most popular open-source projects on GitHub Trending! OpenAgents code:
@omarsar0
elvis
10 months
OpenAgents - an open platform for using and hosting language agents in the wild. Includes three agents: - a Data Agent for data analysis - a Plugins Agent with 200+ daily API tools - a Web Agent for autonomous web browsing paper: code:
Tweet media one
3
115
587
0
2
12
@taoyds
Tao Yu
1 year
🪘🪘🪘If you want to learn about teaching LMs (ChatGPT/Codex) how to use code interpreters ⌨️and other tools/models 🔧🔨🪚 to resolve concrete tasks, come join us and meet my students @TianbaoX and @ChengZhoujun at #ICLR2023 !
@TianbaoX
Tianbao Xie
1 year
🎺"Binder: Binding Language Models in Symbolic Languages" is here #ICLR2023 on 5/3 (Wed) today! Join our talk by @ChengZhoujun and me at 3:00 pm in AD10 and poster at 4:30 pm at #57 ! website: code: demo:
0
3
19
1
0
12
@taoyds
Tao Yu
2 years
You can even apply Binder on multi-modal inputs(text, tables, and images). We explore using Binder on MultiModalQA and it performs better than Codex end2end QA and the fine-tuned baselines, even comparable with SOTA using oracle retrieved contents. 6/8
Tweet media one
1
1
10
@taoyds
Tao Yu
2 years
In the future, Binder can easily be extended to: ▪ new domains/tasks (e.g., knowledge base, pure text) ▪ new programming languages (e.g., SPARQL, more domain-specific symbolic languages) ▪ new LM API call functionalities (e.g., summarization, VQA) 7/8
2
0
10
@taoyds
Tao Yu
5 months
Our @XLangNLP team has spent six months on this project, and we're delighted to announce its completion! We hope OSWorld will open new research opportunities on multimodal agents!! Paper: OSWorld env, data, agent baselines:
0
0
9
@taoyds
Tao Yu
2 years
Does Codex memorize solutions from the web? It does! On numpy-100 (repeated >3K times on GitHub), Codex-002 performance drops from 72.5➡️40.6 after simple edits, w/o changes in difficulty. So we edited problems in DS-1000 to proactively defend against memorization. 4/🧵
Tweet media one
1
2
9
@taoyds
Tao Yu
1 year
This is a nice blog on text-to-SQL evaluation. Actually, we improved on the original execution-based and exact-match metrics in this paper: (test-suite accuracy, led by @ZhongRuiqi , code: ).
@ekzhu
Eric Zhu
1 year
Is Text-to-SQL evaluation really aligned with human preference? In this post I explore an alternative evaluation metric that more accurately match model performance in practice. Check it out to see how different GPT models perform!
4
25
168
1
0
9
@taoyds
Tao Yu
3 years
I'm grateful to many mentors, collaborators, and friends for their support and advice! Special thanks to Dragomir Radev, Kathleen McKeown, @LukeZettlemoyer , @OwenRambow , and @CaimingXiong !
0
0
8
@taoyds
Tao Yu
3 years
We benchmark all tasks in UnifiedSKG using T5 with very little task-specific modification. To our surprise, it achieves SOTA on almost all tasks! Larger models are better, and we expect the trend to continue.
Tweet media one
1
0
8
@taoyds
Tao Yu
2 years
Big congrats! Yale gets another #NLProc faculty🥳
@armancohan
Arman Cohan
2 years
✨Some personal news✨ I am very excited to share that I am joining Yale University @YaleCompsci @YaleSEAS @Yale as an Assistant Professor of Computer Science in Jan 2023! I'm looking forward to new connections and extensive collaborations @Yale in #NLProc , #AI , and beyond! 1/4
70
12
440
0
0
8
@taoyds
Tao Yu
2 years
Binder can achieve SOTA performance on question answering (WikiTableQuestions) and fact verification (TabFact), with only a few in-context annotated exemplars (no training)! Prev. best systems all require fine-tuning over massive amounts of data. 3/8
Tweet media one
1
0
8
@taoyds
Tao Yu
1 year
Exciting news!
@allen_ai
Ai2
1 year
Today we're thrilled to announce our new undertaking to collaboratively build the best open language model in the world: AI2 OLMo. Uniquely open, 70B parameters, coming early 2024 – join us!
34
194
662
1
1
8
@taoyds
Tao Yu
2 years
#NLProc #AI4Code GPT-3 Codex can generate INTERACTIVE multi-vis interfaces📈 (not just static simple plots!) from natural language queries! Check out our demo below! Work led by @Yiru__Chen @sirrice . Stay tuned for fancier ones!
@Yiru__Chen
Yiru Chen
2 years
No programming, No learning curve! We can now generate INTERACTIVE multi-vis interfaces from NL queries! Yes! Directly from NL! Check this demo below. I will also give a talk on this next Sat in the NLVIS workshop @IEEEVIS . Paper:
1
3
45
0
0
7
@taoyds
Tao Yu
2 years
🆚End2end/chain-of-thought: Binder program’s deterministic execution entails prediction/answer➡️interpretable & robust⬆️ 🆚 #SemPar #AI4Code : Binder injects Codex functionalities in SQL/Python to handle more diverse questions➡️coverage⬆️ Demo: 2/8
1
0
8
@taoyds
Tao Yu
11 months
Time to read iclr submissions :)
@yihengxu_
Yiheng Xu
11 months
Tired of searching for keywords on openreview to explore the iclr2024 submissions. Spent some time writing code to dump the paper list from openreview and create some visualizations, collaborating with chatgpt and @nomic_ai . AI tools have indeed changed our way of working.
Tweet media one
4
27
162
0
0
8
@taoyds
Tao Yu
3 years
Because we have unified the architecture, we are now able to do multi-task learning! Multi-task prefix-tuning benefits most tasks and significantly improves the overall performance. We conjecture the reason to be knowledge sharing and cross-task generalization.
Tweet media one
1
0
7
@taoyds
Tao Yu
3 years
Finally, we conduct a comprehensive error analysis across SKG tasks. We find 1) although the errors made by PLMs decrease with model size, T5-3B may still generate invalid outputs; 2) automatic metrics are not sufficient for certain tasks. Find more details in the paper!
Tweet media one
1
0
6
@taoyds
Tao Yu
6 months
Great opportunity if you are interested in code generation!
@sidawxyz
Sida Wang
6 months
I'm hiring a PhD intern for the FAIR CodeGen (Code Llama) team. Do research on Code LLMs, execution feedback, evaluation, etc. Apply here:
3
31
198
0
1
7
@taoyds
Tao Yu
5 months
Looking forward to meeting you at HKU!
@WilliamWangNLP
William Wang
5 months
Upcoming seminar at the University of Hong Kong, Thursday 4/18. Looking forward to meeting new and old friend! 🇭🇰
Tweet media one
1
4
43
0
0
7
@taoyds
Tao Yu
3 years
Structured Knowledge Grounding (SKG) tasks were studied by different communities, leading to divergent architectures and implementations. Unification decreases barriers for newcomers and encourages methods that generalize across tasks. UnifiedSKG unifies 21 tasks into Seq2Seq.
Tweet media one
1
0
7
@taoyds
Tao Yu
2 years
DS-1000 construction includes 1⃣ selected and rewrote problems from StackOverflow 2⃣ perturbed the problems to defend against potential memorization 3⃣implemented a customized evaluation metric for EVERY SINGLE PROBLEM Very labor intensive. Took five authors ~1200 hours! 2/🧵
Tweet media one
1
0
6
@taoyds
Tao Yu
1 year
Wow, Exciting!!!
@ylecun
Yann LeCun
1 year
This is huge: Llama-v2 is open source, with a license that authorizes commercial use! This is going to change the landscape of the LLM market. Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face and other providers Pretrained and fine-tuned
422
4K
16K
0
0
5
@taoyds
Tao Yu
2 years
Binder is highly ROBUST to large or noisy inputs! End2end QA performance drops dramatically as table input size increases (-42.0%), while Binder consistently outperforms it with only slight decreases (-13.3%). A similar phenomenon is seen given noisy inputs as distractors. 5/8
Tweet media one
1
0
5
@taoyds
Tao Yu
3 years
UnifiedSKG allows us to systematically investigate structured knowledge encoding and obtain insights generalizable across tasks. Though T5 reaches SOTA on most tasks, it is still sensitive to the encoding method. We need more robust encoding methods in the future!
1
0
5
@taoyds
Tao Yu
10 months
Congrats, Jungo! 🎉🚀🔥
@jungokasai
Jungo Kasai 笠井淳吾
10 months
Exciting life updates! Nori ( @noriyuki_kojima ) and I co-founded Kotoba Technologies, Inc. ( @kotoba_tech ), which develops LLMs for businesses in Japan and non-English speaking countries. We have a Tokyo office in Roppongi and are expanding day by day with new projects and members.
5
27
175
0
0
5
@taoyds
Tao Yu
3 years
UnifiedSKG is still challenging for zero/few-shot learning. T0/GPT-3/Codex all struggle to reach satisfactory performance. We need future research to adapt those models to encode structured knowledge!
Tweet media one
1
0
5
@taoyds
Tao Yu
11 months
Last year's highlights include: for robotic task planning and beyond!
0
0
4
@taoyds
Tao Yu
2 years
0
0
4
@taoyds
Tao Yu
5 years
Our Text-to-SQL Challenge Series: single-turn Spider(), multi-turn SParC(), and finally conversational CoSQL()! Hope they help #NLProc to build next-generation natural language interfaces to databases!
0
1
4
@taoyds
Tao Yu
2 years
INTERPRETABILITY of the Binder program (deterministically execute it to derive the answer/prediction) can assist fine-grained error analyses and human debugging (more explicit error causes). 4/8
Tweet media one
1
0
4
@taoyds
Tao Yu
1 year
Also for the applicants: ACL registration is open, but please💥DO NOT💥register before June 5th. We will send an email to all applicants by June 5th to inform you of our decision. 1/2
1
0
3
@taoyds
Tao Yu
3 years
@alexfabbri4 @SFResearch Congrats! Alexxxxx😋
0
0
3
@taoyds
Tao Yu
2 years
DS-1000 contains 1K problems from 451 unique StackOverflow problems. Compared w. other datasets, DS-1000 is the only one that 1⃣ focuses on everyday data science applications 2⃣ includes naturalistic intents and contexts 3⃣has a reliable execution-based evaluation metric. 5/🧵
Tweet media one
Tweet media two
1
0
3
@taoyds
Tao Yu
4 months
@alsuhr @universeinanegg Thanks @alsuhr for sharing our work~🤣
0
0
2
@taoyds
Tao Yu
1 year
Successful applicants will receive a registration discount code to waive your registration fee. If you are not selected, you will still be able to register at the early registration rate. 2/2
0
0
2
@taoyds
Tao Yu
2 years
To evaluate our automatic metric, we check whether it can reject incorrect solutions. The authors manually reviewed ~3 THOUSAND Codex-002 example predictions and found that our metric is reliable. Among all solutions it accepts, only 1.8% of them are wrong. 3/🧵
Tweet media one
1
0
2
@taoyds
Tao Yu
1 year
including @LangChainAI , @MosaicML , and much more!
0
0
2
@taoyds
Tao Yu
1 year
@wittgen_ball @dragomir_radev so sad... Without him, we wouldn't have been acquainted.
0
0
2
@taoyds
Tao Yu
2 years
We used DS-1000 to benchmark five pre-trained code models from three different families. The best model Codex-002 Insertion achieves 43.3% accuracy, indicating room for improvement. 6/🧵
Tweet media one
1
0
2
@taoyds
Tao Yu
3 years
@yusuOSU Thanks! Happy Chinese New Year!
0
0
2
@taoyds
Tao Yu
3 years
@nlpnoah @uwnlp Thanks, Noah! Looking forward to working with you soon.
0
0
1
@taoyds
Tao Yu
3 years
@alexfabbri4 Haha, thanks, Alex!😊 Will visit you in NYC sometime this year~
0
0
1
@taoyds
Tao Yu
5 months
0
0
1
@taoyds
Tao Yu
3 years
@ZiyuYao Thanks, Ziyu!😀 Hope you'll soon feel settled in Virginia!
0
0
1
@taoyds
Tao Yu
4 months
@FanaHOVA thanks for the kind words! 😃
0
0
1
@taoyds
Tao Yu
1 year
0
0
1
@taoyds
Tao Yu
3 years
@JunjieHu12 @HKUniversity @ikekong @uwnlp @nlpnoah Thanks, Junjie for the advice during the search. Hope you have a great start at UWM!
0
0
1
@taoyds
Tao Yu
1 year
Also please consider sharing your memory to honor him at
0
0
1
@taoyds
Tao Yu
3 years
@jasonwu0731 Thanks, Jason! It was great working with you. Let me know when you visit HK again!!😃
0
0
1
@taoyds
Tao Yu
3 years
@mrnt0810 @HKUniversity @ikekong @uwnlp @nlpnoah Thanks, Tong! Hope to meet you again at conferences!
0
0
1
@taoyds
Tao Yu
1 year
0
0
1
@taoyds
Tao Yu
3 years
@ruizhang_nlp haha, thanks Rui. 😀
0
0
1
@taoyds
Tao Yu
3 years
@CaimingXiong @HKUniversity @ikekong @uwnlp @nlpnoah Thank you, Caiming! Very fortunate to be advised by you.😀
0
0
1