A Design Space for Intelligent and Interactive Writing Assistants
#CHI2024
👩🏻✏️🤖
What writing assistants do you use? What else is out there and how do they differ? What do we need to consider when designing new writing assistants?
🔗 (1/6)
CoAuthor: Human-AI Collaborative Writing Dataset
#CHI2022
👩🦰🤖 CoAuthor captures rich interactions between 63 writers and GPT-3 across 1445 writing sessions
Paper & dataset (replay):
Joint work with
@percyliang
@fabulousQian
🙌
I'll present and defend my work on writing with AI over the past 6 years 👩🦰🤖✍️
If you are interested in the topic and happen to be free this upcoming Tuesday at 4-5PM PT, you are welcome to attend my defense over Zoom!
Language models (LMs) are already deployed in many real-world applications and used to interact with users 👩🦰, but these models are primarily evaluated non-interactively.
How can we evaluate LMs interactively and why is it important? (1/8)
I'll be joining the University of Chicago (
@UChicagoCS
and
@DSI_UChicago
) as an assistant professor in the summer of 2024. 👩🏻✨
Excited to start the Communication & Intelligence group with Ari Holtzman (
@universeinanegg
) and Chenhao Tan (
@ChenhaoTan
)!
Announcing the Communication & Intelligence (C&I) group at UChicago!
Comprised of
@universeinanegg
,
@MinaLee__
,
@ChenhaoTan
—C&I will tackle how AI and communication co-evolve as LLMs break long-held assumptions.
We're recruiting PhDs & Postdocs for 2024!
We are open-sourcing the CoAuthor interface for those who are interested in using a text editor to record and replay human-LM collaborative writing sessions!👩🦰🤖✍️
Code: (joint work with
@vishakh_pk
)
I got selected as one of the Korean Innovators Under 35 by MIT Technology Review! 🇰🇷🏆
Happy and honored! Will do my best to continue to learn, grow, and do good research. ☺️
Honored to be on the list! 🙏
I’m actively recruiting students who are interested in AI and writing to understand how AI will change the way we communicate. ✍️ Please consider applying to
@UChicagoCS
and
@DSI_UChicago
!
More info:
Can we simulate both a user👩🦰 and system 🤖 and learn to autocomplete in an unsupervised way?
Yes! We frame the autocomplete task as a cooperative communication game.
#StanfordNLP
Talk: Dec 14 9:45-10AM (West 118)
#NeurIPS2019
How to fill in the __blanks__ using language models
1. Download your favorite language model ❤️
2. Fine-tune the model on infilling examples 🤖
3. Use the model to fill in any number of blanks in text! 😮😮😮
#StanfordNLP
#ACL2020NLP
@chrisdonahuey
📢 Intelligent & Interactive Writing Assistants Workshop
#In2Writing
#ACL2022
We invite NLP and HCI researchers as well as industry practitioners and professional writers to build, improve, and evaluate AI-powered writing assistants! 👩🦰🤖✏️
Finally releasing my old book on how to apply to grad school 📕 (written in Korean) as a PDF file 😊
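If you know anyone who is interested in U.S. graduate schools, please share it with them ❤️ I keep meaning to write more blog posts too, but the writing isn't coming easily 🫠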
Curious how writers write with language models / how industries build writing assistants / how researchers evaluate their performance?
Workshop on Intelligent and Interactive Writing Assistants
#In2Writing
is happening tomorrow at
#ACL2022
! 🤖👩🦰✏️
With Sherry Wu, Mina Lee, Ken Holstein, Vera Liao, Hari Subramonyam, and Min Kyung Lee, I am organizing a CSCW panel to discuss how LLMs could influence CSCW and social computing research & vice versa.
What would you want to ask the panel?
When building AI writing support tools, what are the most important factors to consider? How can we make them safe and inclusive? How will they change the way we write?
Join the conversation at our workshop on Intelligent and Interactive Writing Assistants!
#CHI2023
✍️👩💻
The 2nd
#In2Writing
workshop will be at
#CHI2023
!
This year we invite 🔥2-page position papers🔥 that portray thoughts on writing assistants (see CFP). Submit a paper to join the in-person event!
🗓️Submission Deadline: 2/23
🗓️Workshop Date: 4/23 (Sun)
🔗
In a classroom setting, how can we harness the potential while minimizing the risks of large language models such as ChatGPT? 👩🏫👨🎓
Check out this article by Inside Higher Ed:
How can we synthesize programs with functional correctness? 🤖✔️
We release the SPoC dataset (18K programs + human-authored pseudocode). Poster
#169
at 5-7PM today!
#NeurIPS2019
SPoC: Search-based Pseudocode to Code (
@sumith1896
,
@IcePasupat
et al.)
Valentine's Day + AI = AI in Dating Apps? ❤️💔
It was both interesting and concerning to think about the use cases of AI for dating. Thanks
@mollyglick
for interviewing me!
For
@inversedotcom
’s Future of Love package, I explored how AI could change dating forever — from machine learning matchmakers to increasingly monotonous convos:
For example, datasets can answer:
✏️Can GPT-3 contribute new ideas to one's writing?
✏️Does this ideation capability fluctuate in different writing contexts?
✏️To what extent does it fluctuate when decoding parameters change?
Find answers to these questions in our paper!
Thank you
@fabulousQian
! Check out our paper and dataset here: ✍️
#CHI2022
By the way, I seem to have closed my eyes for all the photos for some reason 🫠
PC:
@joon_s_pk
Fantastic talk -
@MinaLee__
’s
#chi2022
presentation of our Best Paper Honorable Mention paper “CoAuthor” (in collaboration with
@percyliang
)— A much needed discussion on the capabilities and limits of LLMs when interacting with users!
Language models have highly context-dependent capabilities that are often subjectively interpreted 🤔
We argue that datasets can be one way to understand language models' generative capabilities in relation to interaction design, under various definitions of good collaboration!
We release the reviewed papers along with their design space annotations to help you navigate existing writing assistants:
🔗
This is a living artifact and we invite you to add new papers, features, and discussions to track future developments. (3/6)
We propose a design space as a structured way to explore the space of writing assistants.
By systematically reviewing 115 papers from HCI and NLP, we identify key design considerations around task, user, technology, interaction, and ecosystem. (2/6)
We design five tasks, ranging from goal-oriented to open-ended, to capture a variety of different interactions. Then, we evaluate three variants of OpenAI’s GPT-3 and AI21’s Jurassic-1 by constructing interactive systems that connect LMs and users via an interface. (3/8)
From 1000+ human-LM interactions, we first find that non-interactive performance and interactive performance can diverge. For example, a model with the worst accuracy achieved the best performance as an interactive LM assistant in certain contexts. (4/8)
Second, users sometimes perceived LMs to be more helpful than they are. For instance, users were easily deceived by misinformation from LMs, especially when the outputs were fluent. We further observe that short prompts tend to exacerbate misinformation and toxicity. (5/8)
Lastly, we observe a discrepancy between metrics based on third-party and first-person perspectives, suggesting that what users find helpful in interacting with and improving these models over time might not be captured by standard benchmarking practices. (6/8)
If you are interested in human-machine interactive text revision, please check out our paper, which won 🎉the best paper award🎉 at the
#In2Writing
workshop at
#ACL2022
! We attach more details below.
Paper:
Demo:
We develop a framework, HALIE, that captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality. (2/8)
@AliceAlbrecht
@percyliang
@fabulousQian
There has been a big power outage at Stanford, which is affecting the server that’s hosting the website ☠️ Hopefully it will get fixed soon! Meanwhile, you can access it via - happy researching! 😊
@m_guerini
@percyliang
@fabulousQian
2/n During data collection, we didn't analyze the factuality of GPT- or human-written text, as we were not experimenting with any kind of intervention.
However, I'm quite certain that GPT-3 must have hallucinated, which might have made humans generate invalid arguments as a result.
@m_guerini
@percyliang
@fabulousQian
3/3 In fact, one participant reported "there was not a single source cited that was accurate--It had quotes from unrelated authors, studies, websites" 😂 It would be great future work to analyze how GPT-3 impacts humans' factuality (similar to named entity analysis in the paper)!
@m_guerini
@percyliang
@fabulousQian
1/n Enjoyed reading your paper! It is amazing that your annotators underwent thorough training for two weeks and were asked to check the veracity of model outputs 😮
While thinking about the topic, I came across "AI and education: guidance for policy-makers" by UNESCO, which includes benefit-risk assessments and policy recommendations for using AI in education.
What other reading resources would you recommend?