Terra Blevins Profile
Terra Blevins

@TerraBlvns

Followers: 754 · Following: 468 · Media: 21 · Statuses: 82

Postdoc @ViennaNLP and incoming asst professor @Northeastern @KhouryCollege || Formerly @uwcse || she/her

Seattle, WA
Joined July 2016
Pinned Tweet
@TerraBlvns
Terra Blevins
22 days
Universal NER is gearing up for our next data release!! We're still looking for many commonly spoken languages (Spanish, Hindi, and more!), so check out the blog post and Discord if you want to help build UNER v2 ⬇️
@mayhewsw
Stephen Mayhew
22 days
The Universal NER project had a great year 🎉, with a data release and a NAACL paper. Now we're gearing up for the next one, aiming to add 7 more languages by the end of the year. Want to help out? Discord here: Read more here:
3
9
34
0
2
14
@TerraBlvns
Terra Blevins
10 days
I’m very excited to join @Northeastern @KhouryCollege as an assistant professor starting Fall '25!! Looking forward to working with the amazing people there! Until then I'll be a postdoc at @ViennaNLP with Ben Roth, so reach out if you want to meet up while I'm over in Europe ✨
28
15
281
@TerraBlvns
Terra Blevins
2 years
Are any large-scale pretrained models truly monolingual? In new work with @LukeZettlemoyer, we find that automatic data collection methods leak millions of non-English tokens into popular pretraining corpora. (1/3) ✨Paper✨:
Tweet media one
4
40
212
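The analysis in this thread is essentially a language-identification audit of "English" pretraining text. Below is a minimal sketch of such an audit; the off-the-shelf langdetect package and whitespace tokenization are simplifying stand-ins, not the paper's actual pipeline.

```python
# Rough sketch of auditing an "English" corpus for out-of-language tokens.
# langdetect and whitespace tokenization are illustrative assumptions.
from langdetect import detect

def count_out_of_language_tokens(lines, expected="en"):
    out_of_lang = total = 0
    for line in lines:
        tokens = line.split()
        total += len(tokens)
        try:
            lang = detect(line)
        except Exception:              # very short or unintelligible lines
            continue
        if lang != expected:
            out_of_lang += len(tokens)
    return out_of_lang, total

flagged, total = count_out_of_language_tokens([
    "The cat sat on the mat.",
    "El gato se sentó en la alfombra.",
])
print(f"{flagged}/{total} tokens flagged as non-English")
```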
@TerraBlvns
Terra Blevins
7 months
Expert language models go multilingual! Introducing ✨X-ELM✨ (Cross-lingual Expert Language Models), a multilingual generalization of the BTM paradigm to efficiently and fairly scale model capacity for many languages! Paper:
Tweet media one
2
40
174
@TerraBlvns
Terra Blevins
2 years
When do multilingual LLMs learn to transfer between languages? In our new preprint, we probe XLM-R across training checkpoints and find that in-language and cross-lingual training dynamics are very different! 💫Paper: (w/ @hila_gonen and @LukeZettlemoyer)
Tweet media one
2
20
147
@TerraBlvns
Terra Blevins
2 years
Very excited this work is accepted to #EMNLP2022! Please take a look if you're interested in how monolingual models learn cross-lingual transfer
@TerraBlvns
Terra Blevins
2 years
Are any large-scale pretrained models truly monolingual? In new work with @LukeZettlemoyer, we find that automatic data collection methods leak millions of non-English tokens into popular pretraining corpora. (1/3) ✨Paper✨:
Tweet media one
4
40
212
0
10
112
@TerraBlvns
Terra Blevins
2 years
What do autoregressive LMs know about linguistic structure? We introduce structured prompting, an approach for probing LMs by extending prompting to sequence tagging tasks without training. 🌟Paper: (w/ @hila_gonen and @LukeZettlemoyer)
2
19
93
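The structured prompting setup described above can be pictured with a small sketch: a frozen causal LM tags a sentence word by word, and each predicted tag is fed back into the prompt before the next word is scored. Here gpt2, the toy tag set, and the "word / TAG" prompt format are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of structured prompting for sequence tagging with a frozen LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
TAGS = ["NOUN", "VERB", "DET", "ADJ", "ADP", "PUNCT"]  # toy label set

def tag_score(prompt: str, tag: str) -> float:
    """Sum of the log-probabilities the LM assigns to the tag tokens after the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    tag_ids = tokenizer(" " + tag, return_tensors="pt", add_special_tokens=False).input_ids
    ids = torch.cat([prompt_ids, tag_ids], dim=1)
    with torch.no_grad():
        logprobs = torch.log_softmax(model(ids).logits[0, :-1], dim=-1)
    start = prompt_ids.shape[1] - 1            # positions that predict the tag tokens
    return sum(logprobs[start + i, tag_ids[0, i]].item() for i in range(tag_ids.shape[1]))

def tag_sentence(words, demonstrations=""):
    prompt, preds = demonstrations, []
    for word in words:
        prompt += f"{word} /"                  # ask for this word's tag
        best = max(TAGS, key=lambda t: tag_score(prompt, t))
        preds.append(best)
        prompt += f" {best}\n"                 # feed the prediction back into the context
    return preds

print(tag_sentence(["The", "cat", "sleeps", "."]))
```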
@TerraBlvns
Terra Blevins
1 year
New paper alert!! ✨ Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models (PLMs) ✨ We evaluate how well PLMs translate words in context and then leverage this prompting setup to perform zero-shot WSD on 18 languages! 1/n
Tweet media one
1
25
64
@TerraBlvns
Terra Blevins
3 years
WSD models for English perform as well as humans on common senses, but worse on rare or novel senses. FEWS, our new dataset at #EACL2021, focuses on evaluating WSD models on these challenging senses in a low-shot setting. (1/3)
Tweet media one
Tweet media two
1
15
55
@TerraBlvns
Terra Blevins
2 years
This work will be at #EMNLP2022! Now with new experiments exploring how linguistic knowledge changes across layers over time -- we find that information is acquired at the final layer and then pushed down to lower ones during pretraining
@TerraBlvns
Terra Blevins
2 years
When do multilingual LLMs learn to transfer between languages? In our new preprint, we probe XLM-R across training checkpoints and find that in-language and cross-lingual training dynamics are very different! 💫Paper: (w/ @hila_gonen and @LukeZettlemoyer)
Tweet media one
2
20
147
1
7
51
@TerraBlvns
Terra Blevins
1 year
🎆 Super excited that this paper was accepted at #ACL2023NLP! Check it out if you're interested in how in-context learning works ➡️ We find prompting is heavily affected by task knowledge in the pretraining data, but can still generalize to unseen but descriptive labels.
@TerraBlvns
Terra Blevins
2 years
What do autoregressive LMs know about linguistic structure? We introduce structured prompting, an approach for probing LMs by extending prompting to sequence tagging tasks without training. 🌟Paper: (w/ @hila_gonen and @LukeZettlemoyer)
2
19
93
1
7
37
@TerraBlvns
Terra Blevins
2 months
I'm presenting our new ✨Universal NER dataset✨ at 2 PM tomorrow at #NAACL2024 (in Don Diego, Poster Session 2). Stop by if you're interested in multilingual benchmarks and/or cross-lingual NER!
@tellarin
Börje Karlsson
6 months
Quality multilingual annotated data is always scarce, so I'm extra happy to see ✨Universal NER✨ has been accepted at #NAACL2024. We hope the project will help address the data gap and facilitate new multilingual/cross-lingual research! 🎉 Preprint:
Tweet media one
1
12
74
0
8
34
@TerraBlvns
Terra Blevins
1 year
💫 New paper! 💫 TLDR: We explore the embedding space of XLM-R and show that we can effectively reinitialize the vocabulary with simple heuristics mimicking the structure! This lets us more efficiently adapt the model to new languages and downstream tasks 🔋
@cmdowney
C.M. Downey
1 year
What's the best way to specialize multilingual LMs for new languages? We address this in our new paper! Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages () With @terrablvns, Nora Goldfine, and @ssshanest
Tweet media one
1
1
19
1
4
27
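One simple heuristic in the spirit of this thread (an illustration, not necessarily the paper's exact method): initialize each new token's embedding from the old embeddings of its pieces under the original tokenizer, so the new vocabulary inherits the structure of the old embedding space.

```python
# Sketch of a structure-preserving vocabulary reinitialization heuristic.
# The averaging rule and the xlm-roberta-base example below are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

def init_new_embeddings(old_embeddings, old_tokenizer, new_vocab):
    """old_embeddings: [old_vocab_size, dim] tensor; new_vocab: list of new token strings."""
    dim = old_embeddings.size(1)
    new_emb = torch.empty(len(new_vocab), dim)
    for i, token in enumerate(new_vocab):
        piece_ids = old_tokenizer(token, add_special_tokens=False).input_ids
        if piece_ids:                                    # average the old pieces
            new_emb[i] = old_embeddings[piece_ids].mean(dim=0)
        else:                                            # fallback: global mean
            new_emb[i] = old_embeddings.mean(dim=0)
    return new_emb

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
base = AutoModel.from_pretrained("xlm-roberta-base")
new = init_new_embeddings(base.get_input_embeddings().weight.data, tok, ["Grüße", "नमस्ते"])
print(new.shape)
```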
@TerraBlvns
Terra Blevins
6 months
I'm excited to present our work on "Translate to Disambiguate" at #EACL2024 🇲🇹. Come by the Multilingual Issues oral session tomorrow (03/19) at 10:30 in Marie Louise to learn more!!
@TerraBlvns
Terra Blevins
1 year
New paper alert!! ✨ Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models (PLMs) ✨ We evaluate how well PLMs translate words in context and then leverage this prompting setup to perform zero-shot WSD on 18 languages! 1/n
Tweet media one
1
25
64
0
1
26
@TerraBlvns
Terra Blevins
9 months
Excited to share this #EMNLP2023 Findings paper here in Singapore! I'll present this tomorrow (Dec. 7) at 11:30 in the East Foyer, and again on Dec. 9 at 9am in the East Foyer. Come chat about how we can 🪄demystify our prompts
@hila_gonen
Hila Gonen
2 years
Why aren't all prompts created equal? Check out our preprint "Demystifying Prompts in Language Models via Perplexity Estimation". Joint work with @sriniiyer88 @TerraBlvns @nlpnoah @LukeZettlemoyer 🧵 paper:
Tweet media one
5
28
148
0
3
26
@TerraBlvns
Terra Blevins
2 years
Prior work has found that monolingual models transfer surprisingly well across languages -- we show that cross-lingual performance is strongly correlated with the amount of data leaked in the pretraining corpus. (3/3)
2
1
26
@TerraBlvns
Terra Blevins
1 year
Super excited about this article on some of the more surprising findings of our #ACL2023 paper on structured prompting! Stop by our poster (Tues, 9am-10:30 EST) to chat if you want to learn more ✨
@gradientpub
The Gradient
1 year
LLMs are amazing, but they are also highly sensitive to how you write the input prompt. Their performance can vary greatly when only slight changes to the prompting format are made -- even when these changes mean the same thing to a human. Learn more:
1
10
29
0
6
20
@TerraBlvns
Terra Blevins
10 months
This project has been so much fun to work on! Check out this thread to learn about our new multilingual NER dataset, 🪐Universal NER🌟
@mayhewsw
Stephen Mayhew
10 months
🚨 New Dataset Alert 🚨 I'm extremely excited to announce Universal NER v1, available now. It consists of gold-standard human annotations for 18 datasets covering 12 languages, based on Universal Dependencies texts. This is the first data release of the UNER project. 1/3
Tweet media one
5
53
236
0
7
21
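Token-level NER annotations like these are typically distributed as CoNLL-style column files, so a generic reader is enough to get started. The column layout and the file name below are assumptions for illustration, not UNER's official spec.

```python
# Generic CoNLL-style reader: one token per line, blank line between sentences.
# Assumes the token is in the first column and the NER tag in the last.
def read_conll(path):
    sentences, tokens, tags = [], [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith("#"):            # skip comment/metadata lines
                continue
            if not line:                        # blank line ends a sentence
                if tokens:
                    sentences.append((tokens, tags))
                    tokens, tags = [], []
                continue
            fields = line.split("\t")
            tokens.append(fields[0])
            tags.append(fields[-1])             # e.g. B-PER / I-LOC / O
    if tokens:
        sentences.append((tokens, tags))
    return sentences

# sentences = read_conll("uner-en-train.conll")   # hypothetical file name
```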
@TerraBlvns
Terra Blevins
2 years
And we find the label set choice makes a big difference! Shuffling the task labels confuses GPT-NeoX and hurts performance, but the model can still learn these tasks in context from English words as tags and even from unrelated proxy labels, like numbers.
Tweet media one
Tweet media two
1
0
11
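The label-set manipulations described here amount to remapping the tags shown in the demonstrations. A tiny sketch, with illustrative mappings (the shuffle is not guaranteed to move every label, and the numeric proxies are arbitrary):

```python
# Illustrative label-set manipulations: original tags, shuffled tags, numeric proxies.
import random

ORIGINAL = ["NOUN", "VERB", "DET", "ADJ"]

def remap(tags, scheme, seed=0):
    if scheme == "shuffled":
        rng = random.Random(seed)
        permuted = ORIGINAL[:]
        rng.shuffle(permuted)
        mapping = dict(zip(ORIGINAL, permuted))
    elif scheme == "proxy":
        mapping = {t: str(i) for i, t in enumerate(ORIGINAL)}   # e.g. NOUN -> "0"
    else:
        mapping = {t: t for t in ORIGINAL}
    return [mapping[t] for t in tags]

print(remap(["DET", "NOUN", "VERB"], "proxy"))   # ['2', '0', '1']
```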
@TerraBlvns
Terra Blevins
2 years
(3) For many languages and language pairs, performance at the final checkpoint is lower than the best seen during pretraining. Consequently, the best checkpoint according to our probes varies by language
Tweet media one
0
0
8
@TerraBlvns
Terra Blevins
2 years
These findings indicate that model understanding of linguistic structure is more general than task memorization, and this approach to prompting opens up a new avenue for analyzing large autoregressive LMs. Check out the paper for more analysis and findings!
1
1
8
@TerraBlvns
Terra Blevins
2 years
Data analysis of the Pile shows that these results could be due to labeled task data seen during pretraining -- to control for this, we experiment with using different label sets with structured prompting
Tweet media one
1
0
7
@TerraBlvns
Terra Blevins
3 years
We find current SoTA methods lag behind humans on FEWS, indicating that it will support significant future work on low-shot WSD. Work done with @mandarjoshi_ and @lukezettlemoyer. (3/3)
Tweet media one
Tweet media two
1
1
6
@TerraBlvns
Terra Blevins
2 years
(2) Syntax is consistently learned before semantics within language, but, interestingly, this ordering doesn't necessarily hold when transferring between languages
1
0
5
@TerraBlvns
Terra Blevins
2 years
We find that: (1) XLM-R captures in-language linguistics very early on but can take significantly longer to acquire cross-lingual behavior
Tweet media one
Tweet media two
1
0
5
@TerraBlvns
Terra Blevins
2 years
Specifically, we see that although the overall percentages of non-English text in these corpora are small, this corresponds to millions of out-of-language tokens (2/3)
Tweet media one
2
0
4
@TerraBlvns
Terra Blevins
7 months
A key insight behind multilingual branch-train-merge (x-BTM) is that language typology is hierarchical: we design a new data clustering method to build a balanced typological tree, cluster languages by similarity, and train X-ELMs on these data clusters... 3/n
Tweet media one
1
0
4
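One way to picture the clustering step is agglomerative clustering over typological feature vectors. The toy binary features and scipy's off-the-shelf hierarchical clustering below are illustrative assumptions; the paper's balanced-tree construction is not reproduced here.

```python
# Sketch: cluster languages by similarity of toy typological feature vectors.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

LANGS = ["en", "de", "fr", "es", "ru", "hi", "zh", "ja"]
# toy binary typological features (e.g. word-order and morphology flags)
features = np.random.default_rng(0).integers(0, 2, size=(len(LANGS), 16))

tree = linkage(features, method="average", metric="hamming")  # hierarchical tree
clusters = fcluster(tree, t=4, criterion="maxclust")          # cut into 4 clusters
for lang, c in zip(LANGS, clusters):
    print(lang, "-> cluster", c)
```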
@TerraBlvns
Terra Blevins
1 year
@yong_zhengxin No, you aren't missing anything! This is a summary of some of the results in Table 1
0
0
1
@TerraBlvns
Terra Blevins
7 months
tl;dr: We train multiple LMs on different subsets of multilingual data, and this improves modeling on 🌟every language🌟 over a single model trained on all the data (and languages) shared across expert LMs! 2/n
1
0
3
@TerraBlvns
Terra Blevins
2 years
Models like GPT-Neo (below) achieve strong few-shot performance on structure prediction tasks (e.g., POS tagging, sentence chunking, and NER), and this performance scales with model and demonstration set size
Tweet media one
Tweet media two
1
0
3
@TerraBlvns
Terra Blevins
7 months
❔Can we add new languages to X-ELM? Also yes! With multi-round x-BTM, we branch from existing experts and train on new languages (LAPT) or settings. This efficiently adapts X-ELM to new languages without forgetting current languages, as existing experts remain unchanged 6/n
Tweet media one
1
0
3
@TerraBlvns
Terra Blevins
1 year
We find that adding context to the word-level translation task (C-WLT) improves performance for both English and multilingual PLMs: it helps fix translation errors and allows the model to generate rarer synonyms in context 📈 2/n
Tweet media one
Tweet media two
1
0
3
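Illustrative prompt templates contrasting plain word-level translation (WLT) with the contextual variant (C-WLT) described above; the exact wording of the paper's templates is an assumption here.

```python
# Toy WLT vs. C-WLT prompt templates (illustrative wording).
def wlt_prompt(word: str, target_lang: str) -> str:
    return f'The word "{word}" can be translated into {target_lang} as:'

def cwlt_prompt(sentence: str, word: str, target_lang: str) -> str:
    return (
        f'Sentence: "{sentence}"\n'
        f'In the above sentence, the word "{word}" can be translated into '
        f"{target_lang} as:"
    )

print(wlt_prompt("bank", "French"))
print(cwlt_prompt("She sat on the bank of the river.", "bank", "French"))
```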
@TerraBlvns
Terra Blevins
7 months
❔Does this improve multilingual language modeling performance? Yes! We see perplexity improvements over both the seed model (XGLM-1.7B) and the compute-matched dense baseline that is trained on all of the languages 5/n
Tweet media one
1
0
2
@TerraBlvns
Terra Blevins
2 years
@evgeniyzhe Good question -- it's due to a tokenization artifact in some of the models. I've added Appendix B to the paper to discuss this (on Arxiv soon, but you can read it here: )
1
0
2
@TerraBlvns
Terra Blevins
7 months
...with x-BTM! 1. Branch from an existing multilingual LM to give the X-ELMs a shared starting point 2. Train each expert on its assigned data cluster, completely asynchronously from the other X-ELMs 3. Merge the experts into an X-ELM set for inference 4/n
Tweet media one
1
0
2
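A rough, self-contained sketch of those three steps, with a tiny LM and random token batches standing in for real multilingual data; the uniform output ensembling at the end is just one plausible way to use the merged expert set, not necessarily the paper's inference strategy.

```python
# Toy branch-train-merge: branch a seed LM, train experts per cluster, merge at inference.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    """Stand-in for a pretrained multilingual seed LM."""
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.out = nn.Linear(dim, vocab)
    def forward(self, ids):
        return self.out(self.emb(ids))

def train_expert(expert, batches, lr=1e-3):
    """2. Continue training one branched expert on its own data cluster."""
    opt = torch.optim.Adam(expert.parameters(), lr=lr)
    for ids in batches:
        logits = expert(ids[:, :-1])
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               ids[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return expert

seed = TinyLM()                                          # 1. shared multilingual seed LM
clusters = {c: [torch.randint(0, 100, (8, 16)) for _ in range(10)] for c in range(4)}
experts = {c: train_expert(copy.deepcopy(seed), data)    # branch + train, which can
           for c, data in clusters.items()}              # run fully asynchronously

def ensemble_logprobs(ids, experts):
    """3. 'Merge' at inference by averaging the experts' output distributions."""
    probs = torch.stack([e(ids).softmax(-1) for e in experts.values()]).mean(0)
    return probs.log()

print(ensemble_logprobs(torch.randint(0, 100, (1, 16)), experts).shape)
```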
@TerraBlvns
Terra Blevins
2 years
@Wjrgo @srchvrs @mayhewsw @yanaiela @ryandcotterell @LukeZettlemoyer I don't think the model could handle a script not seen during pretraining -- it wouldn't have subword embeddings to handle those cases (or, if you included them, they'd be random/untrained). Appendix B of our paper discusses tokenization effects and how they affect performance.
1
0
2
@TerraBlvns
Terra Blevins
1 year
Inspired by this, we propose WSD via C-WLT, a zero-shot method for multilingual word sense disambiguation. By ensembling translations from typologically diverse languages, this approach can achieve similar recall on WSD to fully supervised prior work with no further training. 3/n
Tweet media one
1
0
1
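A toy sketch of that ensembling idea, assuming a made-up two-sense inventory for "bank": each candidate sense gets one vote per pivot language whose C-WLT prediction appears among that sense's known translations, and the sense with the most votes wins.

```python
# Toy zero-shot WSD by voting over C-WLT translations from several pivot languages.
SENSE_TRANSLATIONS = {
    "bank.financial": {"fr": {"banque"}, "de": {"Bank"}, "es": {"banco"}},
    "bank.river":     {"fr": {"rive", "berge"}, "de": {"Ufer"}, "es": {"orilla"}},
}

def disambiguate(predicted_translations):
    """predicted_translations: language code -> translation produced via C-WLT."""
    def votes(sense):
        return sum(predicted_translations.get(lang) in words
                   for lang, words in SENSE_TRANSLATIONS[sense].items())
    return max(SENSE_TRANSLATIONS, key=votes)

print(disambiguate({"fr": "banque", "de": "Bank", "es": "banco"}))  # bank.financial
```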
@TerraBlvns
Terra Blevins
3 years
FEWS has high sense coverage across different domains and provides: (1) a large training set that covers many more senses than previous datasets and (2) a comprehensive evaluation set containing few- and zero-shot examples of a wide variety of senses. (2/3)
Tweet media one
Tweet media two
1
1
1
@TerraBlvns
Terra Blevins
3 years
Here’s a link to the FEWS dataset website:
0
0
1
@TerraBlvns
Terra Blevins
2 years
@yanaiela I was at Meta/Facebook while working on this project!
0
0
1
@TerraBlvns
Terra Blevins
2 years
We've released the intermediate model checkpoints at , so check them out if you're interested in exploring multilingual pretraining dynamics further!
0
0
1
@TerraBlvns
Terra Blevins
1 year
🔍 But the choice of model matters! While the multilingual PLMs disambiguate well in languages they saw during pretraining, larger "English" models actually generalize better to other unseen languages 4/n
Tweet media one
2
0
1