Isabel Papadimitriou Profile
Isabel Papadimitriou

@isabelpapad

Followers: 929 · Following: 124 · Media: 12 · Statuses: 56

PhD student at @stanfordnlp, working with @jurafsky. Incoming fellow at @KempnerInst and (then) assistant professor at @UBCLinguistics.

Joined November 2020
Pinned Tweet
@isabelpapad
Isabel Papadimitriou
1 year
Check out our new paper with @jurafsky! What inductive learning biases influence language learning, and how? We pretrain transformer models on different structures and fine-tune on English to test the learning effects of different inductive biases
[image]
2 replies · 54 retweets · 254 likes
@isabelpapad
Isabel Papadimitriou
2 years
🧅 BERT does well without word order, but most sentences are easy! You know that the unordered words chopped+chef+onions describe a chef chopping onions and not the other way around. What if we test on cases where word order matters? @rljfutrell @kmahowald
[image]
5 replies · 37 retweets · 262 likes
@isabelpapad
Isabel Papadimitriou
2 years
We tend to evaluate multilingual models with downstream tasks. But what if there are more subtle ways that fluency is affected, like having an English “accent” when learning a second language? Check out our new paper with Kezia Lopez and @jurafsky!
[image]
2 replies · 38 retweets · 206 likes
@isabelpapad
Isabel Papadimitriou
3 months
Really really excited to be joining @UBCLinguistics! I'm so happy to get to work with the lovely people in the department. I'll be going to @KempnerInst in the interim, again lucky to work with lovely, interdisciplinary people. I'd love to hang if you're in Boston or Vancouver!
@UBCLinguistics
UBC Linguistics
3 months
We are thrilled that Isabel Papadimitriou (@isabelpapad) will be joining @UBCLinguistics as an Assistant Professor as of Sept 2025!
1 reply · 10 retweets · 82 likes (quoted tweet)
25 replies · 12 retweets · 180 likes
@isabelpapad
Isabel Papadimitriou
4 years
"Learning Music Helps You Read", our #emnlp2020 paper with @jurafsky ! We use transfer learning between language, music, code and parentheses. Some exciting results about encoding and transfer of abstract structure, and the role (or not!) of recursion
Tweet media one
1
29
158
@isabelpapad
Isabel Papadimitriou
4 years
Abstract morphosyntax? In Multilingual BERT?! Yes! Check out our #EACL2021 paper, where we look at how Multilingual BERT representations are influenced by high-order properties of whole languages (rather than just features of inputs). With @rljfutrell @ethanachi @kmahowald
[image]
2 replies · 17 retweets · 92 likes
@isabelpapad
Isabel Papadimitriou
1 month
Isaac does some of the most impactful NLP work that I know! This is not just 'Google LM magic', it's the result of extremely hard-nosed and outside-the-box data work and linguistic work, as well as working with speakers. And the whole combo that makes Isaac Isaac!
@iseeaswell
Isaac R Caswell
1 month
Excited to announce that 110 languages got added to Google Translate today! Time for context on these languages, especially the communities who helped a lot over the past few years, including Cantonese, NKo, and Faroese volunteers. Also, a 110-language youtube playlist. 🧵
15 replies · 63 retweets · 239 likes (quoted tweet)
2 replies · 5 retweets · 61 likes
@isabelpapad
Isabel Papadimitriou
2 years
I'm at NAACL in person (come chat!) and very excited to be giving a keynote at the @sig_typ workshop! I'll talk about “What we can learn about language from exploring multilingual language models”. Bold title, will she hedge in the talk? Come see for yourself this Thursday at 3:30!
0 replies · 8 retweets · 55 likes
@isabelpapad
Isabel Papadimitriou
3 years
Released multilingual datasets often contain corpora that are not in the language they claim to be. This paper is a big annotation effort to document this in detail! My takeaway: take a look at sample sentences when using a corpus, and be cautious around multilingual language models
@iseeaswell
Isaac R Caswell
3 years
Does the data used for multilingual modeling really contain content in the languages it says it does? Short answer: sometimes 🙁 1/n
7 replies · 53 retweets · 126 likes (quoted tweet)
1 reply · 2 retweets · 29 likes
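A minimal sketch of the sanity check suggested above, with a hypothetical file path: sample random lines from a corpus and eyeball whether they are actually in the language the corpus claims to be.

```python
# A sketch of the suggested sanity check; the corpus path is a placeholder.
import random

def eyeball_corpus(path: str, n: int = 20, seed: int = 0) -> None:
    """Print a random sample of non-empty lines from a corpus file."""
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    for line in random.Random(seed).sample(lines, min(n, len(lines))):
        print(line)

eyeball_corpus("my_multilingual_corpus.txt")  # hypothetical file name
```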
@isabelpapad
Isabel Papadimitriou
2 years
Lovely to work on this with @ZhengxuanZenWu and @AlexTamkin! Check out our controlled studies if you've ever wondered about the effects of tokenization, embedding reinitialization, and structural shift in cross-lingual transfer 🍵
@AlexTamkin
Alex Tamkin
2 years
What makes it hard for NLP models to learn a new language? New work "spilling the tea" on crosslingual transfer with @ZhengxuanZenWu and @isabelpapad! Thread👇 1/
[image]
4 replies · 31 retweets · 140 likes (quoted tweet)
0 replies · 8 retweets · 26 likes
@isabelpapad
Isabel Papadimitriou
7 months
Looking to crispen up your research narrative, present an in-progress idea, or get supportive feedback on your paper? And present at NAACL? Submit to the Student Research Workshop! We have tracks for both papers and draft thesis proposals. See you there!
@naacl_srw
NAACL SRW
8 months
📢 Exciting News! 📢 The Call For Papers for NAACL SRW 2024 is officially open! 🚀 🗓️ Mentoring Program Deadline: Jan 19, 2024 📝 Paper Submission Deadline: Mar 1, 2024 Don't miss this opportunity to showcase your research and contribute to the advancement of NLP! 💡
1 reply · 3 retweets · 3 likes (quoted tweet)
0 replies · 5 retweets · 17 likes
@isabelpapad
Isabel Papadimitriou
5 months
The NAACL Student Research Workshop is short a few reviewers! Please consider volunteering to review. This is a great way to give back to the community and help students publish at their first *CL conference! We really appreciate it
3 replies · 4 retweets · 11 likes
@isabelpapad
Isabel Papadimitriou
2 years
In our #NLProc #acl2022nlp paper, “When classifying grammatical role, BERT doesn't care about word order... except when it matters”, we compare BERT on an argument role probing task for easy sentences vs. hard (non-prototypical) sentences, where you really do need word order.
1 reply · 0 retweets · 10 likes
@isabelpapad
Isabel Papadimitriou
2 months
Good morning everyone in Mexico for NAACL! Join us at 9:30 for the Student Research Workshop panel! PhD students and undergrads: come with questions about research, grad school, job market, or anything you might want to hear from our lovely panelists!
@naacl_srw
NAACL SRW
2 months
Kicking off tomorrow (Tue) #NAACL2024 with our amazing panel at 9:30-10:30. We’ll chat with @jieyuzhao11 @navitagoyal_ @shi_weiyan @krisgligoric about challenges & advice for student researchers 🎓. Also, taking a group photo at the end of the poster session. Come & join us 😎
[image]
0 replies · 3 retweets · 12 likes (quoted tweet)
0 replies · 1 retweet · 9 likes
@isabelpapad
Isabel Papadimitriou
1 year
Language models give us a cool testing ground to examine issues around the learnability of language, and examine/expand the hypothesis space around structural underpinnings of language learning.
0 replies · 1 retweet · 9 likes
@isabelpapad
Isabel Papadimitriou
2 years
We look at Spanish pronoun drop and Greek subject-verb order, features that can take one form that is English-parallel, and one that is not. We show that multilingual BERT is biased towards English-parallel forms compared to the monolingual models BETO and GreekBERT.
1 reply · 1 retweet · 9 likes
@isabelpapad
Isabel Papadimitriou
9 months
A fun paper we wrote, with a benchmark + simple ways to check that your data is mostly languagey and not mostly boilerplate! Thanks @iseeaswell for leading!
@iseeaswell
Isaac R Caswell
9 months
Announcing BREAD, a new benchmark for noisy text detection, and CRED, the scoring functions we open-source to solve the problem!
2 replies · 5 retweets · 26 likes (quoted tweet)
0 replies · 0 retweets · 9 likes
@isabelpapad
Isabel Papadimitriou
2 years
We hope this can inspire more work on fine-grained fluency evaluation for multilingual models! Maybe the community won’t lose sleep over a small subject-verb bias, but what we really want to show is hegemonic language bias in multilingual training, and how this can go unnoticed.
2 replies · 0 retweets · 8 likes
@isabelpapad
Isabel Papadimitriou
4 years
Our classifiers can show us *how* BERT's subjecthood is graded, and we find that features discussed by linguists are also at play in BERT. Passive voice, animacy, and case all play a role in classification, even when we take out the effect of syntactic information.
[image]
1 reply · 2 retweets · 8 likes
@isabelpapad
Isabel Papadimitriou
1 year
Whenever I read a news article about NLP or about Greece (the two things I know anything about)...🤦‍♀️ They are not center-right! This is a very dark day, and many Greek people are feeling similarly to how you may have felt in 2016.
0 replies · 0 retweets · 8 likes
@isabelpapad
Isabel Papadimitriou
6 months
We need emergency mentors for the NAACL Student Research Workshop! Mentoring is like a chill, co-operative version of reviewing, where you give pre-submission feedback to a student paper. It's a great way to help the research community!
0 replies · 3 retweets · 8 likes
@isabelpapad
Isabel Papadimitriou
1 year
Vocabulary distribution in pretraining also has an effect on top of structural features, showcasing some of the interesting ways that lexicon and structural information intermingle in learnt LM abstractions
[image]
1 reply · 1 retweet · 7 likes
@isabelpapad
Isabel Papadimitriou
2 years
Specifically, we train a binary linear probe for grammatical role: can you predict, based on a contextual embedding, whether “onion” is a subject or object in “The chef chopped the onion”? The plot shows probe predictions for prototypical vs. non-prototypical sentences.
[image]
1 reply · 0 retweets · 6 likes
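For readers who want the mechanics, here is a minimal sketch of that probing setup, not the paper's code: fit a logistic-regression probe on BERT token embeddings to predict subject vs. object. The model name, probe layer, and toy sentences are all illustrative assumptions.

```python
# A sketch, not the paper's implementation: model, layer, and sentences
# are illustrative choices.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_word(sentence: str, word: str, layer: int = 8) -> np.ndarray:
    """Return the layer-`layer` contextual embedding of `word` (assumed
    to tokenize to a single wordpiece) in `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    idx = tokens.index(word)
    with torch.no_grad():
        hidden = model(**enc, output_hidden_states=True).hidden_states[layer]
    return hidden[0, idx].numpy()

# Toy training data: (sentence, target word, 1 = subject / 0 = object).
examples = [
    ("the chef chopped the onion", "chef", 1),
    ("the chef chopped the onion", "onion", 0),
    ("the dog chased the cat", "dog", 1),
    ("the dog chased the cat", "cat", 0),
]
X = np.stack([embed_word(s, w) for s, w, _ in examples])
y = [label for _, _, label in examples]
probe = LogisticRegression(max_iter=1000).fit(X, y)

# A non-prototypical sentence, where only word order signals the role:
print(probe.predict([embed_word("the onion chopped the chef", "onion")]))
```

With four training points this is purely schematic; the real experiments use far more data and compare probe behavior across layers and sentence types.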
@isabelpapad
Isabel Papadimitriou
1 year
We find that crossing, non-context-free dependencies in pretraining better equip a model for downstream English learning than nesting Dyck languages (controlling for all language features like dependency lengths and vocabulary). But any structure at all helps, even simple regular languages.
[image]
1 reply · 1 retweet · 6 likes
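For intuition about the structures being compared, here is a small sketch (not the paper's data generator): nested, Dyck-style pairs are matched stack-wise (LIFO), while crossing dependencies arise if the same open/close pairs are matched queue-wise (FIFO). Vocabulary size and branching probability are illustrative.

```python
# A sketch of nested vs. crossing bracket sequences; not the paper's code.
import random

VOCAB = list("abcde")

def nested(n_pairs: int, rng: random.Random) -> list[str]:
    """Stack discipline: every close matches the most recent open."""
    seq, stack, remaining = [], [], n_pairs
    while remaining or stack:
        if remaining and (not stack or rng.random() < 0.5):
            tok = rng.choice(VOCAB)
            seq.append(f"<{tok}")
            stack.append(tok)
            remaining -= 1
        else:
            seq.append(f"{stack.pop()}>")
    return seq

def crossing(n_pairs: int, rng: random.Random) -> list[str]:
    """Queue discipline: every close matches the *earliest* unclosed open,
    so spans can overlap partially (crossing dependencies)."""
    seq, queue, remaining = [], [], n_pairs
    while remaining or queue:
        if remaining and (not queue or rng.random() < 0.5):
            tok = rng.choice(VOCAB)
            seq.append(f"<{tok}")
            queue.append(tok)
            remaining -= 1
        else:
            seq.append(f"{queue.pop(0)}>")
    return seq

rng = random.Random(0)
print(" ".join(nested(4, rng)))
print(" ".join(crossing(4, rng)))
```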
@isabelpapad
Isabel Papadimitriou
2 years
So, when a word’s role isn’t clear just from its lexical semantics (dashed lines), it actually takes a few layers to classify correctly. Grammatical role in English is almost totally word-order dependent, but that processing isn’t needed for the solid lines (the majority of words).
1 reply · 0 retweets · 5 likes
@isabelpapad
Isabel Papadimitriou
3 years
@iseeaswell and @KreutzerJulia put together this annotation effort and a great collaboration across so many people! Thanks!
2 replies · 0 retweets · 5 likes
@isabelpapad
Isabel Papadimitriou
2 years
Is this just because of general position information? We think not: if you train on locally shuffled sentences, performance on non-prototypical sentences really suffers, dropping to almost chance (even though the subject still comes before the object in these sentences).
[image]
4 replies · 0 retweets · 5 likes
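A minimal sketch of local shuffling, assuming one simple variant (the paper's exact scheme may differ): tokens are permuted only within fixed-size windows, so the subject still precedes the object while exact local word order is destroyed.

```python
# A sketch of window-based local shuffling; window size is illustrative.
import random

def locally_shuffle(tokens: list[str], window: int = 3, seed: int = 0) -> list[str]:
    """Shuffle tokens within consecutive fixed-size windows only."""
    rng = random.Random(seed)
    out = []
    for i in range(0, len(tokens), window):
        chunk = tokens[i:i + window]
        rng.shuffle(chunk)
        out.extend(chunk)
    return out

# Global order survives (subject before object), local order does not:
print(" ".join(locally_shuffle("the chef chopped the onion in the kitchen".split())))
```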
@isabelpapad
Isabel Papadimitriou
2 years
We see the same effect when we pass almost identical sentences through BERT where we’ve swapped the subject and object positions: the predictions for the same words start out identical, but diverge as grammatical word order effects start showing up in the embedding space.
[image]
1 reply · 0 retweets · 4 likes
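A minimal sketch of that comparison, assuming bert-base-uncased and toy sentences: embed the same word in a sentence and its argument-swapped counterpart, and track cosine similarity layer by layer to see where the two representations diverge.

```python
# A sketch, not the paper's code: model and sentences are illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def layer_embeddings(sentence: str, word: str) -> torch.Tensor:
    """Stack the embedding of `word` (assumed a single wordpiece) at
    every layer, returning a (num_layers + 1, hidden_dim) tensor."""
    enc = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    idx = tokens.index(word)
    with torch.no_grad():
        hidden = model(**enc, output_hidden_states=True).hidden_states
    return torch.stack([h[0, idx] for h in hidden])

a = layer_embeddings("the chef chopped the onion", "onion")  # onion = object
b = layer_embeddings("the onion chopped the chef", "onion")  # onion = subject
for layer, (u, v) in enumerate(zip(a, b)):
    sim = torch.cosine_similarity(u, v, dim=0).item()
    print(f"layer {layer:2d}: cosine similarity {sim:.3f}")
```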
@isabelpapad
Isabel Papadimitriou
1 year
@yoavgo I liked this article! Definitely one of the most solid pieces about LLMs I've read that's written by someone from outside NLP
1 reply · 1 retweet · 4 likes
@isabelpapad
Isabel Papadimitriou
4 years
Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT
1 reply · 1 retweet · 4 likes
@isabelpapad
Isabel Papadimitriou
4 years
Our results shed light on the way multilingual models treat higher-order grammatical features at the representation level, and they show how we can use cross-linguistic variation to understand deep neural models of language.
0 replies · 1 retweet · 4 likes
@isabelpapad
Isabel Papadimitriou
2 years
Things get wacky! Hands raising girls, reviews writing writers, tables finding servers (see Appendix for more). Can BERT still tell what subjects and objects are for these sentences? Yes, but it does take a few layers to get there.
1 reply · 0 retweets · 4 likes
@isabelpapad
Isabel Papadimitriou
11 months
Isaac, as always, pioneering imaginative ways of helping multilingual NLP. Use FUN-LangID as a quick sanity check to make sure your data is actually in the languages it claims to be in; it's so easy there's no excuse!
@iseeaswell
Isaac R Caswell
11 months
Have you ever wanted a LangID model that works on 1500+ languages? Check out FUN-LangID: !
1 reply · 8 retweets · 50 likes (quoted tweet)
0 replies · 0 retweets · 3 likes
@isabelpapad
Isabel Papadimitriou
3 years
The Greek state is killing a hunger striker. Brutally watching him die, while making Thatcheresque announcements about being blackmailed. We are unfortunately very close to tragedy. He is protesting a retributive illegal prison transfer.
1 reply · 0 retweets · 3 likes
@isabelpapad
Isabel Papadimitriou
4 years
The morphosyntactic alignment of a language influences the way that contextual embeddings in that language are organized -- and this high-order information is robustly transferred cross-lingually.
1 reply · 0 retweets · 3 likes
@isabelpapad
Isabel Papadimitriou
4 years
Classifiers trained on the mBERT representations of Nominative languages classify intransitive subjects in all other languages as a subject, while classifiers trained on Ergative languages are significantly less likely to do so.
[image]
1 reply · 0 retweets · 3 likes
@isabelpapad
Isabel Papadimitriou
3 years
Some articles in English: (super small platform to a handful of NLP researchers, but wanted to broadcast outside of Greece)
1 reply · 0 retweets · 3 likes
@isabelpapad
Isabel Papadimitriou
4 years
We train classifiers to distinguish transitive subjects from transitive objects in the mBERT representation space for 24 different languages. How will they classify intransitive subjects, which they've never seen? Turns out it depends on the alignment of the training language!
1 reply · 1 retweet · 2 likes
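A schematic of that transfer setup, with random arrays standing in for mBERT embeddings (the real experiments extract embeddings of transitive subjects, transitive objects, and intransitive subjects from treebanks in 24 languages): fit a subject-vs-object probe on one language's embeddings, then measure how often it labels another language's intransitive subjects as subjects.

```python
# A schematic only: the arrays below are random stand-ins for mBERT
# embeddings of A (transitive subject), O (object), and S (intransitive
# subject) tokens; real inputs would come from per-language treebanks.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim = 768  # mBERT hidden size

# Stand-ins for embeddings extracted from one training language:
A_train = rng.normal(0.5, 1.0, (200, dim))   # transitive subjects -> label 1
O_train = rng.normal(-0.5, 1.0, (200, dim))  # transitive objects  -> label 0
probe = LogisticRegression(max_iter=1000).fit(
    np.vstack([A_train, O_train]), [1] * 200 + [0] * 200
)

# Stand-ins for intransitive subjects (S) from a *different* language:
S_test = rng.normal(0.2, 1.0, (100, dim))
# Fraction of S tokens the probe calls "subject" -- per the thread, higher
# for probes trained on nominative languages than on ergative ones.
print((probe.predict(S_test) == 1).mean())
```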
@isabelpapad
Isabel Papadimitriou
4 years
What factors influence our classifiers' decisions? We might think of subjecthood as a purely syntactic concept, but the linguistics literature shows us that across languages, discourse and semantic features are important in determining subjecthood.
1 reply · 0 retweets · 2 likes
@isabelpapad
Isabel Papadimitriou
9 months
@iwsfutcmd Dude you're so right. I grew up in Greece and California, so I heard a fair amount of Spanish ambiently, but being in Madrid just totally shorts my brain. I don't think it's just the phonemic inventories; it must be prosodic things as well
0 replies · 0 retweets · 1 like
@isabelpapad
Isabel Papadimitriou
2 years
@super_cassette Any album! I also like Oscar Peterson Christmas
0 replies · 0 retweets · 2 likes
@isabelpapad
Isabel Papadimitriou
5 months
0 replies · 0 retweets · 2 likes
@isabelpapad
Isabel Papadimitriou
10 months
Amazing work by @jwthickstun and @megha_byte, and a great read as a blog post! Creative research and a great resource that expands how we think about the role of LMs
@jwthickstun
John Thickstun
10 months
@megha_byte and I wrote a blog post on the release of >1,000 dynamic interaction traces between humans and LMs! These traces were collected for HALIE, a Human-AI Language Interactive Evaluation framework recently published in TMLR. Blog: 🧵👇
2 replies · 25 retweets · 81 likes (quoted tweet)
0 replies · 0 retweets · 1 like
@isabelpapad
Isabel Papadimitriou
3 years
The police in Greece abduct a protestor and continuously beat him for 8 hours while keeping him hooded and handcuffed. This is of course not (remotely) the first time. Psychological torture, depriving him of water. The police (and the government behind it) are totally out of control.
@GiraZapatistaBE
La Gira Zapatista RAZB
3 years
Athens: Abduction and torture in GADA. Athens, Greece. A personal account by a 21-year-old Greek about his torture by police officers after a demonstration against police brutality. He was abducted and tortured in GADA. GADA is the name of Police Head...
0 replies · 0 retweets · 0 likes (quoted tweet)
0 replies · 0 retweets · 1 like
@isabelpapad
Isabel Papadimitriou
3 years
Dimitris Koufontinas is on his 48th day of hunger strike, and has been refusing water for the past two days.
0 replies · 0 retweets · 1 like
@isabelpapad
Isabel Papadimitriou
5 months
@kunalhanda_ Thanks! Fixed! :)
0 replies · 0 retweets · 1 like
@isabelpapad
Isabel Papadimitriou
5 months
@Sylvia_Sparkle Thanks! Fixed! :)
0 replies · 0 retweets · 1 like
@isabelpapad
Isabel Papadimitriou
4 years
We examine the property of morphosyntactic alignment (or, ergativity): what is counted as a "subject" in different languages. Nominative languages (like English) treat intransitive subjects like subjects, while ergative languages (like Basque) treat them like objects.
1 reply · 0 retweets · 1 like