Check out our new paper with
@jurafsky
! What inductive learning biases influence language learning, and how? We pretrain transformer models on different structures and fine-tune on English to test the learning effects of different inductive biases
🧅 BERT does well without word order, but most sentences are easy! You know that the unordered words chopped+chef+onions describe a chef chopping onions and not the other way around. What if we test on cases where word order matters?
@rljfutrell
@kmahowald
We tend to evaluate multilingual models with downstream tasks. But what if there are more subtle ways that fluency is affected, like having an English “accent” when learning a second language? Check out our new paper with Kezia Lopez and
@jurafsky
!
Really really excited to be joining
@UBCLinguistics
! I'm so happy to get to work with the lovely people in the department
I'll be going to
@KempnerInst
in the interim, again lucky to work with lovely, interdisciplinary people
I'd love to hang if you're in Boston or Vancouver!
"Learning Music Helps You Read", our
#emnlp2020
paper with
@jurafsky
!
We use transfer learning between language, music, code and parentheses.
Some exciting results about encoding and transfer of abstract structure, and the role (or not!) of recursion
Abstract morphosyntax? In Multilingual BERT?! Yes! Check out our
#EACL2021
paper, where we look at how Multilingual BERT representations are influenced by high-order properties of whole languages (rather than just features of inputs). With
@rljfutrell
@ethanachi
@kmahowald
Isaac does some of the most impactful NLP work that I know! This is not just 'Google LM magic', it's the result of extremely hard-nosed and outside-the-box data work and linguistic work, as well as working with speakers. And the whole combo that makes Isaac Isaac!
Excited to announce that 110 languages got added to Google Translate today! Time for context on these languages, especially the communities who helped a lot over the past few years, including Cantonese, NKo, and Faroese volunteers. Also, a 110-language YouTube playlist. 🧵
I'm at NAACL in person (come chat!) and very excited to be giving a keynote at the
@sig_typ
workshop! I'll talk about “What we can learn about language from exploring multilingual language models.” Bold title, will she hedge in the talk? Come see for yourself this Thursday at 3:30!
Released multilingual datasets often contain corpora that are not in the language they claim to be. This paper is a big annotation effort to document this in detail! My takeaway: take a look at sample sentences when using a corpus, and be cautious around multilingual language models
Lovely to work on this with
@ZhengxuanZenWu
and
@AlexTamkin
!
Check out our controlled studies if you've ever wondered about the effects of tokenization, embedding reinitialization, and structural shift in cross-lingual transfer 🍵
What makes it hard for NLP models to learn a new language?
New work "spilling the tea" on crosslingual transfer with
@ZhengxuanZenWu
and
@isabelpapad
!
Thread👇
1/
Looking to crispen up your research narrative, present an in-progress idea, or get supportive feedback on your paper? And present at NAACL?
Submit to the Student Research Workshop! We have tracks for both papers and draft thesis proposals. See you there!
📢 Exciting News! 📢 The Call For Papers for NAACL SRW 2024 is officially open! 🚀
🗓️ Mentoring Program Deadline: Jan 19, 2024
📝 Paper Submission Deadline: Mar 1, 2024
Don't miss this opportunity to showcase your research and contribute to the advancement of NLP! 💡
The NAACL Student Research Workshop is short a few reviewers! Please consider volunteering to review. This is a great way to give back to the community and help students publish at their first *CL conference! We really appreciate it
In our
#NLProc
#acl2022nlp
paper, “When classifying grammatical role, BERT doesn't care about word order... except when it matters”, we compare BERT on an argument role probing task for easy sentences vs. hard (non-prototypical) sentences, where you really do need word order.
Good morning everyone in Mexico for NAACL! Join us at 9:30 for the Student Research Workshop panel!
PhD students and undergrads: come with questions about research, grad school, job market, or anything you might want to hear from our lovely panelists!
Kicking off tomorrow (Tue)
#NAACL2024
with our amazing panel at 9:30-10:30. We’ll chat with
@jieyuzhao11
@navitagoyal_
@shi_weiyan
@krisgligoric
about challenges & advice for student researchers 🎓.
Also, taking a group photo at the end of the poster session.
Come & join us 😎
Language models give us a cool testing ground to examine issues around the learnability of language, and examine/expand the hypothesis space around structural underpinnings of language learning.
We look at Spanish pronoun drop and Greek subject-verb order, features that can take one form that is English-parallel, and one that is not. We show that multilingual BERT is biased towards English-parallel forms compared to the monolingual models BETO and GreekBERT.
A fun paper we wrote, with a benchmark + simple ways to check that your data is mostly languagey and not mostly boilerplate! Thanks
@iseeaswell
for leading!
We hope this can inspire more work on fine-grained fluency evaluation for multilingual models! Maybe the community won’t lose sleep over a small subject-verb bias, but what we really want to show is hegemonic language bias in multilingual training, and how this can go unnoticed.
Our classifiers can show us *how* BERT's subjecthood is graded, and we find that features discussed by linguists are also at play in BERT. Passive voice, animacy, and case all play a role in classification, even when we take out the effect of syntactic information.
Whenever I read a news article about NLP or about Greece (the two things I know anything about)...🤦♀️ They are not center-right! This is a very dark day, and many Greek people are feeling similarly to how you may have felt in 2016.
We need emergency mentors for the NAACL Student Research Workshop!
Mentoring is like a chill, co-operative version of reviewing, where you give pre-submission feedback to a student paper.
It's a great way to help the research community!
Vocabulary distribution in pretraining also has an effect on top of structural features, showcasing some of the interesting ways that lexicon and structural information intermingle in learnt LM abstractions
Specifically, we train a binary linear probe for grammatical role: can you predict, based on a contextual embedding, whether onion is a subject or object in “The chef chopped the onion”? The plot shows probe predictions for prototypical vs. non-prototypical sentences.
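For intuition, here's a minimal sketch of what this kind of probe can look like (illustrative only, not our exact setup; assumes HuggingFace transformers and scikit-learn, with toy example sentences):

```python
# Minimal sketch of a binary linear probe for grammatical role over BERT embeddings.
# Illustrative only: model choice, layer, and training data are placeholder assumptions.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_embedding(sentence, word, layer=8):
    """Contextual embedding of `word`'s first subtoken at a given BERT layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc, output_hidden_states=True).hidden_states[layer][0]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(tokenizer.tokenize(word)[0])].numpy()

# Toy training data: (sentence, target word, label) with 1 = subject, 0 = object
train = [
    ("The chef chopped the onion.", "chef", 1),
    ("The chef chopped the onion.", "onion", 0),
    ("The dog chased the cat.", "dog", 1),
    ("The dog chased the cat.", "cat", 0),
]
X = [word_embedding(s, w) for s, w, _ in train]
y = [label for _, _, label in train]

probe = LogisticRegression(max_iter=1000).fit(X, y)  # the binary linear probe

# Non-prototypical test case: now the onion is the subject
print(probe.predict([word_embedding("The onion chopped the chef.", "onion")]))
```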
We find that crossing, non-context-free dependencies in pretraining better equip a model for downstream English learning than nesting Dyck languages (controlling for language features like dependency length and vocabulary). But any structure at all helps, even a simple regular language.
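If you're wondering what those structures look like, here's a toy sketch of nesting vs. crossing dependencies (just the matching schemes, not the actual corpus generators, which interleave openings and closings):

```python
# Toy illustration: nested dependencies close the most recently opened token first
# (context-free, Dyck-like); crossing dependencies close tokens in the order they
# were opened (non-context-free). Vocab size and lengths are arbitrary choices here.
import random

def toy_nested(n_pairs, vocab=5, seed=0):
    rng = random.Random(seed)
    opens = [rng.randrange(vocab) for _ in range(n_pairs)]
    return " ".join([f"<{t}" for t in opens] + [f"{t}>" for t in reversed(opens)])

def toy_crossing(n_pairs, vocab=5, seed=0):
    rng = random.Random(seed)
    opens = [rng.randrange(vocab) for _ in range(n_pairs)]
    return " ".join([f"<{t}" for t in opens] + [f"{t}>" for t in opens])

print(toy_nested(4))    # closers come back in reverse order: nesting
print(toy_crossing(4))  # closers come back in the same order: crossing
```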
So, when a word’s role isn’t clear just from its lexical semantics (dashed lines), it actually takes a few layers to classify correctly. Grammatical role in English is almost totally word-order dependent, but that processing isn’t needed for the solid lines (the majority of words).
Is this just because of general position information? We think not: if you train on locally shuffled sentences, performance on non-prototypical sentences drops to almost chance (even though the subject still comes before the object in these sentences).
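Roughly, local shuffling looks like this (a sketch; the window size here is an arbitrary illustrative choice):

```python
# Shuffle words only inside fixed-size windows: local order is destroyed,
# but global order (e.g. subject still before object) is largely preserved.
import random

def locally_shuffle(words, window=3, seed=0):
    rng = random.Random(seed)
    out = []
    for i in range(0, len(words), window):
        chunk = words[i:i + window]
        rng.shuffle(chunk)
        out.extend(chunk)
    return out

print(locally_shuffle("the chef quickly chopped the fresh onion".split()))
```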
We see the same effect when we pass almost identical sentences through BERT where we’ve swapped the subject and object positions: the predictions for the same words start out identical, but diverge as grammatical word order effects start showing up in the embedding space.
Our results shed light on the way multilingual models treat higher-order grammatical features at the representation level, and they show how we can use cross-linguistic variation to understand deep neural models of language.
Things get wacky! Hands raising girls, reviews writing writers, tables finding servers (see Appendix for more). Can BERT still tell what subjects and objects are for these sentences? Yes, but it does take a few layers to get there.
Isaac as always pioneering in imaginative ways of helping multilingual NLP. Use FUN-LangID as a quick sanity check to make sure your data is actually in the languages it claims to be in, it's so easy there's no excuse!
The Greek state is killing a hunger striker. Brutally watching him die, while making Thatcheresque announcements about being blackmailed. We are unfortunately very close to tragedy. He is protesting a retributive illegal prison transfer.
The morphosyntactic alignment of a language influences the way that contextual embeddings in that language are organized -- and this high-order information is robustly transferred cross-lingually.
Classifiers trained on the mBERT representations of Nominative languages classify intransitive subjects in all other languages as subjects, while classifiers trained on Ergative languages are significantly less likely to do so.
We train classifiers to distinguish transitive subjects from transitive objects in the mBERT representation space for 24 different languages. How will they classify intransitive subjects, which they've never seen? Turns out it depends on the alignment of the training language!
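A minimal sketch of that setup (placeholder arrays instead of real mBERT embeddings; the variable names are hypothetical):

```python
# Train a probe on transitive subjects (A) vs. objects (O) from one language,
# then see how it labels intransitive subjects (S) it has never seen.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder stand-ins for precomputed mBERT embeddings of one training language
A_train = np.random.randn(200, 768)  # transitive subjects
O_train = np.random.randn(200, 768)  # transitive objects
S_test = np.random.randn(200, 768)   # intransitive subjects, entirely held out

X = np.vstack([A_train, O_train])
y = np.array([1] * len(A_train) + [0] * len(O_train))  # 1 = subject, 0 = object

probe = LogisticRegression(max_iter=1000).fit(X, y)

# The quantity of interest: how often unseen intransitive subjects get called "subject".
# With real embeddings, this rate is higher for probes trained on nominative languages
# than for probes trained on ergative languages.
print("P(S classified as subject):", probe.predict(S_test).mean())
```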
What factors influence our classifiers' decisions? We might think of subjecthood as a purely syntactic concept, but the linguistics literature shows us that across languages, discourse and semantic features are important in determining subjecthood.
@iwsfutcmd
Dude you're so right. I grew up in Greece and California, so hearing a fair amount of Spanish ambiently, but being in Madrid just totally shorts my brain. I don't think it's just the phonemic inventories, it must be prosodic things as well
Amazing work by
@jwthickstun
and
@megha_byte
, and a great read as a blog post! Creative research and a great resource that expands how we think about the role of LMs
@megha_byte
and I wrote a blog post on the release of >1,000 dynamic interaction traces between humans and LMs! These traces were collected for HALIE, a Human-AI Language Interactive Evaluation framework recently published in TMLR.
Blog:
🧵👇
The police in Greece abduct a protestor and beat him continuously for 8 hours while keeping him hooded and handcuffed. This is of course not (remotely) the first time. Psychological torture, depriving him of water. The police (and the government behind it) are totally out of control.
Athens: Abduction and torture in GADA. A personal account by a 21-year-old Greek about his torture by police officers after a demonstration against police brutality. He was abducted and tortured in GADA, the Athens Police Headquarters.
We examine the property of morphosyntactic alignment (or, ergativity): what is counted as a "subject" in different languages. Nominative languages (like English) treat intransitive subjects like subjects, while ergative languages (like Basque) treat them like objects.