Since MIT ended negotiations with Elsevier in June 2020, I've declined all new Elsevier reviewer invites. Today I got an editor's thank-you for info I provided on MIT's Framework for Publisher Contracts. A heartwarming reminder: we're all in this together!
This paper is one culmination of three-plus years' work investigating the syntactic capabilities of today's autoregressive language models, using the kind of carefully controlled experiments we would run with human subjects in psycholinguistics. The results blow me away. 1/5
What syntactic generalizations can be learned from predicting the next word by domain-general learning algorithms? A paper summarizing the case of filler–gap dependencies (+ theoretical implications!) with
@roger_p_levy
and
@rljfutrell
Join us online May 13–14 for a star-studded
#NSF
-sponsored workshop: New Horizons in Language Science: Large Language Models, Language Structure, and the Cognitive & Neural Basis of Language! Interdisciplinary talks & discussion on three themes: 1/
Psycholinguistics Twitter: what do you see as the major open puzzles in human language production? Are there recent papers (esp. reviews) articulating these puzzles that you would recommend?
The MIT Computational Psycholinguistics Lab seeks to fill an open postdoc position for an
@MITIBMLab
supported multi-PI project in low-resource language learning, bridging NLP, machine learning, linguistics, and cognitive science. Please spread the word!
I'm proud of our group's presentations at
#CogSci2024
– come check them out!
1: Today 10:30am–noon in J.F. Stall: "Finding structure in logographic writing with library learning" led by
@jiang_gy
. This work won the Sayan Gul award for best undergraduate student paper!
Today at
#CUNY2021
@linguistbrian
and
@fernandaedi
announced the soft opening of Glossa Psycholinguistics. GP is an open access journal that considers brief reports, longer reports, registered reports, and theoretical reviews. Please retweet and follow us here on Twitter!
As MIT faculty and an advisor of graduate students: thank you
#MITGSU
for your important, hard work on this unionization campaign! I applaud your organizing efforts to support graduate students' dignity, autonomy, living, learning, & working conditions, and well-being.
We are proud to publicly announce the MIT Graduate Student Union! MIT grad student-workers are unionizing to create a healthy and fair working and learning environment for all, by giving us a voice in the decisions that affect us. (1/4)
#MITGSU
Video recordings from the
#NSF
-sponsored workshop New Horizons in Language Science: Large Language Models, Language Structure, and the Cognitive & Neural Basis of Language are now publicly available! All videos are linked to from the workshop website: 1/
Applications are open (due Feb 15) for MIT Brain & Cognitive Sciences' Post-Baccalaureate Research Scholars Program! Two-year, fully funded, intended for individuals from under-represented groups and economically disadvantaged backgrounds. Please retweet!
I urge interested readers to consult the
@weGotlieb
et al. in-press paper: we find that the "GRNN" LSTM of
@xsway_
et al. 2018 trained on a childhood's worth of English shows substantial success on filler–gap dependencies and the island constraints on them. 1/3
It seems like in the recent discussion of the poverty of the stimulus & large language models two versions of PoS are being conflated. Version 1 says human children do not have enough data to learn abstract grammatical representations from *their* input. LLMs have nothing to say
New this year at
#cogsci2020
: authors keep copyright on all conference proceedings papers, and they're licensed Open Access under
@creativecommons
CC BY!
@weGotlieb
's results really move the needle on classic cogsci learnability debates. There's a huge amount of syntactic information in raw linguistic input (just strings!). And generic autoregressive models pick it up! An exciting time for language and computation. 5/5
Our field has lost a giant. Jeff shaped the intellectual landscape I found myself in as an emerging PhD computational psycholinguist, and the university landscape that I found myself in as a new assistant professor twelve years ago.
My talk, "Grammatical generalization and language processing in humans and machines", at the Collège de France colloquium will start shortly: 17.20 CEST. Slides available at . Anyone can join the Zoom seminar at !
Talk titles and abstracts are up now for the May 13–14
#NSF
-hosted workshop New Horizons in Language Science: Large Language Models, Language Structure, and the Cognitive and Neural Basis of Language! Learn more and register for the Zoom webinar at 1/
Yi Ting Huang and I have an opening for a postdoctoral researcher on an NSF-funded project, "Syntactic processing across socioeconomic status: Linking input to comprehension". Apply by Nov 15; start Jul 1, 2024 with flexibility. Please disseminate widely!
One of the most eye-opening studies I've ever been involved in, now out in
#PsychologicalScience
. During the 2016 US presidential campaign, US citizens strongly dispreferred "she" pronouns for the next president, despite expecting the female candidate, Clinton, to win. 1/8
How do English speakers use gendered pronouns when talking about future heads of government whose gender isn't known yet? Find out in this
@MIT
News story about research we conducted during the 2016 US presidential race and the 2017 UK general elections:
@srush_nlp
What a great question! Time yourself reading a document in the language. Reading time is linear in word log probability () so you can back out perplexity.
@whylikethis_
will have slope & intercept as a function of non-native proficiency for you soon 😊
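The back-of-envelope recipe above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the intercept and slope below are invented placeholder values, and real coefficients would come from regressing reading times on per-word surprisal for the relevant reader population and language.

```python
def perplexity_from_reading_times(rts_ms, intercept_ms=200.0, slope_ms_per_bit=25.0):
    """Invert an assumed linear reading-time ~ surprisal relation.

    rts_ms: per-word reading times in milliseconds.
    intercept_ms / slope_ms_per_bit: hypothetical regression coefficients
    (placeholders; fit them on real data before trusting the answer).
    """
    # Back out each word's surprisal (in bits) from its reading time.
    surprisals = [(rt - intercept_ms) / slope_ms_per_bit for rt in rts_ms]
    # Perplexity is 2 raised to the mean per-word surprisal.
    return 2 ** (sum(surprisals) / len(surprisals))

# With these made-up coefficients, a reader averaging 300 ms/word
# implies 4 bits/word, i.e. a perplexity of 16.
print(perplexity_from_reading_times([300, 275, 325]))  # → 16.0
```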
13 years ago, I presented my first paper ever at
#CogSci
. This year I had the honor of serving as a Program Chair.
Cognitive science is changing thanks to LLMs. Where will CogSci be in 13 more years? Maybe an undergrad in the audience last week will run it. Or maybe ChairGPT.
Delighted to see "The Statistical Significance Filter",
@shravanvasishth
Mertzen, Jäger &
@StatModeling
2018, a Most Downloaded Paper at JML. It's a great paper all psycholinguists should read. 1/3
A best practice I try to follow: when declining a review invitation, suggest as alternate potential reviewers at least two highly qualified colleagues who the editor may not know. It raises awareness of deserving colleagues' expertise, and makes the editor's life easier to boot.
I'm hugely proud to have been a member of the inaugural editorial team for
@glossapsycholx
! It's a top quality operation for original research, with the best
#OpenAccess
terms one can wish for: no author need pay to publish there. Send the journal your best work!
🚨🚨Glossa Psycholinguistics: Our first articles are published! 🚨🚨 We are so excited to publish three excellent original research articles and a statement by
@fernandaedi
and
@linguistbrian
on the goals of GP. Let's highlight the scholarship in our inaugural launch: [1/n]
Congratulations Jenn!!! Prospective students and postdocs interested in language in minds *or* machines: if you get the opportunity to join Jenn's group, take it! She is a rising star and superb to work with.
Excited to share that I will join Johns Hopkins as an Assistant Professor of Cognitive Science in July 2025! I am starting the ✨Group for Language and Intelligence (GLINT)✨, which will study how language works in minds and machines:
Details below 👇 1/4
Postdoc job alert: come work with
@NogaZaslavsky
, Nidhi Seethapathi (
@nidhi_s91
), and me on an integrative computational account of language and locomotion! Apply by March 31 for fullest consideration. Please share widely!
We're looking for a brilliant postdoc to work with
@roger_p_levy
@nidhi_s91
and me on an exciting new project at the intersection of computational cognition, language, and motor control! Please share with anyone who might be interested. More info here:
@lintool
I’ve been thinking lately: what if we replaced author–date format with title–date, e.g. [Lin et al. 2011] -> [Smoothing Techniques 2011]? Or for brevity [SmooTechn 2011]. Title words are probably more useful memory retrieval cues anyway. Bonus: encouraging more creative titles!
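For fun, a toy sketch of how such title–date keys might be generated mechanically. The stopword list, word count, and truncation length are arbitrary illustrative choices, not a worked-out proposal.

```python
STOPWORDS = {"a", "an", "the", "of", "on", "in", "for", "and", "to", "with"}

def title_date_key(title, year, n_words=2, chars_per_word=5):
    """Build a compact [TitleWords Year]-style citation key."""
    content = [w for w in title.split() if w.lower() not in STOPWORDS]
    stem = "".join(w[:chars_per_word].capitalize() for w in content[:n_words])
    return f"[{stem} {year}]"

print(title_date_key("Smoothing Techniques for Language Modeling", 2011))
# → [SmootTechn 2011]
```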
This is a terrific paper!
Among other contributions,
@MKeshev
&
@aya_meltzer
’s work highlights how cross-linguistic breadth strengthens psycholinguistics. Their tests of noisy-channel language processing theory use Hebrew grammatical properties that aren’t present in eg English.
Out now in Cognitive Psychology:
Noisy is better than rare: Comprehenders compromise subject-verb agreement to form more probable linguistic structures
with
@aya_meltzer
Featuring "the best explanation of a filler-gap [reviewer2] have ever seen!"
1/n
Now out: call for proposals for co-chairs of the 2025 Meeting of the Cognitive Science Society
@cogsci_soc
! Co-chairs shape conference theme & choose invited speakers (but aren't responsible for location-related logistics). Spread the word!
LLMs seem terrible at creating phonetically ambiguous sentences with differing word segmentations, à la the classic "It's not easy to wreck a nice beach" versus "It's not easy to recognize speech". A 99¢ bounty for a prompt that does the trick! (Plus shout-out in my class)
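The hard part of the bounty is exactly the re-segmentation step. As a rough sketch of what a checker could do, here is a segmentation enumerator over plain orthographic strings; a serious version would operate on phoneme sequences from a pronunciation dictionary (e.g. CMUdict), and the tiny lexicon here is invented purely for illustration.

```python
from functools import lru_cache

# Invented toy lexicon; a phonetic version would hold pronunciations instead.
LEXICON = {"god", "is", "now", "here", "nowhere"}

def segmentations(s):
    """Return every way to split s into words from LEXICON."""
    @lru_cache(maxsize=None)
    def seg(i):
        if i == len(s):
            return [[]]  # one way to segment the empty suffix
        results = []
        for j in range(i + 1, len(s) + 1):
            if s[i:j] in LEXICON:
                results.extend([s[i:j]] + rest for rest in seg(j))
        return results
    return [" ".join(words) for words in seg(0)]

# A string with two readings, à la "wreck a nice beach":
print(segmentations("godisnowhere"))  # → ['god is now here', 'god is nowhere']
```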
Let me tell you why I'm so excited about this new paper by the amazing
@veroboyce
in
@glossapsycholx
, "A-maze of natural stories: comprehension and surprisal in the Maze task". 1/9
I'm thrilled to have this paper with
@roger_p_levy
out in
@glossapsycholx
! I hope our methods work on Maze gives more researchers an easy option for collecting incremental reading time data on a range of materials!
Slides for my talk at UCSD's Center for Research in Language, "The learnability of syntactic generalizations from linguistic input: insights from deep learning", can be found at
I just discovered that the 1963 classic Handbook of Mathematical Psychology is available for PDF download on the
@internetarchive
. Several of these chapters are seminal for cognitive scientists of language. What a wonderful resource!
@juanbuis
@Christophepas
I’m a psycholinguist who has studied reading for 15 years, and I respectfully call bullshit (
@callin_bull
). There is no evidence or reason to believe that Bionic Reading’s text tweaks are a good idea that will help you read better. 1/6
We are hiring a postdoctoral associate in Computational Social Science for Scholarly Communications and Open & Equitable Scholarship! Please spread the word.
Yevgeni is an amazing researcher and collaborator, and one of the best mentors I've ever seen in action. And he has a unique, distinctive research program at the intersection of AI/psychology/linguistics that is going great places. If you get the chance to work with him, take it!
Excited to share that in fall 2021 I will start a faculty position at the Technion, where I will continue working on the intersection of NLP and cognitive science. I will be taking on graduate students, RAs and possibly a postdoc. Feel free to get in touch if interested!
Thrilled at publication of
@StephanMeylan
's "How adults understand what young children say", featuring Bayesian noisy-channel inference, LLMs, & child speech datasets!
TL;DR: prior expectations of what kids *want to say* are crucial. (Knowing how kids mispronounce words is too.)
How do adults understand children’s early, highly variable speech? Our new paper in
@NatureHumBehav
() provides evidence that adults’ interpretations depend quite strongly on language expectations—what they think children are likely to say. 1/
Home field advantage!
A measure of linguists
A collective of cognitive scientists
A count of computer scientists
A unit of psychologists
A mass of machine learning
L2 learner English proficiency can be determined from eye movements while reading! Berzak, Katz, & Levy 2018 NAACL preprint now available on arXiv:
@yevgeni_berzak
Prompting is *not a substitute* for probability measurements in large language models – the amazing
@_jennhu
's EMNLP paper now available camera-ready!
TL;DR: direct probability measurements show LLMs' linguistic generalizations are better than suggested by prompt-based tests.
To researchers doing LLM evaluation: prompting is *not a substitute* for direct probability measurements.
Check out the camera-ready version of our work, to appear at EMNLP 2023! (w/
@roger_p_levy
)
Paper:
Original thread:
🧵👇
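The minimal-pair logic is simple to state in code. The sketch below uses a toy bigram model with invented counts purely to make it runnable; in the actual method you would replace `logprob` with the sum of an LLM's token log-probabilities for the sentence.

```python
import math

# Invented toy counts standing in for an LLM's probability estimates.
BIGRAMS = {
    ("<s>", "the"): 10, ("the", "dogs"): 4, ("dogs", "bark"): 5,
    ("dogs", "barks"): 1, ("bark", "</s>"): 6, ("barks", "</s>"): 6,
}
UNIGRAMS = {"<s>": 10, "the": 10, "dogs": 6, "bark": 6, "barks": 6}

def logprob(sentence):
    """Total log-probability of a sentence under the toy bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Tiny floor probability for unseen bigrams, to avoid log(0).
        total += math.log(BIGRAMS.get((prev, word), 0.01) / UNIGRAMS[prev])
    return total

# The model "knows" agreement if the grammatical member scores higher.
print(logprob("the dogs bark") > logprob("the dogs barks"))  # → True
```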
"Iconicity and Structure in the Emergence of Combinatoriality" -- with Matthias Hofer, new
@cogsci_soc
conference-proceedings preprint out! As always, thoughts, comments, feedback much appreciated.
“Knowledge should not be accessible only to those who can pay,” said Robert May, chair of UC’s faculty Academic Senate.
A simple and incontrovertible truth. May universities around the world follow UC’s shining example.
New Horizons in Language Science: Large Language Models, Language Structure, and the Cognitive & Neural Basis of Language has started! Theme 1's speakers will present starting NOW. Ben Bergen, Leila Wehbe, Ariel Goldstein,
@davidbau
! Tune in on Zoom via
TIL one-month-old babies transfer tactile experience from 90 seconds of sucking on a pacifier into visual expectations about the shape of that pacifier. Meltzoff & Borton, 1979. Incredible.
Psycholinguists: what's the best openly available pedagogical illustration using the visual world paradigm showing that interpretations are very quickly modulated by above-word-level context? In my teaching, I use older, non-open materials 😕 and they could use refreshing.
#linguistics
Twitter: what are the best available quantitative measures of dialect/language mutual intelligibility? The more fine-grained, the better: I'm hoping to vividly illustrate at least one specific dialect continuum (e.g., the Romance languages of the Mediterranean coast)
Paper now accepted to
#CogSci2018
and preprint available at -- Communicative Efficiency, Uniform Information Density, and the Rational Speech Act theory. Will be grateful for comments, especially before the May 14 camera-ready final submission deadline!
Linguists & psychologists – is anyone aware of published estimates of the distribution of words typed per day, within or across individuals? Summary statistics like mean or median are fine; more detail, even better!
@siminevazire
@jwpennebaker
@mcxfrank
@dkroy
@RelationScience
@shravanvasishth
Psycholinguists, and cognitive scientists more broadly, please do send your best work to Open Mind! We are rigorous and fast, pure gold Open Access (hence compatible with European funders' new Plan S). Check out our editorial team at
Rolled out of bed 5am & gave a talk for UNIGE Linguistics hosted by
@PaolaMerlo20
🙏. Attendance & great discussion by some of my favorite colleagues all over Europe! So many benefits to remote presentation. I don't want to go back to air travel! Slides at
I am so sad to learn of Akira's untimely passing. Akira was inspired, original, and enterprising, with outstanding future potential. Even more important: he was unfailingly positive, generous, and kind in all our interactions and by all other accounts. I will miss him.
Very sad: the *terrific* young psycholinguist Akira Omaki has died of lymphoma. I worked on a chapter with him when he was a PhD student. His work on parsing and acquisition was thoughtful and clever, like this recent Cognition piece
Starting in 5 minutes I'm chairing the
#emnlp2020
Q&A session on Linguistic Theories, Cognitive Modeling, and Psycholinguistics. There are five great papers in this session -- please join!
2020
@CUNYUMass
was a tough act to follow, but the
@CUNY2021
organizers (John Trueswell, Delphine Dahan, Anna Papafragou,
@garicgymro
,
@KathrynSchuler
, Florian Schwarz, Charles Yang) and many student volunteers did it – hats off for a fantastic conference! Some thoughts: 1/
Excellent advice in
@tallinzen
's blog post regarding finding a postdoc. Start contacting PIs you might want to work with *at least* 12 months before your desired start date -- applying for funding requires a long lead time.
I was just talking to someone about a post-doc fellowship for September 2019 that has a deadline in two months, so I thought I'd repost my advice to start thinking about your post-doc options at least a year before graduating
Michael Eisen is apparently being ousted from his role as
@eLife
Editor-in-Chief due to his tweet referencing a satirical article conveying the tragedy of the Israeli–Palestinian conflict. Concerned about this form of censure? Sign this open letter:
Scientists in life/neuro sciences: defend academic freedom. Sign our petition to eLife/HHMI saying Michael Eisen should not be censured for expressing his political opinions. Can be anonymous.
Wondering how to combine the strengths of LLMs and logic-based symbolic methods for natural language reasoning tasks? See our forthcoming EMNLP paper, LINC: Logical Inference via Neurosymbolic Computation!
TL;DR: have the LLM translate to logical form and run a theorem prover!
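For a feel of the symbolic half of such a pipeline: the paper's system has the LLM emit logical forms and passes them to an off-the-shelf theorem prover. The toy sketch below stands in for both parts with hard-coded propositional "translations" and a brute-force truth-table entailment check; nothing here is the paper's actual implementation.

```python
import itertools

def entails(premises, conclusion, atoms):
    """Check propositional entailment by enumerating all truth assignments.

    Formulas are Python predicates over a dict of atom -> bool, standing in
    for logical forms an LLM would generate for a real prover.
    """
    for values in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # found a counterexample world
    return True

# "If it rains, the street is wet. It rains." |= "The street is wet."
premises = [lambda w: (not w["rain"]) or w["wet"],  # rain -> wet
            lambda w: w["rain"]]
print(entails(premises, lambda w: w["wet"], ["rain", "wet"]))      # → True
print(entails(premises, lambda w: not w["wet"], ["rain", "wet"]))  # → False
```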
Today's
#UCMerced
#ResearchWeek
faculty profile is Dr.
@raryskin
who works to understand how the human language processing system allows us to construct meaning from uncertain input.
These results don't mean LLMs are entirely humanlike models of language, but in my view they call into question Poverty of the Stimulus in even its "weaker" form. For English filler–gap dependencies & island constraints, a childhood's worth of input is enough. 3/3
I'm profoundly sad and outraged. These aggressions by members of the police and the National Guard are an abomination. We must hold them accountable in a court of law.
#cognitivescience
/
#cogsci
/
@cogsci_soc
community: just a reminder to anonymize your
#CogSci2019
paper and member-abstract submissions, as reviewing will be double blind! Updated templates available at (bottom of the page)
Our
#CogSci2018
preprint now available on arXiv! Gauthier, Levy, and Tenenbaum: Word learning and the acquisition of syntactic-semantic overhypotheses -- enjoy!
Great work from
@veroboyce
on automating and evaluating the Maze task for studying language comprehension -- I'm proud to have worked with her on it and I look forward to using Auto-Maze in many studies to come!
My first peer-reviewed paper got accepted at JML! 'Maze Made Easy: Better and easier measures of incremental processing difficulty' is available with branding () or without (). 1/5
Awesome win for open-source and neurosymbolic AI :)
Combining CoT planning in free-form text w/ interleaved program generation, execution, and repair, small Numina 7b model wins AIMO progress prize, solving challenging competition-level math problems.
👋I boost a lot of job opportunities on here and now it's time to boost my own! I'll be arriving at
#Stanford
in fall of 2024 and I'm looking for awesome people to help me figure out language. 🧵👇
Even more impressive, the models respect *island restrictions* on filler–gap dependency. "I know who you said my uncle met __" is OK. *"I know who you said whether my uncle met __" isn't. Configurations providing direct evidence are rare, yet the restriction is learned. 4/5
Elegant study. Graph tells the main story: when test riders in Queensland asked for a free bus ride, drivers granted it far less often to Indian & especially Black customers.
But when drivers were asked in a survey who they'd grant a free ride to, no differences by rider race.
White privilege in everyday market interactions.
'The Colour of a Free Ride' with
@FrijtersPaul
forthcoming in the Economic Journal.
@EJ_RES
Full text:
When MIT, Stanford, Harvard, Yale, Berkeley, Michigan, and Caltech take money from Saudi Arabia, does it offer a productive liberalizing influence, or does it just soften the image of an authoritarian state with an abominable human rights record?
#CogSci2018
: Join Open Mind journal editors Naomi Feldman, Lori Holt, Barbara Landau, and myself this afternoon 1-3pm for refreshments at the
@mitpress
booth! Learn about the journal, our rigorous peer review, and how you can support open access publishing in cognitive science.
"...all too many people find themselves living amid a great period of social change, and yet they fail to develop the new attitudes, the new mental responses, that the new situation demands. They end up sleeping through a revolution." -- Martin Luther King
A filler–gap dependency is abstract–a contingency between a word and the presence/absence of a phrase–and depends on structural hierarchy, not on linear order. Yet the
@xsway_
2018 LSTM, trained on just a human childhood's worth of language, is sensitive to the dependency! 3/5
Calling all researchers who value open & equitable scholarship:
@force11rescomm
seeks feedback on their draft Researcher Bill of Rights & Principles by May 1. They've done great work distilling principles shared across many initiatives. Read & comment at
Looking forward to participating in the new ACL Rolling Review,
@ReviewAcl
– we're in need of peer review innovation, and this is an exciting one! Gratitude to
@gneubig
@astent
@pascalefung
@riedelcastro
for leading this. Read the description at:
This is great work by
@veroboyce
— I am lucky to work with her!!! And, if you want to study incremental processing difficulty during reading, seriously consider trying out our auto-Maze implementation. It compares very favorably thus far with self-paced reading.
The Maze task finds large, localized effects (better than SPR) when run on MTurk with auto-generated distractors. Preprint: Code for auto-generating distractors: Joint work with
@roger_p_levy
and
@rljfutrell
@kevinnadal
@SRCDtweets
+1 to
@kevinnadal
. And, "political lobbying" means attempting to influence legislation; the present issue is executive policy not legislation. Standing up for vulnerable children is advocacy, and in-scope for 501(c)(3)s. So please stand up and advocate!
MIT Computational Psycholinguistics Lab will observe
#ShutDownSTEM
tomorrow. We are not holding our ordinary Wednesday afternoon lab meeting. I'll be devoting the day to self-education and to continuing to develop a plan of action to combat systemic racism.
At first, I was skeptical: I expected that deep-learning language models' apparent performance gains came mostly from learning superficial correlations that happen to repeat across training/test splits, not from learning abstract hierarchical relationships core to language. 2/5
In the future, disseminate your own work in ways that ensure it is freely available to the entire world. Put your paper in an OA repository *before* submitting. Prefer OA journals. Choose an OA license allowing liberal distribution and reuse. 5/6
In which the amazing
@_jennhu
shows that LLMs' linguistic generalizations are better than just prompt-asking them about sentences would suggest. To best reveal what generalizations a language model has acquired, compare the string probabilities they put on minimal pairs!