Ten years ago around this time, I was at the French embassy’s language proficiency exam center in Tehran. When I was handed the exam sheet, I noticed that “Persian” was written next to my “langue maternelle”. I asked them to correct this, as my native language is Kurdish, not Persian.
Over the past two weeks, I have received calls from my university,
@nuigalway
, asking about how I am doing. I'd like to thank NUI Galway for checking on students by calling them individually. It's a tough time and the fact that they care so much just feels great.
#QuarantineLife
I received an email earlier today saying that my visa application to attend
#ACL2023NLP
in Canada is approved.
The conference was held five months ago in July 2023! 😑
#NLProc
#AcademicTwitter
>95% of languages are low-resourced in language technology, with languages in the Middle East especially underrepresented.
Help us change this! 🆘🚨📢
If you speak Mazandarani, Gilaki, Tati, Luri, Laki, Shabaki, Achomi, Southern Kurdish, or Hawrami, we need your help!
#NLProc
Super happy to have a paper accepted at
#ACL2023
(main). This paper is special to me as it was my first project as a postdoc with awesome Antoni
@anas_ant
and I enjoyed working on it very much! It's gonna be another sweet memory from
#eacl2023
😊
Preprint to appear soon...
#NLProc
They laughed off the issue & told me to be “open minded” cause we are all “Iranian” regardless of “local dialects”. Yeah, Iranian at the cost of linguistic discrimination and humiliation!
Education in and recognition of
#MotherLanguage
is a right not a privilege.
Happy
#IMLD2023
!
I am over the moon to have passed my Ph.D. viva today successfully at
@nuigalway
. It absolutely feels amazing! 🥳🥲
Many thanks to everyone, esp. my awesome supervisor
@johnmccrae
, the wonderful examiners,
@mapukolg
&
@bleepbeepbzzz
and,
@EdwardACurry
for chairing the session.
Kurds working on language technology at
#lreccoling2024
🙂
(With
@peshmerge
who was the first master's student that I supervised. I hope that his work on POS tagging and syntactic parsing for Northern Kurdish inspires more students to work on low-resource challenges.)
I am thrilled to be releasing one of my weekend and late night projects: the Kurdish Language Processing Toolkit.
This is an
#NLProc
toolkit in Python for the Kurdish language addressing basic language processing tasks such as text preprocessing, stemming
In line with the interesting points raised at
#ACL2022
regarding diversity in
#NLProc
, I'm happy to release my course on computational
#linguistics
on YouTube:
The course is completely online, free and in
#Kurdish
, a less-resourced language.
Check it out🤗
Just released version 0.1.3 of the
#Kurdish
Language Processing Toolkit (
#KLPT
). In addition to morphological analysis of
#Sorani
and
#Kurmanji
, you can now stem verbs in Sorani as in:
کڕیومن ← کڕ
دەچینەوە ← چ
Install/update at 🙂
#NLProc
#opensource
Super excited to have joined
@anas_ant
’s thriving
#NLProc
group at
@GeorgeMasonU
as a postdoc researcher. Looking forward to new adventures in developing language technology for less-resourced languages.
Here are a few 📸 of my first day on campus (& I absolutely loved it!)
I am deeply shocked and saddened to hear the devastating news that Thierry Declerck has passed away😢His presence and compassion were a true blessing in our projects and many events that brought us together.
His memory will forever remain in our hearts.
📷Schloss Dagstuhl - 2019
Here is a crazy story that happened to me today!
A few days ago, I shared an old photo of my grandmother in the 1960s in traditional
#Kurdish
clothes with a page about Kurdish folklore on Facebook. I initially thought the ladies were sisters. So, the picture was publicly posted on FB
Upon my arrival in Ireland in 2018, I was desperately looking for a place to learn
#Irish
. As a polyglot, learning Irish was not only a passion but also an attempt to be an additional speaker of this endangered language that is being spoken less and less these days. Alas. As a
In beautiful Malta to attend
#EACL2024
and to present our paper "CoDET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation". This work was carried out during my time at George Mason
#NLProc
with my great colleagues
@mahfuzibnalam
and
@anas_ant
:
Thank you,
@paulmurphy_TD
, for raising some of our issues. In fact,
#Ireland
is the only country among the Western and Northern European countries where PhD students are paid less than the minimum wage while paying the highest fees!
Video from
#phdlife
I was around 9 y/o when I first used a
#dictionary
(and "with" was one of the first words that I looked up!). I was in love with words and liked jumping from one entry to another. Little did I know that one day I'd do a thesis on language technology for the same resources!
Sadly, I won't make it to
#ACL2023NLP
in 🇨🇦 due to visa issues, even though I applied early enough and had planned everything😢
Nonetheless, I am happy to present my paper with
@anas_ant
on "Script Normalization for Unconventional Writing"
📝
#NLProc
Our
#workshop
proposal on "Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia" has been accepted at
@lrec2022
🎉 This will be organized by Atul Kr. Ojha, me,
@johnmccrae
at
@uld_nuig
& Chao-Hong Liu.
Stay tuned for CFP 😊
#NLProc
#EACL2024
in Malta was awesome! I met so many friends and colleagues and made new ones. It’s always with much excitement that I find myself in the
#nlproc
community!
Thanks a lot to everyone who made this happen and looking forward to
#lreccoling
in May!
Great news!🎉PhD students' salary (not stipend!) in France will increase to 1975€ by 2022 according to a new order: This paves the way for fair pay of PhD researchers.
Read about the situation of PhD students in Ireland vs Europe:
We were supposed to meet in Virginia but we ended up in Piraeus!
Such an honor to finally get to meet
@anas_ant
and his adorable baby in person 🤗
I can’t wait to join him as a postdoc at
@GeorgeMasonU
. The transition has been going on since last November due to visa processing!
Probably the best part of my
#PhD
has been working with a wonderful supervisor and learning so much from him, particularly when it comes to expertise, time management skills, wise pragmatism, technicality, approachability and a positive and professional attitude.
#grateful
#phdchat
In this blog post, 10 basic but essential SPARQL queries are provided to retrieve
#lexicographical
information from
#Wikidata
. You will learn how to create a simple spelling error detector using this amazing knowledge base⬇️
#linkeddata
#NLProc
Amazing progress on DOLMA-NLP🥳
Some languages have exceeded my expectations, with even very low-resourced varieties like Kolyai & Garusi now represented.
One more week to go! I'm incredibly proud of all the amazing volunteers who’ve made this happen. Let’s finish strong!
#NLProc
Tried a few prompts on
#ChatGPT
to generate
#SPARQL
queries. Even though it cannot handle complex queries, it's impressive how it identifies properties!
Anyone in my circle of friends with an interest in SPARQL and
#linkeddata
and up for collaboration? I have a research idea🙂
Our 10-week journey to create parallel corpora for Middle Eastern languages concludes this week with over 50,000 sentences translated by over 40 contributors in 8 languages/varieties!🎉🥲I am proud beyond words.
The best is yet to come!
Stay tuned:
#NLProc
📢 DOLMA-NLP Update: Exciting progress🌟
New this week:
- Significant advancements in Hawrami, Laki, Southern Kurdish and Talysh
- Zazaki and Mazandarani translations on the rise
🗣️Do you speak these languages? Please, join our community!
🚀 Excited to announce that my project, Developing Technologies for Middle Eastern Languages (DOLMA-NLP), has been accepted into the Stanford Initiative on Language Inclusion and Conservation in Old and New Media (
@siliconstanford
)🎉
More on DOLMA-NLP:
very little to promote the Irish culture, mythology and language. Except some Irish slang words and maybe some signs in public like "go mall", "bruscar" or "fáilte", those who live in Ireland like me don't get a chance to learn about the history and civilization of the country!
Dear friends,
If you write scientific content in
#Kurdish
, I am happy to announce that you can now use XeLaTeX. Please join our community to create more scientific content for your language.
#TwitterKurds
📣 Attention European NLP community and language enthusiasts!
We still need language ambassadors to champion contributions to many languages throughout Europe and ensure that your languages aren't left behind.
Learn more about Aya here:
🚀 DOLMA-NLP continues to grow! 🌍
New advancements in Hawrami, Laki, Talysh & Zazaki translations thanks to our amazing community.
Do you speak any of these? Join us in developing machine translation for Middle Eastern languages!
More info:
#NLProc
🚀 Great progress of DOLMA-NLP😊
This week for the first time, we passed 5000 translations with Southern Kurdish. Gilaki, Hawrami, Talysh, Laki and Luri Bakhtiari have had impressive progress too.
2 more weeks left. Please help us if you can!
#NLProc
📝
#NLProc
for Languages in the Middle East: Project Update
Thanks to those who have reached out so far 🙂Chart shows current progress. Far from done! 😢
Speak any of these languages? We need your help with translation! Please reach out 🙏
#گیلٚکی #کەڵھوڕی #لری
#Zazakî #مازرونی
Just checking the accepted papers at
@LrecColing
, I was wondering which languages were in the spotlight. Among the 1544 accepted papers, 323 (21%) mention at least one language name in their title. Notably, 17 languages (incl. sign ones) appear >=5 times in the titles:
#NLProc
On my way to Germany to give a talk at the University of Cologne about the importance of semantic technologies in elex/
#NLProc
.
I'll be participating in a couple of other events over the upcoming weeks in Europe too:
#EACL
&
#ContribuLing
by
@Wikimedia_Fr
.
Will be glad to meet!🙂
Delighted to attend Peshmerge Morad's master's thesis defense at
@UTwente
, where he addresses part-of-speech tagging for Northern Kurdish (Kurmanji), a project under the supervision of myself and Lorenzo Gatti.
Thrilling times in
#NLProc
with increasing linguistic diversity!
Thanks all for attending
#EURALI
today! Let’s hope that such research communities along with language enthusiasts and linguists change the landscape of
#nlproc
for under-resourced languages in the near future! 🙂
#lreccoling2024
About to kick off the 2nd Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia (
#EURALI
). Join us at Parigi to learn about many interesting papers!
Here is the plan of the workshop:
#LRECCOLING2024
Excited to be giving a talk tomorrow at the Department of Information Technology in Hewlêr.
If you happen to be around and are interested in language technology, please join us 😊 The event will take place both virtually and in person, and it will be in Sorani Kurdish.
As we are approaching
#EACL2023
, I'd like to take a moment to report on two papers of the
@GeorgeMasonNLP
group that we will present at workshops next week. Both address
#NLProc
for less-resourced languages and challenges related to unconventional writing.
See you
@eaclmeeting
😊
Happy to be virtually discussing "the current state of Kurdish language processing" at the 5th International Conference on
#Kurdish
Linguistics tomorrow. I'll be inviting Kurdish linguists to consider computational formalisms to facilitate processing Kurdish thanks to technology.
The Department of Information Technology
@KRGDIT
hosted a gathering on computational linguistics and natural language processing. Dr. Sina Ahmadi
@sina_ahm
also gave a talk on the importance of language technology for the Kurdish language.
#PhDlife
in the
#Netherlands
is just a dream in
#Ireland
!
"Your salary will be €2,395 gross/month in the first year and will increase to €3,061 in the final year, based on full-time employment. We also offer 8% holiday allowance and a year-end bonus of 8.3%."
@pgwanuig
@NUIGSU
Happy
#InternationalMotherLanguageDay
!
As a Kurd, I had classmates at school who were beaten for mispronouncing words, notably saying "w" instead of "v". The insecurity that learning a new language imposed on us was immense.
Don't take education in your mother language for granted!
Excited to join
@labo_Loria
@Inria_Nancy
, after my 3-month visit to ATILF where I worked on the alignment of the Trésor de la Langue Française (TLFi) & Wiktionnaire. A
#linkeddata
version of TLFi in Ontolex-Lemon has also been created.
More to come 🙂
#NLProc
Do you also write in
#Kurdish
on your computer?
Good news! 😊 You can now use the
#Kurmanji
spell checker on
#LibreOffice
.
Read more here on how you can install it, help the project and support me:
@KurdishForTech
@ikram_baban
Me in 2016: I’ll finish my master’s and start making some 💶
Me in 2018: I’ll finish my
#PhD
and start making some 💶
Me in 2022: I’ll finish my
#postdoc
and start making some 💵
What has never changed is that I am as broke as when I started this journey! 🥴
#AcademicChatter
DOLMA-NLP has been a huge experience for me, from creating communities to dealing with publishers and automating data collection as much as possible. One of the fun parts of the project, however, has been working on a crazy number of language varieties, like Garusi.
What is 🤯 is that we, the descendants of those ladies, got to know each other after some 60 years in the most random way thanks to a picture. We are both in the US now yet didn't know anything about each other.
The story of a photo and a scattered family...
Happy to announce that my paper entitled "A Corpus of the Sorani Kurdish Folkloric Lyrics" is accepted at the
@SLTU_CCURL_2020
workshop at
@lrec2020
. I believe that this is valuable work, as
#Kurdish
has a longer history in oral form than in written form.
#TwitterKurds
#NLProc
📢
#NLProc
for Middle Eastern Languages: Project Update
🚀Exciting progress on our project! Grateful for all contributors and welcoming more awesome new members
🌟We need YOUR expertise! Speak any of these languages? Reach out now!
#کەڵھوڕی #لری
#Zazakî #گیلٚکی #هەورامی #مازرونی
Kudos to all international
#PhD
researchers, who have spent the toughest
#COVID19
lockdowns in Ireland since last year away from their loved ones.
Not to forget that unpaid teaching, underpaid stipends &
#inequality
should never be normal.
#phdlife
@SimonHarrisTD
📷
#Galway
today
“Where do you come from” is such a difficult question when you don’t have a physical home anywhere, speak languages of many countries and are familiar with many cultures at the same time. People often laugh when I say “world” in response, even though I really mean it!
What will happen to the budget allocated to travel expenses for a PhD student when traveling is no longer possible? Shouldn't it be reallocated according to the current
#Covid
conditions where students are stuck in a room & pay extra money for internet and electricity?
#phdlife
Friends, I am looking for books translated into Zazaki (novels, not poetry). Where can I find such books?
#zazaki
#Kirmanckî
Dear Zazaki speakers!
Do you want to create machine translation technology for your language? Help us make that dream come true for you!🙂
We need your help to translate sentences from
#Zazaki
. If you can, please join us.
🚀 DOLMA-NLP is growing! 🌍
We’re making big strides in Hawrami, Laki & Gilaki translations thanks to our amazing community.
To accelerate, we’re looking for translated novels in our target languages. Know any?
More info:
#NLProc
Al-Jazeera - my interview about the computational processing of Arabic dialects and the Curras platform, which contains 1.3 million words
Al-Jazeera - my interview about the Arabic Dialects Corpus (1.3 million tokens, morphologically annotated) for Palestinian, Lebanese, Yemeni, Iraqi, Libyan, Sudanese