Goran Glavaš

@gg42554

Followers: 1K · Following: 2K · Statuses: 405

Professor for #NLProc @Uni_WUE.

Würzburg, Germany
Joined September 2011
@gg42554
Goran Glavaš
24 days
Great new work on multilingual news recommendation (NR) by @iana_andreea! New datasets for multilingual and cross-lingual NR, as well as a SotA NR model, newly domain-adapted from a multilingual sentence encoder!
@iana_andreea
Andreea Iana
24 days
⚠️Struggling with multilingual news recommendation? We introduce NaSE, a news-adapted sentence encoder!🙌 ✅No costly fine-tuning needed ✅Perfect for cold-start & few-shot scenarios #ecir2025 📰: Try it out @huggingface🤗: 👇
Replies: 0 · Reposts: 0 · Likes: 3
@gg42554
Goran Glavaš
1 month
If you're looking for a good recipe for training a multilingual LVLM, or just a very strong multilingual LVLM to use, supporting 100 languages (built following the identified "optimal" recipe), check out our latest work! @GregorGeigle and Florian Schneider as lead authors!
@GregorGeigle
Gregor Geigle
1 month
Want to train a *multilingual* LVLM but not sure how? Or looking for a strong model to use? Presenting "Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model"! Arxiv: HF Collection:
Replies: 0 · Reposts: 0 · Likes: 2
@gg42554
Goran Glavaš
2 months
Great work by @fschmidt! Afaik, it's the first massively multilingual benchmark for spoken language understanding (and not just topical classification of speech utterances :). Ready "out-of-the-box" on HF datasets. Paper coming soon (but all important details already described).
@fdschmidt
Fabian David Schmidt
2 months
📣Happy to (pre-)release my Fleurs-SLU benchmark to evaluate massively multilingual spoken language understanding on SIB & Belebele. Work done at @Mila_Quebec with @davlanade @gg42554 @licwu Datasets: Details to follow👇
Replies: 1 · Reposts: 0 · Likes: 4
@gg42554
Goran Glavaš
3 months
Tired of work that probes LLMs or uses them as agents? @iana_andreea will present something cool and different: come check her great work on flexible news recommendation.
@iana_andreea
Andreea Iana
3 months
Excited to present MANNeR at @emnlpmeeting on Wednesday 4 pm! Drop by our poster to chat about recommender systems, personalization & beyond-accuracy objectives in news recommendation! #EMNLP2024
Replies: 0 · Reposts: 1 · Likes: 4
@gg42554
Goran Glavaš
3 months
If you're into Vision-LLMS, come check @GregorGeigle's amazing work! See you in Miami ;)
@GregorGeigle
Gregor Geigle
3 months
The monkey's paw worked well, so I will present 2(!) posters at @emnlpmeeting Wednesday at 4pm. I will be easy to spot - just look for the guy with crutches🩼
Replies: 0 · Reposts: 0 · Likes: 1
@gg42554
Goran Glavaš
3 months
Yes, come to @fdschmidt's poster on Tuesday! (even I will be there and I haven't been to a conference in 2.5 years :))
@fdschmidt
Fabian David Schmidt
3 months
Excited to present NLLB-LLM2Vec at @emnlpmeeting Tuesday 2pm! Drop by our poster to chat about multilingual & multimodal research. NLLB-LLM2Vec can now easily be used with @huggingface AutoModels — try it esp. for embedding low-resource languages! 🌐
Replies: 0 · Reposts: 1 · Likes: 8
@gg42554
Goran Glavaš
4 months
RT @iana_andreea: 🔎 What's beneath the surface of encoder architectures in news #recsys? 🤔 Our latest work w/ @gg42554 @heikopaulheim goes…
Replies: 0 · Reposts: 2 · Likes: 0
@gg42554
Goran Glavaš
5 months
If you're looking for on-the-fly customization of your news recommendation function, then MANNeR is the framework for you! Great work by @iana_andreea!
@iana_andreea
Andreea Iana
5 months
🚀 Introducing MANNeR, our modular news recommendation 🤖📰 framework that uses ⚖️ metric-based learning to support on-the-fly customization over multiple aspects at inference time. #emnlp2024 findings: w/ @gg42554 @heikopaulheim @dwsunima (1/⏳️)
Replies: 0 · Reposts: 0 · Likes: 3
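MANNeR's metric-based, modular scoring idea can be illustrated with a small sketch (my own simplification, not the paper's code; all embeddings, aspect names, and weights below are made up): candidates are scored against the user's history separately per aspect, and the per-aspect weights can be changed at inference time, e.g. to trade personalization against topical diversity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def score(history, candidate, weights):
    """Combine per-aspect similarities with inference-time weights.

    history / candidate: dict aspect -> embedding (list of floats)
    weights: dict aspect -> float, tweakable without retraining
    """
    return sum(w * cosine(history[a], candidate[a]) for a, w in weights.items())

# Toy embeddings: a 'content' aspect and one extra 'topic' aspect.
history = {"content": [0.9, 0.1, 0.0], "topic": [1.0, 0.0]}
cand_same_topic = {"content": [0.8, 0.2, 0.1], "topic": [0.9, 0.1]}
cand_diff_topic = {"content": [0.85, 0.15, 0.05], "topic": [0.0, 1.0]}

# Personalization: reward topical similarity to the user's history.
w_personal = {"content": 1.0, "topic": 1.0}
# Diversification: flip the topic weight at inference time, no retraining.
w_diverse = {"content": 1.0, "topic": -1.0}

s_p = [score(history, c, w_personal) for c in (cand_same_topic, cand_diff_topic)]
s_d = [score(history, c, w_diverse) for c in (cand_same_topic, cand_diff_topic)]
print(s_p[0] > s_p[1])  # personalized weights prefer the same-topic article
print(s_d[1] > s_d[0])  # negative topic weight promotes the different topic
```

The point of the sketch is only the mechanism: because aspects are scored in separate similarity spaces and combined linearly, re-ranking behavior can be changed by editing the weight dictionary at inference time.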
@gg42554
Goran Glavaš
6 months
Intermediate code representations like LLVM IR can indeed be a great facilitator of cross-programming-language transfer for Code-LLMs! Well-deserved Outstanding Paper Award for @androneil54 for this great work! It was a pleasure to be part of the effort!
@UKPLab
UKP Lab
6 months
Many interesting questions at @androneil54's poster on #IRCoder, an #ACL2024NLP collaboration with @gg42554 (@Uni_WUE) and @IGurevych (@UKPLab) that was just selected as one of this year's @aclmeeting's Outstanding Papers! 🎉🎉🎉
Replies: 1 · Reposts: 2 · Likes: 20
@gg42554
Goran Glavaš
6 months
RT @UKPLab: Code LMs are improving fast 📈, but they are limited in low-resource programming languages (PLs). 😬 In this #ACL2024NLP paper,…
Replies: 0 · Reposts: 5 · Likes: 0
@gg42554
Goran Glavaš
6 months
I really enjoyed working with @vjhofmann on this! The highlight of this work for me is Figure 6: rendering toponym names from the embeddings the LM produces after geoadaptation, we basically obtained the map (of the BCMS area)!
@vjhofmann
Valentin Hofmann
6 months
When we hear someone speak a dialect, we can often tell where they're from. Can LMs do the same? Our #TACL paper addresses this question and shows how to boost LMs' geolinguistic skills. 🌍 This paper has been in the making for almost three years, so glad it's finally out! 🧵
Replies: 1 · Reposts: 0 · Likes: 6
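The "map from embeddings" effect can be mimicked with a toy sketch (entirely made-up embeddings and approximate coordinates; the actual geoadaptation in the paper fine-tunes the LM itself): if toponym embeddings linearly encode geography, then nearest neighbours in embedding space reproduce nearest neighbours on the map.

```python
import math
import random

random.seed(0)

# Rough coordinates (lat, lon) for a few BCMS-area toponyms, for illustration.
places = {"Zagreb": (45.8, 16.0), "Split": (43.5, 16.4),
          "Belgrade": (44.8, 20.5), "Novi Sad": (45.3, 19.8)}

def embed(lat, lon):
    """Toy stand-in for a post-geoadaptation toponym embedding:
    two dimensions linearly encode geography, one is geography-free noise."""
    return [0.1 * lat + random.gauss(0, 0.01),
            0.1 * lon + random.gauss(0, 0.01),
            random.gauss(0, 0.01)]

emb = {name: embed(lat, lon) for name, (lat, lon) in places.items()}

def nearest(name, table):
    """Nearest neighbour of `name` under Euclidean distance in `table`."""
    return min((other for other in table if other != name),
               key=lambda other: math.dist(table[name], table[other]))

# If geography is decodable from the embeddings, embedding-space neighbours
# should agree with map neighbours for every toponym.
matches = {name: (nearest(name, emb), nearest(name, places)) for name in places}
print(matches)
```

Plotting the first two embedding dimensions of such toponym vectors is what "renders the map": relative positions on the scatter plot mirror relative positions in geographic space.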
@gg42554
Goran Glavaš
7 months
@karmake2 @Uni_WUE Thanks for visiting, Santu, and for the talk and the discussions ;).
Replies: 0 · Reposts: 0 · Likes: 1
@gg42554
Goran Glavaš
8 months
You can now get our multilingual multi-parallel news recommendation dataset from HuggingFace!
@iana_andreea
Andreea Iana
8 months
🎉 Exciting news! xMIND is now also on @huggingface 🤗 Check it out if you need multi-parallel data for cross-lingual news recommendation or domain-specific text retrieval ⬇️ xMINDlarge: xMINDsmall: w/ @gg42554 @heikopaulheim
Replies: 0 · Reposts: 0 · Likes: 5
@gg42554
Goran Glavaš
8 months
Can your Large Vision-Language Model tell a Keeshond from a Samoyed? We show that fine-grained object classification is a skill quite complementary to the image understanding tested by existing benchmarks, and that LVLMs don't excel at the task, to say the least.
@GregorGeigle
Gregor Geigle
8 months
Could you use your Vision-LLM to help identify dogs, plants, dishes, or other things? We investigated and let's just say, do not rely on them when foraging mushrooms in the wild... Paper: Code: 🧵
Replies: 0 · Reposts: 1 · Likes: 1
@gg42554
Goran Glavaš
8 months
Great effort by @GregorGeigle: we test whether explicit grounding objectives reduce hallucination in Large Vision-Language Models. We confirm that they yield better fine-grained image understanding, but this does not translate into less hallucination in open captioning!
@GregorGeigle
Gregor Geigle
8 months
"Grounding tasks improve fine-grained image understanding which helps reduce visual hallucinations in Vision-LLMs" Intuitive claim and often repeated but is it *true*? We tested it in our recent paper: 🧵 (spoiler: no)
Replies: 0 · Reposts: 0 · Likes: 3
@gg42554
Goran Glavaš
8 months
Great work by @iana_andreea, who put immense effort into collecting and cleaning such a massively multi-parallel news dataset. I reckon that such a domain-specific multi-parallel corpus is of quite some interest to the MT folks :)!
@iana_andreea
Andreea Iana
8 months
‼️ Desperately 👀 for multilingual parallel data for #machinetranslation or text retrieval? Look no further! 🙌 Check out PolyNewsParallel on @huggingface! 📰 w/ 833 language pairs over 64 languages & 17 scripts 🌍 🤗 #NLProc @dwsunima ⬇️⬇️
Replies: 0 · Reposts: 0 · Likes: 5
@gg42554
Goran Glavaš
8 months
Check out our massively multilingual and (partially) multi-parallel news dataset PolyNews! Great work by @iana_andreea on compiling this massively multilingual domain-specific data as well as on using it to improve multilingual sentence encoders for news recommendation!
@iana_andreea
Andreea Iana
8 months
🤔 If you're interested in more #multilingual news data for other #NLProc tasks, check out PolyNews 📰 on @huggingface ! w/ 77 low & high-resource languages in 19 scripts 🌍 🤗 📃 w/ @fdschmidt @gg42554 @heikopaulheim @dwsunima
Replies: 0 · Reposts: 0 · Likes: 12
@gg42554
Goran Glavaš
8 months
RT @iana_andreea: 🤔 If you're interested in more #multilingual news data for other #NLProc tasks, check out PolyNews 📰 on @huggingface !…
Replies: 0 · Reposts: 3 · Likes: 0
@gg42554
Goran Glavaš
8 months
MT encoder+LLM in a single end-to-end multilingual model! More effective for cross-lingual transfer than discrete pipelining of the two as in "translate-test"! How do you make Llama "understand" NLLB's representations? Via cheap self-distillation on English data only :)!
@fdschmidt
Fabian David Schmidt
8 months
Introducing NLLB-LLM2Vec! 🚀 We fuse the NLLB encoder & Llama 3 8B trained w/ LLM2Vec to create NLLB-LLM2Vec which supports cross-lingual NLU in 200+ languages🔥 Joint work w/ Philipp Borchert, @licwu, and @gg42554 during my great research stay at @cambridgeltl
Replies: 0 · Reposts: 0 · Likes: 9
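The self-distillation idea described above can be sketched in a toy form (my own simplification with made-up dimensionalities and random vectors, not the released code): freeze "NLLB" and "teacher" representations of the same English sentences, then fit a projection so that projected NLLB outputs match the teacher; the learned projection then carries over to NLLB's other input languages.

```python
import random

random.seed(0)

D_NLLB, D_LLM = 4, 3  # toy dimensionalities (made up)

def apply(M, x):
    """Matrix-vector product for list-of-lists M and vector x."""
    return [sum(M[k][j] * x[j] for j in range(len(x))) for k in range(len(M))]

# Hypothetical frozen representations of the SAME English sentences:
# sentences[i] stands in for the NLLB-encoder output, teachers[i] for the
# Llama-based (LLM2Vec) teacher embedding. A hidden linear relation makes
# the distillation target learnable in this toy setup.
true_map = [[random.gauss(0, 1) for _ in range(D_NLLB)] for _ in range(D_LLM)]
sentences = [[random.gauss(0, 1) for _ in range(D_NLLB)] for _ in range(32)]
teachers = [apply(true_map, s) for s in sentences]

def mse(W):
    """Mean squared distillation error of projection W over all sentences."""
    total = 0.0
    for s, t in zip(sentences, teachers):
        p = apply(W, s)
        total += sum((pk - tk) ** 2 for pk, tk in zip(p, t))
    return total / len(sentences)

# Self-distillation on English data only: learn W so that W @ NLLB(s) ≈ teacher(s).
W = [[0.0] * D_NLLB for _ in range(D_LLM)]
before = mse(W)
lr = 0.02
for epoch in range(200):
    for s, t in zip(sentences, teachers):
        p = apply(W, s)
        for k in range(D_LLM):
            err = p[k] - t[k]
            for j in range(D_NLLB):
                W[k][j] -= lr * err * s[j]  # SGD step on the squared error
after = mse(W)
print(before, after)
```

The cheapness claimed in the tweet comes from exactly this structure: both encoders stay frozen, only the small alignment map is trained, and only English pairs are needed.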
@gg42554
Goran Glavaš
8 months
RT @fdschmidt: Introducing NLLB-LLM2Vec! 🚀 We fuse the NLLB encoder & Llama 3 8B trained w/ LLM2Vec to create NLLB-LLM2Vec which supports…
Replies: 0 · Reposts: 18 · Likes: 0