![OcciGlot Profile](https://pbs.twimg.com/profile_images/1763183930839121920/McVk7ndw_x96.jpg)
OcciGlot
@occiglot
Followers
231
Following
42
Statuses
32
Open Source Language Models for Europe
Joined February 2024
📣Community Call: Contribute to LLM pre-training resources in (your) unrepresented language! Please submit any websites in that language to @CommonCrawl's web language project. They will help increase non-English data in future releases.
0
10
22
For anybody still at #EMNLP, we will be presenting Community-OSCAR at the MRL poster session at 11am. See you there.
📣Announcing Community-OSCAR: A collaboration between Occiglot and the OSCAR project for creating multilingual Web-crawled datasets. Blog: HF:
0
5
8
RT @MBrack_AIML: For anybody attending KonKis next week, let me make a quick ad read for Session 4: "Large AI Models by and for Europe."…
0
2
0
RT @MBrack_AIML: We seek collaborators to extend Community OSCAR to the remaining Common Crawl dumps. If you have the compute/storage (or…
0
3
0
RT @HKydlicek: Great work by @occiglot community on releasing new iteration of multilingual Oscar spanning 40 CC snapshots!! It's so impor…
0
3
0
RT @zilliz_universe: Join @stephenbtl and speakers from @AWS and @occiglot at the Unstructured Data Meetup at the On Cloud Office in Berlin…
0
3
0
@DiscoResearchAI @AIatMeta The model was trained on the 42 supercomputer of @Hessian_AI with support from @DFKI, @TUDarmstadt, @HMWK_Hessen & @BMBF_Bund
0
0
4
@DiscoResearchAI @AIatMeta Shoutout to @bjoern_pl and @MBrack_AIML for training the model and curating the data. Likewise, we would like to thank @XYOU and @pjox13 for their amazing work that led to our dataset in the first place.
0
0
2
RT @clefourrier: New leaderboard: "Occiglot Euro LLM Leaderboard"! It evaluates the performance of LLMs on the following languages: 🇬🇧🇮🇹🇫🇷…
0
5
0
RT @DFKI: OcciGlot - New Open Source Language Models for Europe released 🇪🇺 Researchers from DFKI and @Hessian_AI have launched the @occi…
0
6
0
RT @PhoBoAI: 📢We have too few non-English LLM benchmarks! Come join the @huggingface 🤗 community project and let's work together to translate 500…
0
5
0
RT @Dorialexander: Common corpus is an international initiative coordinated by @pleias_fr with the support of the state start-up LANGU:IA,…
0
5
0