![Carlos Lassance Profile](https://pbs.twimg.com/profile_images/1326806439793487875/FC-A-4zD_x96.jpg)
Carlos Lassance
@cadurosar
Followers: 438
Following: 277
Statuses: 249
MTS @ Cohere, constantly trying to make Information Retrieval work better, while making mistakes in the process.
Grenoble
Joined March 2018
RT @nadiinchi: Excited to share that Provence is accepted to #ICLR2025! Provence is a method for training an efficient & high-performing c…
RT @cohere: Today, we're launching early access for North! Our all-in-one secure AI workspace platform combines LLMs, search, and agents i…
RT @Nils_Reimers: Launch of Cohere Rerank 3.5 - Boost your Search. What is new: - Large gains in multilingual retrieval - Reasoning Ca…
RT @Nils_Reimers: Aya-Expanse, the strongest open weights multilingual LLM, was just released by @CohereForAI It beats Llama 70B multilin…
RT @aidangomez: Your search can see now. We're excited to release fully multimodal embeddings for folks to start building with! https://t.…
RT @nadiinchi: Do not miss an application deadline for #ALPS2025 on October 15! ALPS is an Advanced Language Proce…
@antonio_mallia @prithivida @MrParryParry No, they are not the same. I was going over the data that @prithivida shared, but from looking at the OpenSearch post they are not using SparseEmbed
@antonio_mallia @prithivida @MrParryParry What I mean is that they have more than 1 dimension per sparse embedding. For example SparseEmbed 64 has 64 dimensions per embedding. In my view, this is like storing 64 times the information per token you store in the database
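To make that storage argument concrete, here is a toy sketch (my own illustration, not either paper's code; the fp32 payloads and the 64-dim setting are assumptions) of how much each stored token carries in a SPLADE-style index versus a SparseEmbed-64-style one:

```python
# Toy comparison of per-token index payload: SPLADE-style vs SparseEmbed-style.
import numpy as np

DIMS_PER_TOKEN = 64  # SparseEmbed-64: one 64-d contextual vector per active term

# SPLADE-style posting: (doc_id, weight) -> a single float per active token.
splade_posting = {"doc_id": 42, "weight": 1.37}

# SparseEmbed-style posting: (doc_id, contextual vector) -> DIMS_PER_TOKEN floats.
sparse_embed_posting = {
    "doc_id": 42,
    "vector": np.random.randn(DIMS_PER_TOKEN).astype(np.float32),
}

bytes_splade = 4                         # one fp32 weight per stored token
bytes_sparse_embed = 4 * DIMS_PER_TOKEN  # one fp32 vector per stored token
print(f"payload per token: {bytes_splade} B (SPLADE) vs {bytes_sparse_embed} B (SparseEmbed-64), "
      f"a {bytes_sparse_embed // bytes_splade}x difference")
```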
@prithivida @antonio_mallia @MrParryParry Just my two cents: 1. They use less expansion, but way more information; their actual smallest FLOPS in the table is 11.86 (0.74 FLOPS with 16 dims). 2. It is easy to reduce FLOPS in-domain; it is hard to make it work OOD (SparseEmbed 64 = SPLADE++ on BEIR)
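For context, the FLOPS number being compared here is the expected count of floating-point operations per query-document pair under the learned sparsity pattern. A minimal sketch of that metric, with made-up random activations standing in for real model outputs:

```python
# FLOPS efficiency metric used in the sparse-retrieval (SPLADE) literature:
# expected number of overlapping non-zero terms per (query, document) pair.
import numpy as np

def flops_metric(query_reps: np.ndarray, doc_reps: np.ndarray) -> float:
    """query_reps: (n_queries, vocab), doc_reps: (n_docs, vocab), sparse term weights."""
    p_q = (query_reps != 0).mean(axis=0)  # activation probability of each term in queries
    p_d = (doc_reps != 0).mean(axis=0)    # activation probability of each term in documents
    return float(np.dot(p_q, p_d))        # expected float ops per (q, d) scoring

# toy example: 10k-term vocabulary, random sparsity patterns (placeholder data)
rng = np.random.default_rng(0)
vocab = 10_000
queries = rng.random((200, vocab)) * (rng.random((200, vocab)) < 0.002)   # ~20 active terms/query
docs = rng.random((1_000, vocab)) * (rng.random((1_000, vocab)) < 0.01)   # ~100 active terms/doc
print(f"FLOPS ~= {flops_metric(queries, docs):.2f}")
```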
@antonio_mallia To me it is simply a question of data, data, data. They are using pretraining data and might be using more than just MS MARCO to train the model (even if not using data that is in BEIR)
@antonio_mallia @MrParryParry Hey Antonio, that's an old study and on in-domain data. FLOPS becomes more important as you go out-of-domain, and it is correlated with inference speed (but not perfectly; it really depends on the internal search algorithm).
RT @nadiinchi: I will present our study on Multilingual Retrieval-augmented generation, tomorrow at #ACL2024NLP workshop on Knowledgeable L…
@rpradeep42 @TREC_RAG @AIatMeta @cohere @GoogleAI @JinaAI_ @mixedbreadai @MSFTResearch @SnowflakeDB @Voyage_AI_ Well there are 100M segments so it takes a bit of time
@rpradeep42 @TREC_RAG @AIatMeta @cohere @GoogleAI @JinaAI_ @mixedbreadai @MSFTResearch @SnowflakeDB @Voyage_AI_ working on that
@srchvrs It also reminds me of HyDE where instead of using the query you use a hypothetical document as your search anchor:
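A minimal sketch of the HyDE idea mentioned above: embed an LLM-generated hypothetical answer instead of the raw query and use it as the search anchor. `generate_hypothetical_doc` and `embed` below are hypothetical placeholders for whatever generator and dense encoder you plug in:

```python
# HyDE-style retrieval sketch: search with an embedded fake answer, not the query.
import numpy as np

def generate_hypothetical_doc(query: str) -> str:
    # placeholder: in practice, prompt an LLM, e.g.
    # "Write a short passage that answers: {query}"
    return f"A passage that plausibly answers the question: {query}"

def embed(text: str) -> np.ndarray:
    # placeholder embedding; swap in a real dense encoder
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(768)
    return v / np.linalg.norm(v)

def hyde_search(query: str, doc_embeddings: np.ndarray, k: int = 5) -> np.ndarray:
    anchor = embed(generate_hypothetical_doc(query))  # the hypothetical document is the anchor
    scores = doc_embeddings @ anchor                  # cosine similarity (embeddings pre-normalized)
    return np.argsort(-scores)[:k]                    # indices of the top-k documents

corpus = [
    "SPLADE is a sparse neural retriever.",
    "HyDE generates a hypothetical answer first.",
    "BM25 is a classic lexical baseline.",
]
doc_matrix = np.stack([embed(d) for d in corpus])
print(hyde_search("what is HyDE?", doc_matrix, k=2))
```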
RT @sylvieshi00: Looking for a team lead to join our search team at @cohere working with @Nils_Reimers and many other kind & smart people…
RT @cohere: Announcing the private beta of our newest foundation embedding model, Cohere Compass: designed specifically for multi-aspect da…
RT @Nils_Reimers: 0⃣ World First Binary Vector Database 1⃣ Happy to announce the world first Binary Vector Database (for educational purpos…
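Assuming "binary vector database" here means the usual trick of keeping only the sign bit of each embedding dimension and ranking by Hamming distance (a guess on my part, not the announced project's code), a toy sketch looks like this:

```python
# Binary-vector retrieval sketch: sign-binarize embeddings, rank by Hamming distance.
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension, packed 8 dims per byte."""
    return np.packbits(embeddings > 0, axis=1)

def hamming_search(packed_docs: np.ndarray, packed_query: np.ndarray, k: int = 5) -> np.ndarray:
    # XOR the packed bits and count the ones to get Hamming distances
    dists = np.unpackbits(packed_docs ^ packed_query, axis=1).sum(axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
docs = rng.standard_normal((10_000, 1024))   # toy float embeddings
query = rng.standard_normal((1, 1024))
top = hamming_search(binarize(docs), binarize(query))
print("top-5 doc ids:", top)
```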