Carlos Lassance @cadurosar profile

Carlos Lassance

@cadurosar

Followers

438

Following

277

Statuses

249

MTS @ Cohere, constantly trying to make Information Retrieval work better, while making mistakes in the process.

Grenoble

Joined March 2018

Carlos Lassance

@cadurosar

13 days

RT @nadiinchi: Excited to share that Provence is accepted to #ICLR2025! Provence is a method for training an efficient & high-performing c…

0

5

0

Carlos Lassance

@cadurosar

1 month

RT @cohere: Today, we’re launching early access for North! Our all-in-one secure AI workspace platform combines LLMs, search, and agents i…

0

99

0

Carlos Lassance

@cadurosar

2 months

RT @Nils_Reimers: 𝐋𝐚𝐮𝐧𝐜𝐡 𝐨𝐟 𝐂𝐨𝐡𝐞𝐫𝐞 𝐑𝐞𝐫𝐚𝐧𝐤 𝟑.𝟓 - 𝐁𝐨𝐨𝐬𝐭 𝐲𝐨𝐮𝐫 𝐒𝐞𝐚𝐫𝐜𝐡 🚀 What is new: - Large gains in multilingual retrieval 🇺🇳 - Reasoning Ca…

0

21

0

Carlos Lassance

@cadurosar

4 months

RT @Nils_Reimers: Aya-Expanse, the strongest open weights multilingual LLM, was just released by @CohereForAI It beats Llama 70B multilin…

0

41

0

Carlos Lassance

@cadurosar

4 months

RT @aidangomez: Your search can see now. We're excited to release fully multimodal embeddings for folks to start building with! https://t.…

0

72

0

Carlos Lassance

@cadurosar

4 months

RT @nadiinchi: Do not miss an application deadline for #ALPS2025 on October 15! ALPS is an Advanced Language Proce…

0

7

0

Carlos Lassance

@cadurosar

5 months

@antonio_mallia @prithivida @MrParryParry No, they are not the same. I was going over the data that @prithivida shared, but from looking at the OpenSearch post they are not using sparse embed

2

0

Carlos Lassance

@cadurosar

5 months

@antonio_mallia @prithivida @MrParryParry What I mean is that they have more than 1 dimension per sparse embedding. For example, SparseEmbed𝐿 64 has 64 dimensions per embedding. In my view, this is like storing 64 times the information for each token you store in the database

1

0

2

Carlos Lassance

@cadurosar

5 months

@prithivida @antonio_mallia @MrParryParry Just my two cents: 1. They use less expansion, but way more information; their actual smallest FLOPS in the table is 11.86 (0.74 FLOPS with 16 dims) 2. It is easy to reduce FLOPS in-domain, it is hard to make it work OOD (SparseEmbed𝐿 64 = SPLADE++ on BEIR)

1

0

1

Carlos Lassance

@cadurosar

5 months

@antonio_mallia To me it is simply a question of data, data, data. They are using pretraining data and might be using more than just MS MARCO to train the model (even if not using data that is in BEIR)

1

0

Carlos Lassance

@cadurosar

5 months

@antonio_mallia @MrParryParry Hey Antonio, that's an old study and it uses in-domain data. FLOPS becomes more important as you go out-of-domain, and it is correlated with inference speed (but not perfectly; it really depends on the internal search algorithm).

0

1

Carlos Lassance

@cadurosar

6 months

@alexlimh23 @cohere @Nils_Reimers Awesome to have you join! Looking forward to working together

0

Carlos Lassance

@cadurosar

6 months

RT @nadiinchi: I will present our study on Multilingual Retrieval-augmented generation, tomorrow at #ACL2024NLP workshop on Knowledgeable L…

0

4

0

Carlos Lassance

@cadurosar

10 months

@rpradeep42 @TREC_RAG @AIatMeta @cohere @GoogleAI @JinaAI_ @mixedbreadai @MSFTResearch @SnowflakeDB @Voyage_AI_ Well there are 100M segments so it takes a bit of time

0

1

Carlos Lassance

@cadurosar

10 months

@rpradeep42 @TREC_RAG @AIatMeta @cohere @GoogleAI @JinaAI_ @mixedbreadai @MSFTResearch @SnowflakeDB @Voyage_AI_ working on that

1

0

4

Carlos Lassance

@cadurosar

10 months

@srchvrs It also reminds me of HyDE, where instead of using the query you use a hypothetical document as your search anchor:

1

0

4

Carlos Lassance

@cadurosar

10 months

RT @sylvieshi00: Looking for a team lead to join our search team at @cohere working with @Nils_Reimers and many other kind & smart people.…

0

13

0

Carlos Lassance

@cadurosar

10 months

RT @cohere: Announcing the private beta of our newest foundation embedding model, Cohere Compass: designed specifically for multi-aspect da…

0

51

0

Carlos Lassance

@cadurosar

11 months

RT @Nils_Reimers: 0⃣ 𝐖𝐨𝐫𝐥𝐝 𝐅𝐢𝐫𝐬𝐭 𝐁𝐢𝐧𝐚𝐫𝐲 𝐕𝐞𝐜𝐭𝐨𝐫 𝐃𝐚𝐭𝐚𝐛𝐚𝐬𝐞 1⃣ Happy to announce the world first 𝐁𝐢𝐧𝐚𝐫𝐲 𝐕𝐞𝐜𝐭𝐨𝐫 𝐃𝐚𝐭𝐚𝐛𝐚𝐬𝐞 (for educational purpos…

0

62

0

Carlos Lassance

@cadurosar

11 months

@andersonbcdefg check DMs

0

1