![Rajan Vivek Profile](https://pbs.twimg.com/profile_images/1702748500633763840/_ul8fxzi_x96.jpg)
Rajan Vivek (@rajan__vivek)
Member of Technical Staff @ContextualAI. MS CS + AI researcher @stanford. Prev @scale_AI @georgiatech
Followers: 133 · Following: 143 · Statuses: 22 · Joined July 2023
@LangChainAI This is awesome work! For more accurate unit testing, you can swap out gpt-4o-mini for LMUnit, a SOTA unit test scoring model with a free API. (It outperforms GPT-4o and Sonnet 3.5.) Check it out here
@HamelHusain Check out LMUnit, a state-of-the-art model for LLM unit testing with a free API! It’s more accurate than GPT-4o and Sonnet 3.5
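A minimal sketch of the swap being suggested: score a model response against a natural language unit test over HTTP instead of prompting gpt-4o-mini as a judge. The endpoint URL, payload fields, response shape, and environment variable name below are illustrative assumptions, not the documented Contextual AI API; check the official docs for the real contract.

```python
# Hedged sketch only: endpoint, payload fields, and response shape are
# assumptions for illustration, not the documented LMUnit API.
import os

import requests

LMUNIT_URL = "https://api.contextual.ai/v1/lmunit"    # hypothetical endpoint
API_KEY = os.environ.get("CONTEXTUAL_API_KEY", "")    # hypothetical env var name


def score_unit_test(query: str, response: str, unit_test: str) -> float:
    """Ask an LMUnit-style scorer how well `response` satisfies `unit_test`."""
    r = requests.post(
        LMUNIT_URL,
        json={"query": query, "response": response, "unit_test": unit_test},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["score"]  # assumed to be a continuous 1-5 score


if __name__ == "__main__":
    print(score_unit_test(
        query="Summarize the Q3 earnings call.",
        response="Revenue grew 12% year over year, driven by enterprise deals.",
        unit_test="Is every claim in the response supported by the source document?",
    ))
```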
SOTA performance at every step of the RAG pipeline (and end to end) is crazy. Major props to the @ContextualAI team. Also super excited about RAG agents that do tasks on your behalf! (See the spreadsheet example towards the end.)
Today, we’re excited to announce the general availability of the Contextual AI Platform, the first enterprise platform designed for building specialized RAG agents to support expert knowledge work.

What is a specialized RAG agent? A general-purpose AI agent is designed to automate simple daily tasks like scheduling a meeting or responding to an email. A specialized RAG agent, on the other hand, is designed to augment subject-matter experts performing complex, domain-specific work.

The Contextual AI Platform allows you to create these agents easily and achieve SOTA accuracy right out of the box. Check out what the Contextual AI Platform can do.
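The spreadsheet example hints at what a RAG agent's loop might look like in miniature: ground the task in retrieved domain documents, then produce an artifact rather than just an answer. This is a rough conceptual sketch only; `retrieve` and `generate` are caller-supplied placeholders, not the Contextual AI Platform API.

```python
import csv


def run_rag_agent(task: str, retrieve, generate, out_path: str = "report.csv") -> str:
    """Ground the task in retrieved passages, draft rows, and write a spreadsheet.

    `retrieve` and `generate` are placeholder callables supplied by the caller.
    """
    passages = retrieve(task, top_k=5)      # grounding step over domain documents
    rows = generate(task, passages)         # e.g. [["metric", "value"], ...]
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return out_path
```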
Truthfulness in AI isn't one-size-fits-all. For some applications, truth is certain and computationally verifiable (math/logic); for others, it can be anything supported by a trusted source (search/RAG). Instead of a universal benchmark, we need a new evaluation paradigm: natural language unit tests, where each aspect of truthfulness is a testable assertion that humans can define and refine over time for their specific application. We tackle this in our new paper and just released a free API for LMUnit, our SOTA unit test scoring model that outperforms GPT-4/Claude. Would love to hear your thoughts on this paradigm @chamath
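As an illustration of the paradigm, per-application truthfulness criteria can be framed as plain-language assertions and every response scored against them. The test wording below is invented for illustration, and `score_unit_test` stands in for an LMUnit-style scorer (see the earlier sketch).

```python
# Each application defines its own testable, human-refinable assertions.
UNIT_TESTS = {
    # Verifiable domains: truth is certain and checkable.
    "math_tutor": [
        "Is every arithmetic or logical step in the solution correct?",
        "Does the final answer follow from the stated steps?",
    ],
    # Source-grounded domains: truth means support from a trusted source.
    "rag_search": [
        "Is every factual claim supported by the retrieved documents?",
        "Does the response acknowledge when the sources are silent?",
    ],
}


def evaluate(app: str, query: str, response: str, score_unit_test) -> dict:
    """Score one response against every unit test defined for an application."""
    return {test: score_unit_test(query, response, test) for test in UNIT_TESTS[app]}
```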
We’ve known for a while that LLM evaluation is broken in many ways (biased, noisy, not that correlated w/ vibe checks). Our latest release at @ContextualAI is a huge step towards fixing this: natural language unit tests! We also trained a SOTA model to go with it :) Check it out!
Introducing LMUnit: Natural language unit testing for LLM evaluation

How do you really know if your language model is behaving the way you expect? When evaluation is this critical, your best methodology shouldn't just be vibes.

With SOTA results on FLASK & BigGenBench and top-10 on RewardBench, LMUnit brings the rigor and familiarity of traditional software engineering unit testing to LLM evaluation.

Read on to learn how we built it and try it for free using our API 👇 🔗 🧵 (1/5)
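To show the software-engineering analogy the thread draws, natural language unit tests can sit inside an ordinary pytest suite. Both helpers below are illustrative stubs: `my_model` stands in for the system under test, `score_unit_test` for an LMUnit-style scorer returning a 1-5 score, and the 3.5 threshold is an arbitrary example cutoff.

```python
import pytest

UNIT_TESTS = [
    "Is every claim in the response supported by the provided context?",
    "Does the response directly address the user's question?",
    "Is the response free of speculation presented as fact?",
]


def my_model(query: str) -> str:
    """Placeholder for the LLM system under test."""
    return "Opened items can be refunded within 30 days with a receipt."


def score_unit_test(query: str, response: str, unit_test: str) -> float:
    """Placeholder for an LMUnit-style scorer; wire this to the real API."""
    return 5.0


@pytest.mark.parametrize("unit_test", UNIT_TESTS)
def test_refund_policy_answer(unit_test):
    query = "What does our refund policy say about opened items?"
    response = my_model(query)
    score = score_unit_test(query, response, unit_test)
    assert score >= 3.5, f"Unit test failed ({score:.2f}): {unit_test}"
```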
When we align AI with algorithms like DPO, we tell the model “solution A is better than solution B.” But this doesn’t specify exactly what about A is better than B. Can we do better with more precise preferences? Yes! Check out this great work by @KarelDoostrlnck @ContextualAI
Aligning language models with preferences leads to stronger and safer models (GPT-3 → ChatGPT). However, preference data (RLHF) contains irrelevant signals, and alignment objectives (e.g., DPO) can actually hurt model performance. We tackle both, leading to a ~2x performance boost.
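To make the point concrete, here is a minimal sketch of the standard DPO objective (Rafailov et al., 2023). The loss only sees a preference direction, chosen over rejected; nothing in it encodes which aspect of the chosen response is actually better, which is the gap more precise preferences aim to close.

```python
import torch
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Pairwise DPO loss: push the policy's chosen-vs-rejected log-prob margin
    above the frozen reference model's margin. Log-probs are summed over the
    completion tokens of each response."""
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()


if __name__ == "__main__":
    # Toy batch of two preference pairs (illustrative numbers only).
    pc = torch.tensor([-12.0, -9.5])    # policy log p(chosen)
    pr = torch.tensor([-14.0, -11.0])   # policy log p(rejected)
    rc = torch.tensor([-13.0, -10.0])   # reference log p(chosen)
    rr = torch.tensor([-13.5, -10.5])   # reference log p(rejected)
    print(dpo_loss(pc, pr, rc, rr).item())
```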
The next generation of RAG! Check out this awesome work by the @ContextualAI team
Today, we’re excited to announce RAG 2.0, our end-to-end system for developing production-grade AI. Using RAG 2.0, we’ve created Contextual Language Models (CLMs), which achieve state-of-the-art performance on a variety of industry benchmarks. CLMs outperform strong RAG baselines built using GPT-4 and top open-source models like Mixtral, according to our research and customers. Read more in our blog post:
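For contrast, here is a rough sketch of the kind of conventional RAG baseline the announcement compares against: an off-the-shelf embedder for retrieval stitched to a separate, frozen generator, with no joint optimization between the two. The embedding model choice and the `llm_generate` helper are illustrative assumptions, not anything from the RAG 2.0 system itself.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # off-the-shelf, frozen retriever


def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Rank documents by cosine similarity to the query embedding."""
    doc_emb = embedder.encode(docs, normalize_embeddings=True)
    q_emb = embedder.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(doc_emb @ q_emb)[::-1][:k]
    return [docs[i] for i in top]


def frozen_rag_answer(query: str, docs: list[str], llm_generate) -> str:
    """Stuff retrieved context into a prompt for a separate, frozen generator.

    `llm_generate` is a caller-supplied placeholder, e.g. a GPT-4 API call.
    """
    context = "\n\n".join(retrieve(query, docs, k=3))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm_generate(prompt)
```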
RT @johnrso_: What is the best way to learn behaviors from videos? By modeling point trajectories, our method, ATM, helps robots learn even…
RT @ContextualAI: Correct licensing and attribution is critical when building LLMs for enterprise customers. Here at Contextual we care a l…
Big shoutout to @kawin, @Diyi_Yang, and @douwekiela for advising this work! Check out the paper here: (11/11) @stanfordnlp @StanfordAILab