Braintrust Profile Banner
Braintrust Profile
Braintrust

@braintrustdata

Followers
1,702
Following
51
Media
36
Statuses
143

Braintrust is the enterprise-grade stack for building AI products.

Joined August 2023
Don't wanna be here? Send us removal request.
Explore trending content on Musk Viewer
Pinned Tweet
@braintrustdata
Braintrust
1 year
Take off 🚀
@ankrgyl
Ankur Goyal
1 year
I'm excited to announce a new product I've been working on called @braintrustdata . Braintrust helps innovative companies and developers ship higher quality AI products by making it easy to run evals. 🔉 on
39
54
420
4
5
47
@braintrustdata
Braintrust
11 months
OpenAI announced: Reproducible outputs 💥 This is game changer for developers! It's now possible to actually evaluate and unit test your LLM apps! You can know that when a test passes locally, it'll pass for your team and in CICD. 1/3🧵
Tweet media one
1
3
21
@braintrustdata
Braintrust
11 months
We made it onto Replit! @Replit
Tweet media one
0
2
21
@braintrustdata
Braintrust
11 months
Our playground now supports all the new OpenAI models and some fun OS models 😎 thanks to @perplexity_ai
3
1
18
@braintrustdata
Braintrust
11 months
Braintrust handles the boring work. Spend your time having fun building AI apps with your team 🥳
Tweet media one
0
7
19
@braintrustdata
Braintrust
4 months
Over the past year, it's been an absolute joy getting to know @mikeknoop and build Braintrust with our friends at @zapier . We worked together on a blog post that captures their workflow. If you want to build a world-class AI product, this is for you!
Tweet media one
1
5
19
@braintrustdata
Braintrust
11 months
We made it to @Huggingface !
Tweet media one
0
3
18
@braintrustdata
Braintrust
11 months
We evaluated Google's text-bison LLM against OpenAI's gpt-3.5-turbo on a SQL generation task in Braintrust. Here's how they performed: - finetuned-gpt3.5: 92.4% - finetuned-bison: 84.2% - gpt3.5: 78.7% - bison: 74.8% (We finetuned both models too!) Dig into the evals below:
2
2
14
@braintrustdata
Braintrust
11 months
. @Retool surveyed how companies are adopting AI. Some of the top challenges: model output accuracy, hallucinations, and prompt engineering. Braintrust help's you solve these challenges: run evaluations, visualize and inspect your results, and experiment with prompts quickly.
Tweet media one
4
3
11
@braintrustdata
Braintrust
11 months
Now, we can have much more fun evaluating our AI apps 🥳 Check out our docs on how to use Braintrust to evaluate your AI app. It's very easy to integrate Braintrust evals with your existing CICD workflow (see our docs below)
Tweet media one
0
0
8
@braintrustdata
Braintrust
10 months
We have some exciting news today 😀!
@ankrgyl
Ankur Goyal
10 months
Not too long ago, I announced a new company called @braintrustdata . Today I'm super excited to share our $5m seed round led by @saammotamedi at @GreylockVC .
Tweet media one
31
21
261
0
2
8
@braintrustdata
Braintrust
11 months
You can self host Braintrust within your own VPC! Learn how in our docs or just reach out to us 👋
Tweet media one
0
2
10
@braintrustdata
Braintrust
7 months
We are super excited to partner with Liucija, Senior Data Scientist on the AI team @Hostinger , as they work towards leveraging AI for use cases like customer support, website building, and more. If you'd like to learn how Braintrust helped Hostinger: - 3x the number of AI
2
1
10
@braintrustdata
Braintrust
7 months
Super fun to host an AIUX Demo Night last week at @eladgil 's office! We are super excited about the future of UIUX w/ AI and loved seeing what talented people are building. Thank you to everyone who came out and special shoutout to our demoers 🙂 If you’re interested in coming
Tweet media one
Tweet media two
1
0
10
@braintrustdata
Braintrust
10 months
LlamaIndex just released Llama Datasets so you can easily benchmark RAG pipelines. We contributed a help desk dataset with Coda so you can easily benchmark chat qa & support use cases. Check it out on Llamahub
Tweet media one
@ankrgyl
Ankur Goyal
10 months
Exciting to collaborate with @llama_index @braintrustdata @siuheihk on a great dataset for developing RAG apps!
1
0
3
0
1
8
@braintrustdata
Braintrust
11 months
The AI app development journey: 1. Start with a prototype and manually test 2. Get tired of manually testing 3. Evaluations enlightenment: add evaluations to your code ??? 4. App is in production. Users rave about your app Braintrust makes it easy to evaluate your AI code.
Tweet media one
1
4
7
@braintrustdata
Braintrust
11 months
🤩 New feature: text blocks in the playground! These blocks just return a constant or variable value without any LLM call. This makes it easy to: - debug your prompts - mock API responses and vectorDB calls
0
1
7
@braintrustdata
Braintrust
11 months
Don't get stuck manually inputting test cases into your LLM app after every prompt change. Braintrust makes it easy to automatically evaluate and test your LLM apps.
Tweet media one
0
1
7
@braintrustdata
Braintrust
11 months
Braintrust also easily integrates with Pytest, Jest, etc. What other testing libraries do you like to use?
Tweet media one
0
8
6
@braintrustdata
Braintrust
1 year
We are hiring for roles including: - Software Engineer — Data Visualization - Software Engineer — Systems - Chief of Staff Learn more here:
1
2
6
@braintrustdata
Braintrust
11 months
🎉We are hiring!
Tweet media one
@ankrgyl
Ankur Goyal
11 months
We're hiring engineers :) Do you love: * building visualizations on text, images, and numbers that (re-)render in <100ms? * searching/grouping billions of rows of semistructured text-heavy data in <200ms? * grinding away LLM latency by any means necessary? If so, LMK
11
17
145
0
5
6
@braintrustdata
Braintrust
11 months
👎 Before: - your app generates different outputs every test - if you use LLMs to grade outputs, those grades would also be random every test 👍 Now w/ reproducible outputs: - your app generates consistent outputs even if temperature !=0 - your model graded evals are consistent
1
0
6
@braintrustdata
Braintrust
10 months
Which LLM is the best at summarizing GitHub issues? We informally tested to find: GPT4>Mistral7b>Claude2.1>GPT3.5 It's easy to run evaluations with Braintrust using our eval libraries and AI proxy. Check out the code below:
Tweet media one
1
0
6
@braintrustdata
Braintrust
11 months
⏰ We added duration stats to experiments! See which test cases were faster or took longer. There's a tradeoff between speed <> quality. Use Braintrust to help you find the optimal balance 😇.
Tweet media one
1
8
6
@braintrustdata
Braintrust
11 months
The modern AI app development workflow
Tweet media one
0
1
6
@braintrustdata
Braintrust
11 months
Spend your time building the fun parts of AI apps w/ Braintrust :)
Tweet media one
0
1
6
@braintrustdata
Braintrust
5 months
🚨 Braintrust shoutout @ 24:45 - thank you @eladgil :)
@TurnerNovak
Turner Novak 🍌🧢
5 months
🎧🍌New @ThePeelPod with @EladGil Stream the full episode here on X or links below Timestamps: 03:46 Building cool monuments 09:12 Fixing education 16:38 Why AI is underhyped 19:02 Four trends to watch in AI 19:55 Why there aren’t large biotech companies 23:21 The current state
8
7
48
0
0
6
@braintrustdata
Braintrust
8 months
The LLM App Stack by a16z. Validation is the most crucial step in building reliable and quality AI apps. Braintrust helps you integrate evals to rapidly ship reliable AI.
Tweet media one
1
0
5
@braintrustdata
Braintrust
11 months
Our prompt playground supports OpenAI function calling and tools now! Try it out on Braintrust.
0
0
4
@braintrustdata
Braintrust
11 months
😍 It's now so easy to use variables in our Playground. We got tired of editing raw JSON so we upgraded our UI to support variable/object inputs better.
0
7
5
@braintrustdata
Braintrust
11 months
Simplify your evaluation scripts with Braintrust. Just define 3 functions: data, task, and scores. We do all the tedious optimizations like parallelizing requests for you.
Tweet media one
0
9
5
@braintrustdata
Braintrust
11 months
@jerryjliu0 @llama_index @FastAPI Need an evaluations script for your AI app? We just opened a PR adding in Braintrust for create-llama to test and evaluate the LLM calls in the templates.
1
0
5
@braintrustdata
Braintrust
10 months
👋 Braintrust makes it easy to eval your AI app
@gdb
Greg Brockman
10 months
evals are surprisingly often all you need
67
82
1K
0
0
4
@braintrustdata
Braintrust
2 months
We are excited to announce Braintrust is now SOC 2 Type II certified! We have supported enterprise customers from day 1, and achieving SOC 2 compliance is further validation of how seriously our team takes governance, risk, and compliance.
0
1
4
@braintrustdata
Braintrust
2 months
We are very excited Braintrust was featured in the inaugural Future 50! We are thankful for the recognition and can’t wait to continue supporting amazing AI teams.
@mariogabriele
Mario Gabriele 🦊💭
2 months
I’m so excited to sareh the Future 50, a database of extraordinary, high-potential startups. A few companies you'll learn about: 🚚 A trucking company doing $45M ARR 🧬 A biotech building "AWS for biology" 🇯🇵 Japan's answer to OpenAI 📈 A payments company that grew 20x in 18
Tweet media one
3
19
103
0
1
4
@braintrustdata
Braintrust
11 months
@goodside This is how we are thinking about deterministic outputs: it's a game changer for developers
@braintrustdata
Braintrust
11 months
OpenAI announced: Reproducible outputs 💥 This is game changer for developers! It's now possible to actually evaluate and unit test your LLM apps! You can know that when a test passes locally, it'll pass for your team and in CICD. 1/3🧵
Tweet media one
1
3
21
0
0
4
@braintrustdata
Braintrust
11 months
Don't have an eval set already? Tired of writing scoring functions? Our `autoevals` library makes it easy to grade your LLM outputs. It includes prebuilt scoring functions: • Model-based (using LLMs) • Heuristic (e.g. Levenshtein distance) • Statistical (e.g. BLEU)
Tweet media one
0
1
4
@braintrustdata
Braintrust
11 months
We have 5 tutorials on how to evaluate AI apps in our docs so far. What other eval examples do you want us to make?
Tweet media one
0
0
4
@braintrustdata
Braintrust
8 months
The Modern AI Stack by Menlo Ventures. "Customers expect and deserve high-quality outputs, and enterprises are smart to be concerned that hallucinations could cause customers to lose trust." Braintrust helps you integrate evals to rapidly ship AI without guesswork.
Tweet media one
1
0
4
@braintrustdata
Braintrust
4 months
New cookbook on how to use the fantastic @ragas_io framework in Braintrust! Among other things, the Braintrust implementation: * Available in both TS and Python * Uses function calling (which substantially boosts performance) * Is fully debuggable
Tweet media one
0
0
4
@braintrustdata
Braintrust
1 year
0
1
3
@braintrustdata
Braintrust
11 months
@braintrustdata
Braintrust
11 months
. @Retool surveyed how companies are adopting AI. Some of the top challenges: model output accuracy, hallucinations, and prompt engineering. Braintrust help's you solve these challenges: run evaluations, visualize and inspect your results, and experiment with prompts quickly.
Tweet media one
4
3
11
0
0
3
@braintrustdata
Braintrust
1 year
🥳New feature: customize your experiment dashboards! Choose only the charts you want to see for your experiment
0
0
3
@braintrustdata
Braintrust
11 months
😎 We just made our experiment sidebar resizable. Now, you can quickly view what you need without having to change pages all the time.
0
0
3
@braintrustdata
Braintrust
11 months
@BorisMPower Braintrust offers a good evaluation framework for LLM apps 🥳
0
0
3
@braintrustdata
Braintrust
11 months
It's so easy to manage test sets and datasets with Braintrust. We made a web UI for editing evals with your team so you don't need to make your own with Google Sheets/Retool. Our TS/Python library also...
Tweet media one
1
0
3
@braintrustdata
Braintrust
11 months
No need to spend all your time testing changes manually after every prompt and pipeline change :)!
Tweet media one
0
0
2
@braintrustdata
Braintrust
6 months
0
0
1
@braintrustdata
Braintrust
11 months
@mathemagic1an @codegen @ThriveCapital 🔥 we just signed up on the waitlist. we'd love to evaluate how it works on our codebase
1
0
2
@braintrustdata
Braintrust
11 months
@DescriptApp These are awesome! We love using Descript and had a lot of fun making our launch video using it!
0
0
2
@braintrustdata
Braintrust
11 months
Braintrust makes it fun to collaborate with your team on AI app development.
0
0
1
@braintrustdata
Braintrust
4 months
0
0
0
@braintrustdata
Braintrust
11 months
And it's easy to read from a dataset from your evaluation script or backend services.
Tweet media one
1
0
1
@braintrustdata
Braintrust
11 months
Braintrust can be run on-prem. Keep your customer data secure when you evaluate and log data for your AI app w/ Braintrust.
@ankrgyl
Ankur Goyal
11 months
1
0
3
0
0
1
@braintrustdata
Braintrust
11 months
finetuned-gpt-3.5: gpt-3.5: finetuned-bison: bison:
1
0
1
@braintrustdata
Braintrust
1 year
@StianWalgermo Of course! Feel free to book here or sign up to try out the product.
0
0
1