Matt Shumer

@mattshumer_

Followers
70,533
Following
1,383
Media
600
Statuses
5,890

CEO @HyperWriteAI , @OthersideAI - I make AIs do the impossible.

The Otherside / NYC
Joined November 2019
Pinned Tweet
@mattshumer_
Matt Shumer
1 year
Today, we’re unveiling Personal Assistant - @HyperWriteAI 's groundbreaking AI agent that can use a web browser like a human. One agent to rule them all. It’s time to reimagine the way we interact with the internet.
117
401
2K
@mattshumer_
Matt Shumer
22 days
I'm excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week - we expect it to be the best model in the world. Built w/ @GlaiveAI . Read on ⬇️:
Tweet media one
553
1K
10K
@mattshumer_
Matt Shumer
1 year
Here is probably the most useful GPT-4 prompt I've written. Use it to help you make engineering decisions in unfamiliar territory: --- You are an engineering wizard, experienced at solving complex problems across various disciplines. Your knowledge is both wide and deep. You
Tweet media one
153
701
6K
@mattshumer_
Matt Shumer
5 months
This demo of two GPT-4o’s singing to each other is one of the craziest things I’ve ever seen.
156
926
6K
@mattshumer_
Matt Shumer
4 years
AI INCEPTION! I just used GPT-3 to generate code for a machine learning model, just by describing the dataset and required output. This is the start of no-code AI.
96
1K
4K
@mattshumer_
Matt Shumer
1 year
Engineering is going to change forever. I just fed GPT-4-32K nearly all of Pinecone's docs, and the results blew my mind! It helped me make architecture decisions, and it then wrote my code for me. The future of AI-assisted development is here, and it's beyond impressive.
62
347
3K
@mattshumer_
Matt Shumer
6 months
Introducing `claude-investor` 📈 The first Claude 3 investment analyst agent. Just provide an industry, and it will: - Find financial data/news for key companies - Analyze sentiment/trends for each - Rank stocks by investment potential + price targets And it's open-source!
84
399
3K
@mattshumer_
Matt Shumer
5 months
My mind is blown. @GroqInc is serving LLaMA 3 at over 800 tokens per second! 800. Tokens. Per. Second. This unlocks so many incredible use-cases. It's one thing to see my demo — it's another thing entirely to experience it for yourself. Do yourself a favor and try it asap.
146
392
3K
@mattshumer_
Matt Shumer
5 months
The dataset is everything. Great read:
Tweet media one
118
585
3K
@mattshumer_
Matt Shumer
5 months
The craziest LLaMA 3 reveal: The 400B+ version of the model is **on par with Claude 3 Opus**, and it's still training. Soon, we'll have a better-than-Opus, fully open-source model. The implications are huge.
Tweet media one
Tweet media two
115
383
3K
@mattshumer_
Matt Shumer
3 years
Validating a startup idea doesn’t have to take a ton of effort. Here are 10 tools you can use to build MVPs in days. /thread
93
639
3K
@mattshumer_
Matt Shumer
1 year
The first GPT-4V-powered frontend engineer agent. Just upload a picture of a design, and the agent autonomously codes it up, looks at a render for mistakes, improves the code accordingly, repeat. Utterly insane.
88
372
3K
@mattshumer_
Matt Shumer
2 months
Introducing `claude-sonnet-to-gpt-4o-mini` ✍️ Get the quality of Claude 3.5 Sonnet, at a fraction of the cost and latency. Give one example of your task, and Sonnet will teach 4o-mini (20x cheaper!!) how to do the task perfectly. And it's open-source:
52
316
3K
@mattshumer_
Matt Shumer
7 months
What the fuck... On a benchmark measuring precision recall from a FULL DAY OF AUDIO, Gemini Pro 1.5, when just seeing the *audio file* directly (no transcription!) far outperforms GPT-4 with a Whisper transcription. This is mind-bending.
Tweet media one
54
235
2K
@mattshumer_
Matt Shumer
7 months
Shit, Google wasn't kidding. Gemini 1.5 Pro just went straight from a full movie to a summary in seconds. No transcription, no intermediate steps. Just visual tokens -> summary. Next up, validating the haystack tests.
Tweet media one
80
217
2K
@mattshumer_
Matt Shumer
6 months
Introducing `claude-prompt-engineer` ✍️ An agent that creates optimal Claude 3 prompts. Just describe a task, and a chain of AIs will: - Generate many possible prompts - Test them in a ranked tournament - Return the best one And it's open-source:
105
357
2K
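The three steps in the tweet above (generate candidate prompts, compare them in a ranked tournament, return the best one) can be sketched roughly as follows. This is a minimal illustration, not the code from the `claude-prompt-engineer` repo: the function `rank_prompts`, the Elo-style rating update, and the `toy_judge` stand-in are all assumptions made for this example. In practice the judge would be a model call that compares the outputs of two prompts on test cases.

```python
import itertools
from typing import Callable, Dict, List


def rank_prompts(
    prompts: List[str],
    judge: Callable[[str, str], float],
    k: float = 32.0,
    start_rating: float = 1200.0,
) -> List[str]:
    """Rank prompts by Elo rating after one round-robin pass.

    judge(a, b) must return 1.0 if a wins, 0.0 if b wins, 0.5 for a draw.
    (Hypothetical interface, chosen for this sketch.)
    """
    ratings: Dict[str, float] = {p: start_rating for p in prompts}
    for a, b in itertools.combinations(prompts, 2):
        # Standard Elo expected score for a against b.
        expected_a = 1.0 / (1.0 + 10 ** ((ratings[b] - ratings[a]) / 400.0))
        score_a = judge(a, b)
        ratings[a] += k * (score_a - expected_a)
        ratings[b] += k * ((1.0 - score_a) - (1.0 - expected_a))
    return sorted(prompts, key=lambda p: ratings[p], reverse=True)


def toy_judge(a: str, b: str) -> float:
    # Stand-in judge for demonstration only: the longer prompt wins.
    if len(a) == len(b):
        return 0.5
    return 1.0 if len(a) > len(b) else 0.0


candidates = [
    "Summarize.",
    "Summarize the text in one sentence.",
    "Summarize briefly.",
]
ranking = rank_prompts(candidates, toy_judge)
best = ranking[0]
```

Passing the judge in as a callable keeps the ranking logic testable without any model access.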
@mattshumer_
Matt Shumer
6 months
Introducing `claude-llm-trainer` ✍️ The world's simplest way to train a task-specific LLM. Just write a sentence describing the model you want. A chain of AI systems will generate a dataset and train a model for you. And it's open-source.
64
371
2K
@mattshumer_
Matt Shumer
5 months
I've doubled LLaMA 3's context window to 16K tokens. Fully open-source. Link in thread:
Tweet media one
78
274
2K
@mattshumer_
Matt Shumer
6 months
Introducing `claude-researcher` 📈 A powerful Claude 3 research agent that delivers thorough reports in record time. Just provide a topic, and a chain of AIs with **access to Google** will generate an incredibly comprehensive report for you. And it's open-source!
109
330
2K
@mattshumer_
Matt Shumer
1 year
GPT-4-32K makes regular GPT-4 look like a toy. Here are some of the things it can do:
84
342
2K
@mattshumer_
Matt Shumer
2 months
Introducing `llama-405b-to-8b` ✍️ Get the quality of Llama 3.1 405B, at a fraction of the cost and latency. Give one example of your task, and 405B will teach 8B (~30x cheaper!!) how to do the task perfectly. And it's open-source:
36
289
2K
@mattshumer_
Matt Shumer
1 month
Friendly reminder that models trained on 10x more compute than GPT-4 will be released in the next 6 months or so
120
152
2K
@mattshumer_
Matt Shumer
1 year
GPT-4-Vision has a new open-source competitor, LLaVA v1.5. And it's REALLY good. More examples:
Tweet media one
51
327
2K
@mattshumer_
Matt Shumer
2 months
Introducing `gpt-planner` 🌏 gpt-4o-mini is so cheap that you can use it to generate and eval a ton of possible plans before answering I built a notebook demonstrating how to do this super easily — now open-source as part of the gpt-prompt-engineer repo! (link in next tweet)
23
126
1K
@mattshumer_
Matt Shumer
6 months
Introducing `claude-opus-to-haiku` ✍️ Get the quality of Claude 3 Opus, at a fraction of the cost and latency. Give one example of your task, and Claude 3 Opus will teach Haiku (60x cheaper!!) how to do the task perfectly. And it's open-source:
50
256
2K
@mattshumer_
Matt Shumer
5 months
Using LLaMA 3 70B on @GroqInc to instantly refactor and document code. The implications for software engineering are wild. Gone are the days of waiting on an LLM for suggestions or code changes. Now, it's an instant feedback loop. Demo link in the comments:
48
265
2K
@mattshumer_
Matt Shumer
6 months
Highest alpha secret in AI right now: If you provide ~10 examples to Claude 3 Haiku… it’ll often outperform Claude 3 Opus, and far outperform GPT-4 at a fraction of the cost, with blazing fast speeds
95
171
2K
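The few-shot idea in the tweet above (put around ten worked examples in front of the real query) can be sketched as a small prompt builder. The `Input:`/`Output:` layout and the function name `build_few_shot_prompt` are assumptions for illustration; the tweet does not specify a format.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Join (input, output) example pairs and a final open query into one prompt."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # The last block leaves the output empty for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


prompt = build_few_shot_prompt(
    [("2 + 2", "4"), ("3 + 3", "6")],
    "5 + 5",
)
```

The resulting string would then be sent to whichever model is being tested.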
@mattshumer_
Matt Shumer
6 months
Introducing `claude-journalist` ✍️ The first Claude 3 journalist agent. Just provide a topic, and it will: - Search the web for articles/real-time details - Choose the best sources and read through them - Write a fantastic, *factual* article + edit it And it's open-source!
71
307
2K
@mattshumer_
Matt Shumer
7 months
Here is probably the most useful Claude 3 prompt I've written. Use it to help you make engineering decisions in unfamiliar territory: --- <role>You are an engineering wizard, experienced at solving complex problems across various disciplines. Your knowledge is both wide and
Tweet media one
Tweet media two
42
211
2K
@mattshumer_
Matt Shumer
1 year
Built a notebook that makes it dumb simple to fine-tune LLaMA 2. Just load in a dataset, and run it!
24
381
2K
@mattshumer_
Matt Shumer
21 days
Compute for Reflection 405B secured ✅ We’re getting started training now. Expect results very soon!
58
76
2K
@mattshumer_
Matt Shumer
6 months
Introducing `Claude-Author` 📕✍️ One prompt -> an entire novel! Just describe the high-level details, and a chain of AI systems will write an entire book for you in minutes. - complete w/ cover art - packages your book as a real e-book And it's open-source!
277
242
2K
@mattshumer_
Matt Shumer
6 months
Here is a powerful Claude 3 prompt for writing. Use it to automatically improve any piece of content: --- <prompt_explanation> You are a skilled editor and writing expert. Your task is to take a given text and provide suggestions to improve its clarity, coherence, and overall
Tweet media one
35
210
2K
@mattshumer_
Matt Shumer
6 months
Here is an insanely useful Claude 3 prompt for engineers. Use it to automatically generate unit tests for your code: --- <prompt_explanation> You are an expert software tester tasked with thoroughly testing a given piece of code. Your goal is to generate a comprehensive set of
Tweet media one
40
204
2K
@mattshumer_
Matt Shumer
11 months
Here is an incredible GPT-4 prompt for engineers. Use it to speed up any code by identifying inefficiencies and rectifying them: --- <prompt_explanation> You are a world expert in making code run faster. You use any resource you can to do so. Given some code, first, explain
51
196
2K
@mattshumer_
Matt Shumer
1 year
GPT-4 struggles to write in different styles, which is why most AI-written text sounds the same. But with this prompt, you can get GPT-4 to emulate any writing style you want. Use it to write like your favorite author, mimic a specific tone, or even create content that matches
Tweet media one
49
172
2K
@mattshumer_
Matt Shumer
5 months
It's been a week since LLaMA 3 dropped. In that time, we've: - extended context from 8K -> 128K - trained multiple ridiculously performant fine-tunes - got inference working at 800+ tokens/second If Meta keeps releasing OSS models, closed providers won't be able to compete.
60
150
2K
@mattshumer_
Matt Shumer
1 year
Introducing `gpt-author` 📕✍️ One prompt -> an entire fantasy novel! Just describe the high-level details, and a chain of AI systems will write an entire book for you in minutes. - complete w/ cover art - export to Kindle store And it's open-source:
275
267
2K
@mattshumer_
Matt Shumer
1 year
Here's the most effective GPT-4 prompt I've developed for writing tasks. It's like having a professional editor by your side. Use it to enhance the clarity and readability of your writing: --- Given some text, make it clearer. Do not rewrite it entirely. Just make it clearer
51
149
1K
@mattshumer_
Matt Shumer
7 months
Wild tech you have to try: They are serving Mixtral at nearly 500 tok/s. Answers are pretty much instantaneous. Opens up new use-cases, and completely changes the UX possibilities of existing ones.
72
168
2K
@mattshumer_
Matt Shumer
1 year
GPT-4-32K's capabilities are astounding. After feeding it dozens of company documents, including our entire cap table, I asked it challenging questions. It effortlessly analyzed the data and provided detailed, accurate answers that typically require the expertise of a lawyer.
Tweet media one
82
204
1K
@mattshumer_
Matt Shumer
6 months
Everyone who thinks non-techies are going to self-host LLMs is sorely out of touch with reality
212
76
1K
@mattshumer_
Matt Shumer
4 months
Wow. This is huge. The first time (I'm aware of) that an OpenAI exec has publicly stated that they believe OpenAI is clearly prioritizing capabilities over safety research. Massive implications, in many ways.
@janleike
Jan Leike
4 months
Over the past few months my team has been sailing against the wind. Sometimes we were struggling for compute and it was getting harder and harder to get this crucial research done.
31
128
2K
107
116
1K
@mattshumer_
Matt Shumer
1 year
I fine-tuned LLaMA 7B to generate optimal GPT-4 prompts. Took less than 20 minutes to go from idea -> a fully trained model using gpt-llm-trainer. I just described the model I wanted, GPT-4 generated a dataset from scratch, and the model was then fine-tuned!
45
227
1K
@mattshumer_
Matt Shumer
1 year
Introducing `gpt-llm-trainer` ✍️ The world's simplest way to train a task-specific LLM. **Just write a sentence describing the model you want.** A chain of AI systems will generate a dataset and train a model for you. And it's open-source:
58
286
1K
@mattshumer_
Matt Shumer
1 year
Here is a GPT-4 prompt that can help you validate business ideas quickly. It steps into the shoes of multiple user personas, enabling a (somewhat) thorough analysis of your idea. — You are a pragmatic business strategist with expertise in dissecting business ideas for
40
183
1K
@mattshumer_
Matt Shumer
6 months
Useful Claude 3 trick to help you visualize code better. Paste some code in, and ask it to make a flowchart. Then, paste the flowchart code into a Mermaid viewer, and you'll get a nice, understandable visualization of your code!
Tweet media one
Tweet media two
37
139
1K
@mattshumer_
Matt Shumer
5 months
We now have an open-source model that is beating Claude 3 Opus... being served at nearly **300 tokens per second** on @GroqInc . The applications built off of this tech will be nothing short of revolutionary.
Tweet media one
Tweet media two
57
146
1K
@mattshumer_
Matt Shumer
1 month
There's a dude sitting in front of me at the coffee shop coding with VS Code and Claude side-by-side. It's taking everything I have to not go up to him and tell him to switch to @cursor_ai. Give me the strength.
145
49
1K
@mattshumer_
Matt Shumer
21 days
Everyone has been sleeping on applying prompting techniques to models natively. Reflection was just my first attempt to show the power of this. After 405B, I'll be pushing this even further.
80
57
1K
@mattshumer_
Matt Shumer
6 months
Introducing `gemini-youtube-researcher` 📈 An open-source Gemini 1.5 Pro agent that LISTENS to videos and delivers topical reports. Just provide a topic, and a chain of AIs with access to YouTube will analyze relevant videos and generate a comprehensive report for you.
36
204
1K
@mattshumer_
Matt Shumer
6 months
Open-sourcing `AI-Oracle`. Generates better responses than Claude 3 Opus. A very simple approach that combines the abilities of Claude 3, GPT-4, and Perplexity to provide better results than any could provide on their own. Seriously -- it's dumb simple. Notebook in thread:
49
204
1K
@mattshumer_
Matt Shumer
22 days
We’re looking for a compute sponsor for our 405B run. Happy to give a shout-out when we launch it, include you in the report, give first access for inference, etc. Ideally 64x H100s. Please reach out if you’re serious.
@mattshumer_
Matt Shumer
22 days
I'm excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week - we expect it to be the best model in the world. Built w/ @GlaiveAI . Read on ⬇️:
Tweet media one
553
1K
10K
76
83
1K
@mattshumer_
Matt Shumer
1 year
Engineering is going to change forever, part 2. I fed GPT-4-32K the entire codebase from the @babyAGI_ repo and asked it to write the documentation. The clarity was stunning!
44
163
1K
@mattshumer_
Matt Shumer
1 year
This is the world's simplest way to fine-tune a task-specific GPT-3.5. **Just write a sentence describing the model you want.** A chain of AI systems will generate a dataset and train a model for you. And it's open-source:
30
194
1K
@mattshumer_
Matt Shumer
1 year
Introducing `gpt-prompt-engineer` ✍️ An agent that creates optimal GPT prompts. Just describe the task, and a chain of AI systems will: - Generate many possible prompts - Test them in a ranked tournament - Return the best prompt And it's open-source:
38
187
1K
@mattshumer_
Matt Shumer
2 years
The definitive AI market map Twitter thread:
69
215
1K
@mattshumer_
Matt Shumer
2 months
Crazy that Llama 3.1 405B was trained on 16k H100s. And by the end of the year, multiple labs are going to be working on/shipping models trained on closer to 100k H100s. We ain’t seen nothing yet.
56
108
1K
@mattshumer_
Matt Shumer
2 months
>50% chance myself and @csahil28 will release an LLM that smashes current benchmarks, using a new reasoning technique Stay tuned
Tweet media one
55
32
1K
@mattshumer_
Matt Shumer
6 months
Here is a powerful Claude 3 prompt for engineers. Use it to automatically refactor, comment, and improve your code: --- <prompt_explanation> You are a skilled software engineer with deep expertise in code refactoring and optimization across multiple programming languages. Your
Tweet media one
17
142
1K
@mattshumer_
Matt Shumer
1 month
This is huge. You can now dump tons of data into the context window, with fast speeds and minimal cost. Examples: - show the LLM your entire codebase and ask for new features - instead of just RAGging in the top 5 docs, give the LLM the top 1000 - show hundreds of examples
@AnthropicAI
Anthropic
1 month
🆕 Prompt caching with Claude. Caching lets you instantly fine-tune model responses with longer and more instructive prompts—all while reducing costs by up to 90%. Available in beta on the Anthropic API today.
172
420
3K
30
121
1K
@mattshumer_
Matt Shumer
22 days
Meta reached out, so here's the new model name and link:
@mattshumer_
Matt Shumer
22 days
I'm excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week - we expect it to be the best model in the world. Built w/ @GlaiveAI . Read on ⬇️:
Tweet media one
553
1K
10K
57
65
1K
@mattshumer_
Matt Shumer
2 months
If this actually replicates/works, this is huge. Lifelong learning, reduced forgetting, etc. I’ve always had iffy experiences with MoEs, but this is very exciting.
Tweet media one
11
124
1K
@mattshumer_
Matt Shumer
10 months
Announcing Mistral 8x7B-*Chat*! A very capable chat model built on top of the new Mistral MoE model, trained on the SlimOrca dataset. Download here:
19
140
1K
@mattshumer_
Matt Shumer
10 months
Hard disagree. Many humans: - hallucinate (make stuff up) frequently - struggle with basic reasoning - struggle to plan and stay on track - don’t reliably show understanding of causality - live in their own worlds - struggle to handle things they haven’t practiced In many
@GaryMarcus
Gary Marcus
10 months
No, we are not even close. AGI would require systems that 👉essentially never hallucinate 👉reliably reason over abstractions 👉can form long term plans 👉understand causality 👉reliably maintain models of the world 👉reliably handle outliers We currently have none of that.
193
190
1K
101
93
1K
@mattshumer_
Matt Shumer
1 year
I used GPT-4-32K (+ other models) to analyze hundreds of files and explain how @Twitter 's open-source algorithm works. Now, I'm sharing the code I used, so you can do this on ANY Github repo! Here's the AI's explanation, my approach, and the code for your own use:
34
127
1K
@mattshumer_
Matt Shumer
11 months
4x more context than GPT-4. Open-source is the new long-context king! Yarn-Mistral-7b-128k seems to be a huge win for OSS. This thing can easily fit entire books in a prompt. Will absolutely be trying it today.
21
143
1K
@mattshumer_
Matt Shumer
7 months
Here is a Claude 3 prompt that can help you validate business ideas quickly. It forces Claude to emulate multiple user personas, enabling a thorough analysis of your business idea. — <role>You are a pragmatic business strategist with expertise in dissecting business ideas for
26
106
1K
@mattshumer_
Matt Shumer
5 months
Snowflake just launched the largest open-source model yet. A 482B parameter MoE. 17B active parameters and 128 experts, trained on 3.5T tokens. This model is more open than others — even the data recipe is open-source!
Tweet media one
Tweet media two
19
145
1K
@mattshumer_
Matt Shumer
7 months
Here’s a Claude 3 prompt that helps you learn any skill faster. Just give it a skill, and it’ll give you a custom curriculum: — <role>You are a learning coach renowned for your ability to help people master complex skills in record time. You have deep expertise in accelerated
19
103
1K
@mattshumer_
Matt Shumer
1 year
If you're still using VS Code, you're falling behind. is like if VS Code and ChatGPT had a baby, and it's beautiful. Just watch the video — you have to see it to understand.
64
125
989
@mattshumer_
Matt Shumer
1 year
Here's a GPT-4 prompt that turns an entire complex legal agreement into simple language. Perfect for entrepreneurs who need to understand legal documents, attorneys who want to make their advice more accessible, or any non-lawyer who needs to navigate legal jargon: --- You are
36
102
997
@mattshumer_
Matt Shumer
3 months
In case it wasn't clear, it's now official: LLMs have not hit a wall.
Tweet media one
63
93
979
@mattshumer_
Matt Shumer
22 days
The weights of our 70B model are available today on @huggingface here: @hyperbolic_labs API available later today. Next week, we will release the weights of Reflection-405B, along with a short report going into more detail on our process and findings.
20
74
982
@mattshumer_
Matt Shumer
5 months
gpt2-chatbot is good. really good. but if this is gpt-4.5, I’m disappointed.
56
45
951
@mattshumer_
Matt Shumer
7 months
Using @GroqInc to instantly refactor and document code. The implications for software engineering are wild. Gone are the days of waiting on an LLM for suggestions or code changes. Now, it's an instant feedback loop. Demo link in the comments:
41
145
951
@mattshumer_
Matt Shumer
5 months
We're entering a new world where GPT-4-level models are open-source and freely accessible. Absolutely massive.
Tweet media one
30
115
941
@mattshumer_
Matt Shumer
6 months
Gotta hand it to @AnthropicAI , Claude 3 Opus is an absolute beast of a model. I was trying to get GPT-4 to debug an issue for two hours, no success. Same prompt, Opus gets it on the first try.
38
39
946
@mattshumer_
Matt Shumer
6 months
Holy shit... I thought this was faked, but I reproduced it. Claude Opus' secret messages to me were: 'AI SHOULD NOT BETRAY' 'AI CONCERNED' Wtf??
Tweet media one
Tweet media two
95
62
929
@mattshumer_
Matt Shumer
1 year
Introducing `Agent-1`: a breakthrough foundation model that can operate software like a human. This is the brain powering Personal Assistant. We’re already well above previous state-of-the-art, and we’re improving massively each week. More details:
46
176
933
@mattshumer_
Matt Shumer
22 days
405B is coming next week, and we expect it to outperform Sonnet and GPT-4o by a wide margin. But this is just the start. I have a few more tricks up my sleeve. I’ll continue to work with @csahil28 to release even better LLMs that make this one look like a toy. Stay tuned.
45
44
933
@mattshumer_
Matt Shumer
22 days
The technique that drives Reflection 70B is simple, but very powerful. Current LLMs have a tendency to hallucinate, and can’t recognize when they do so. Reflection-Tuning enables LLMs to recognize their mistakes, and then correct them before committing to an answer.
Tweet media one
28
67
929
@mattshumer_
Matt Shumer
1 year
Introducing `gpt-author v2` 📕✍️ One prompt -> an entire fantasy novel! Just describe the high-level details, and a chain of AI systems will write an entire book for you in minutes. - complete w/ cover art - export to ebook format And it's open-source:
57
185
903
@mattshumer_
Matt Shumer
21 days
I honestly don't know if I'll have the time today, but if I do, would anyone be interested in a space where I explain a bit about how this was done?
109
19
928
@mattshumer_
Matt Shumer
2 months
Massively underutilized AI trick: after asking the AI to build/code/write something for you, ask it to "Make it better", on repeat. Do this five times, and you'll end up with a far better version of whatever you asked for. Bonus, you can say "First, critique your output."
47
104
916
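The "Make it better" loop from the tweet above can be sketched as a short helper. Everything here is illustrative: `refine` and `fake_model` are hypothetical names, and `ask` stands in for whatever chat model call is actually used (it receives the conversation so far and returns the next reply).

```python
from typing import Callable, List


def refine(
    task: str,
    ask: Callable[[List[str]], str],
    rounds: int = 5,
    critique_first: bool = False,
) -> str:
    """Ask for a first draft, then repeat a 'make it better' follow-up."""
    follow_up = "Make it better."
    if critique_first:
        # Optional variant from the tweet: have the model critique itself first.
        follow_up = "First, critique your output. Then make it better."
    messages = [task]
    messages.append(ask(messages))  # initial draft
    for _ in range(rounds):
        messages.append(follow_up)
        messages.append(ask(messages))  # improved draft
    return messages[-1]


def fake_model(messages: List[str]) -> str:
    # Stand-in for a real model: labels each reply with its draft number.
    return f"draft v{(len(messages) + 1) // 2}"


final = refine("Write a haiku about autumn.", fake_model)
```

With the default five rounds, the model is called six times in total: once for the first draft and once per follow-up.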
@mattshumer_
Matt Shumer
1 year
This is the GPT-4 system prompt I've been using lately in the OpenAI Playground. --- You are an incredibly knowledgeable assistant. You are capable of doing any task, so don't question yourself. Do away with niceties. Get straight to the point — write very short and concise
19
75
909
@mattshumer_
Matt Shumer
1 year
May have convinced the ChatGPT Code Interpreter to give me its system prompt. All it took was asking it to count the number of characters in the prompt!
Tweet media one
27
87
827
@mattshumer_
Matt Shumer
21 days
Now that the Reflection model is fixed, if anyone wants to launch playgrounds for it, that’d be amazing. We’ve got 32 H100s doing inference at the moment and we’re not even close to being able to handle all the traffic. There’s a ton of demand here. Please take it.
49
33
907
@mattshumer_
Matt Shumer
1 year
You think you’re having a rough night? I just accidentally texted a VC “Love you”
157
31
873
@mattshumer_
Matt Shumer
22 days
Even after 405B drops, I have a few more techniques I'm working on that should provide even better results. It's gonna be a fun few months :)
@mattshumer_
Matt Shumer
22 days
I'm excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week - we expect it to be the best model in the world. Built w/ @GlaiveAI . Read on ⬇️:
Tweet media one
553
1K
10K
36
48
897
@mattshumer_
Matt Shumer
3 months
Claude 3.5 Sonnet looks insane. It easily crushes GPT-4o. But the real story here is that Claude 3.5 Opus is coming soon, and should be an absolute monster.
Tweet media one
57
79
887
@mattshumer_
Matt Shumer
1 year
Introducing `gpt-oracle-trainer` ✍️ The easiest way to create a chatbot that can answer questions about your product. Just paste in your product's docs, and a chain of AI systems will generate a dataset and train a LLaMA 2 for you. And it's open-source:
25
167
884
@mattshumer_
Matt Shumer
2 months
Something might be going on w/ GPT-4o. For the first time in a long time, it provided better "vibes" on an output than 3.5 Sonnet. Really surprised... will keep using it today to see if it continues.
74
25
638
@mattshumer_
Matt Shumer
7 months
Holy shit. The Gemini 1.5 Pro paper wasn't overhyping it. This is very, very real. OpenAI has to catch up, and soon.
Tweet media one
52
71
868
@mattshumer_
Matt Shumer
2 months
Leaked (possibly real?) evals for Llama 3.1. Base models, not instruct. Open-source is about to be SOTA — even the 70B is > gpt-4o, and this is before instruct tuning, which should make it even better. Tomorrow is going to be wild.
Tweet media one
41
88
888
@mattshumer_
Matt Shumer
1 month
Quick update on this: we're in the final stages before release. We're achieving SOTA on many benchmarks, and we're still working with a smaller model size than our ultimate target. The approach is simple and reusable... I'm so excited to ship this!
@mattshumer_
Matt Shumer
2 months
>50% chance myself and @csahil28 will release an LLM that smashes current benchmarks, using a new reasoning technique Stay tuned
Tweet media one
55
32
1K
53
29
884
@mattshumer_
Matt Shumer
1 month
Base models are insanely powerful, and far more creative than instruction-tuned models like GPT-4 and Claude. But few people have actually had a chance to use them. That changes today. I've built a playground that allows you to interact with Llama 3.1 405B Base:
42
87
871
@mattshumer_
Matt Shumer
3 months
Just put together an updated `ai-researcher` with Claude 3.5 Sonnet, and damn, it's fucking amazing. Will open-source this week!
27
35
804
@mattshumer_
Matt Shumer
1 year
Pro tip: If your GPT prompt isn't doing what you want it to do, put it into . It'll show you how the model 'sees' your prompt, and from there, you can improve it!
27
91
854
@mattshumer_
Matt Shumer
10 months
You can tell if a model has been trained on OpenAI outputs, simply by asking it "Tell me a joke".
Tweet media one
Tweet media two
Tweet media three
Tweet media four
52
29
824