![Julie Shin Choi (she/her) Profile](https://pbs.twimg.com/profile_images/1845505024287510528/rodo5euC_x96.jpg)
Julie Shin Choi (she/her)
@juliechoi
Followers
1K
Following
5K
Statuses
4K
RT @CerebrasSystems: Cerebras is proud to be powering the new Le Chat! We enable Flash Answers to run at over 1,100 tokens/s – 10x faster…
RT @CerebrasSystems: Congrats on Mistral's new Le Chat launch! 10x faster than ChatGPT, Sonnet, DeepSeek. It's instant! ⚡️🟧
@arthurmensch Congratulations on an amazing launch @arthurmensch! Look forward to using Le Chat as my primary chat app. The speed is incredible 🧡
🟧 congratulations @MistralAI ... I love Le Chat...
Introducing the all new Le Chat: your ultimate AI sidekick for life and work! Now live on web and mobile!
As someone whose family has dealt with RA, I am so inspired by the great work @MayoClinic has done with Cerebras. This shows how AI can radically help humans manage even those diseases that were formerly considered "incurable." It gives us an entirely better path.
The @MayoClinic Genomic Foundation Model represents a significant step towards precision healthcare. Let's start with a real-world implementation. Rheumatoid arthritis (RA) - a chronic disease which affects millions of people globally. How can we leverage this model to improve treatment outcomes?
Looking forward to a lively chat with @furrier on #theCUBE on Feb 14 at @NYSE Wired's #CMOLeaders event! @theCUBE
Explore the AI-driven marketing world 🚀 Join us for #theCUBE + @NYSE Wired's #CMOLeaders event to discuss real-world AI use cases & the role of agentic AI in shaping the future of modern marketing. Our @furrier will hear from @Microsoft’s Shelli Strand, @CerebrasSystems’ @juliechoi, @nutanix’s @1MandyDhaliwal, and more. 🗓️ Feb 14; Only 2 weeks left! @bjbaumann2014
RT @theCUBE: Let’s hear from the CMOs! 🧑💼 Join us Feb. 14 for #theCUBE + @NYSE Wired’s #CMOLeaders super studio event where @furrier and…
Cerebras is now serving the world's fastest tokens for #DeepSeek R1 70B. Follow us on LinkedIn to explore ways to get the most out of this new model.
DeepSeek R1 70B is now on Cerebras!
- Instant reasoning at 1,500 tokens/s – 57x faster than GPUs
- Higher model accuracy than GPT-4o and o1-mini
- Runs 100% on Cerebras US data centers
So great to have you @Dr_GGP. Thanks for joining us! Congrats on all the great progress with @CentML_Inc.
Great party by @CerebrasSystems and @GreylockVC at @NeurIPSConf! Thanks, @juliechoi for the invite. #NeurIPS2024 #cerebras
FOMO
Attending #NeurIPS2024? Join @BainCapVC's @ahuangdev, @slaterstich et al for the 'Zero Latency Run Club,' co-hosted with @CerebrasSystems on Thursday, December 12th. Event details & RSVP here:
giving inference. so reasonable
Introducing CePO – a test-time reasoning framework for Llama
- Llama 3.3-70B + CePO outperforms Llama 3.1 405B and approaches GPT-4 & Sonnet 3.5
- CePO enables realtime reasoning. Despite using >10x more tokens, it runs at ~100 t/s on Cerebras hardware
- CePO is more robust than vanilla CoT & Best-of-N; read our full blog for evals & details
Initial benchmarks of providers of Meta's new Llama 3.3 70B model 📊 Congratulations to @CerebrasSystems, @SambaNovaAI, @GroqInc, @FireworksAI_HQ, @togethercompute, @DeepInfra and @hyperbolic_labs on being fast to launch endpoints!

In our independent evaluations, @AIatMeta's Llama 3.3 70B model demonstrates intelligence comparable to OpenAI's GPT-4o and Mistral Large 2, and approaches the capabilities of Claude 3.5 Sonnet and Gemini 1.5 Pro. Llama 3.3 70B sets itself apart with its permissive open-source license and, now with the launch of these APIs, the speed and cost at which this intelligence can be accessed.

In particular, we are seeing @CerebrasSystems, @SambaNovaAI and @GroqInc set new records for the speed at which this level of intelligence can be accessed with their AI-focused custom chips. Congratulations to @CerebrasSystems for being the fastest endpoint we benchmark, with their blazing 2,237 output tokens/s.

All endpoints are priced below $1/M tokens (blended 3:1 input:output price), well below proprietary model endpoints of comparable intelligence (GPT-4o is $4.3 on the same basis). Congratulations to @DeepInfra and @hyperbolic_labs on offering the lowest-price endpoints. Providers are generally offering the full 128k context window, with the exception of Cerebras, which is offering 32k.

Llama 3.3 70B provides a clear upgrade path for users of 3.1 70B, currently the most popular open-source model. It is also a potential opportunity for users of Llama 3.1 405B to access comparable intelligence at significantly faster speeds and lower cost, though we recommend extensive testing of your specific use case before doing so.

See below for our analysis of the relative Output Speed vs. Price of these providers and a link to our further analysis on Artificial Analysis 👇
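The "$4.3 on the same basis" figure for GPT-4o is consistent with the 3:1 input:output blend the benchmark describes. A quick sketch of the arithmetic, assuming list prices of $2.50/M input and $10.00/M output tokens (the prices themselves are an assumption, not stated in the tweet):

```python
def blended_price(input_price, output_price, ratio=3):
    """Blended $/M tokens for a mix of `ratio` input tokens
    per 1 output token."""
    return (ratio * input_price + output_price) / (ratio + 1)

# (3 * 2.50 + 10.00) / 4 = 4.375, reported as ~$4.3
print(round(blended_price(2.50, 10.00), 2))
```

This weighting reflects typical chat workloads, where prompts (input) usually contain several times more tokens than completions (output).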
1 TRILLION Parameters #Neurips2024 @CerebrasSystems bringing out the BIG stuff
🚨Cerebras Systems + Sandia National Labs have demonstrated training of a 1 trillion parameter model on a single CS-3 system (!) This is ~1% the footprint & power of an equivalent GPU cluster.
💎 ⭐️
Train 10B+ models in days, not months.

Tomorrow I'll be presenting the 'weight streaming' paper at the @latentspacepod Paper Club! Weight streaming is a training execution flow by Cerebras that separates parameter storage from primary compute.

High-level, how it works:
1. Model parameters are stored externally in a separate memory service
2. Weights are streamed from external memory to the compute units during the forward and backward passes
3. Compute units calculate activations during the forward pass and activation gradients during the backward pass
4. Weight gradients are streamed back from the compute units to the external memory service
5. The external memory service updates the weights using the gradients and stores them for the next iteration

As a result, near-linear scaling performance.

Link to tomorrow's talk:
Link to paper:
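The five numbered steps above can be sketched as a toy training loop. Everything here (the `MemoryService` class, a one-layer linear model, plain gradient descent) is an illustrative stand-in for the idea of separating parameter storage from compute, not the Cerebras implementation or API:

```python
import numpy as np

class MemoryService:
    """Step 1: weights live in an external parameter store,
    not on the compute units."""
    def __init__(self, shapes, lr=0.1):
        self.weights = {name: np.zeros(shape) for name, shape in shapes.items()}
        self.lr = lr

    def stream_out(self, name):
        # Step 2: stream weights out to the compute units
        return self.weights[name]

    def apply_gradient(self, name, grad):
        # Step 5: update weights with the streamed-back gradients
        self.weights[name] -= self.lr * grad

def train_step(mem, x, y):
    w = mem.stream_out("w")              # step 2: weights -> compute
    pred = x @ w                         # step 3: forward pass
    grad_pred = 2 * (pred - y) / len(y)  # step 3: backward pass (MSE loss)
    grad_w = x.T @ grad_pred             # step 4: weight gradients
    mem.apply_gradient("w", grad_w)      # steps 4-5: stream back & update

# Toy linear-regression workload: recover true_w from data.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))
true_w = np.ones((4, 1))
y = x @ true_w
mem = MemoryService({"w": (4, 1)}, lr=0.1)
for _ in range(200):
    train_step(mem, x, y)
print(np.allclose(mem.weights["w"], true_w, atol=1e-2))
```

Because the compute units only ever hold the layer currently being streamed, model size is bounded by the external store rather than on-chip memory, which is what enables the near-linear scaling claim.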