Neural Magic (Acquired by Red Hat)
@neuralmagic
Followers: 6K · Following: 1K · Statuses: 1K
We are on a mission to bring #opensource LLMs and vLLM to every enterprise on the planet. Join our bi-weekly vLLM office hours: https://t.co/fUVhNQ1dhs
Boston, MA
Joined May 2018
🚀 We're sharing monthly vLLM newsletters with updates, insights, and events for the community! 📖 Check out the January edition here:
Highlights this month:
📅 vLLM Office Hours: Distributed Inference (Jan 23)
📝 Blogs: Structured Decoding & XGrammar, 2024 Retrospective & 2025 Vision, Installing and Developing vLLM with Ease, and 2:4 Sparse Llama FP8
📍 Meetups: West Coast (Jan 22) & East Coast (Mar 11)
🌟 Red Hat’s acquisition of Neural Magic
Want it delivered to your inbox? 📩 Sign up at the bottom of the page!
RT @vllm_project: v0.7.2 is released! Featuring 🖼️ @Alibaba_Qwen Qwen2.5-VL, 🤗 @huggingface Transformers backend, and several @deepseek_ai…
RT @ishapuri101: [1/x] can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint @MIT_CSAIL / @…
Today, Thursday, at 2:00PM ET, @rogerw0108 will cover how the team drove enhanced support for multimodal LLMs with vLLM v1. Join us to learn and ask questions:
RT @kernelcdub: Here's how we're achieving R1-like reasoning with small models leveraging probabilistic inference-time scaling w/out using…
We’ll share a deep dive on the vLLM production stack during our bi-weekly vLLM office hours on March 6th. Register via the link in our bio.
How do you currently deploy open LLMs? With @vllm_project, with @kubernetesio? vLLM production-stack is a new open-source, batteries-included reference implementation from the vLLM project that extends vLLM to production use. 👀
TL;DR:
🔄 Simple cluster deployment with Helm charts, including @grafana Labs, Prometheus
📊 Provides real-time insights into system health with metrics like TTFT, TBT, and throughput in Grafana
🦙 Uses vLLM to easily deploy Llama, Qwen, Gemma, Mistral
🔌 Drop-in replacement for the @OpenAI API, with a router to support multiple models (see the client sketch below)
⚡️ Up to 3-10x lower response delay and 2-5x higher throughput compared to alternatives
📈 KV cache sharing powered by LMCache
🤗 Part of the vLLM project and open source
🔜 Prefix-aware routing automatically sends queries to nodes with relevant context
🔜 Autoscaling based on vLLM-specific metrics, e.g. throughput
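For illustration, a minimal Python sketch of the OpenAI-compatible drop-in. The router endpoint, port, and model name below are assumptions for the example, not values from the announcement; any model the stack serves would work.

```python
# Minimal sketch: querying a vLLM production-stack router through the
# OpenAI-compatible API. Endpoint, port, and model name are illustrative
# assumptions, not values from the announcement.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:30080/v1",  # hypothetical router address
    api_key="EMPTY",                       # vLLM-style servers typically ignore the key
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # any model the router serves
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the router speaks the OpenAI wire protocol, existing clients can switch over by changing only the base URL.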
Join @pillar_vc in celebrating Neural Magic's acquisition by @RedHat, with a fireside chat featuring founders Nir Shavit (@nir_shavit) and Alex Matveev, CEO Brian Stevens (@addvin), CSAIL Director Daniela Rus, and Pillar VC's Jamie Goldstein (@jamieagoldstein)! The founders will share their journey from @MIT_CSAIL in 2018 to developing groundbreaking AI technology. After the discussion, attendees can network with the MIT community over food and drinks. RSVP here to attend:
New blog: Discover how DeepSeek models achieve better performance and scalability with multi-head latent attention (MLA) and FP8 optimizations in @vllm_project. Quick summary:
📈 Enhanced Performance: DeepSeek models see up to 3x throughput and 10x memory capacity improvements with MLA and FP8 kernel optimizations in vLLM v0.7.1.
🧠 Scalable Long-Context Inference: Optimized memory boosts token capacity from 54,560 to 512,000, enabling horizontal scalability with pipeline parallelism.
🛠️ New Innovations: MLA’s "matrix absorption" algorithm and other optimizations reduce memory usage while improving efficiency for complex, high-batch workloads.
Read the full story:
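As a rough sketch of trying this yourself: offline inference with vLLM's Python API, where supported DeepSeek checkpoints pick up the MLA and FP8 kernel paths in v0.7.1+. The model choice and parallelism setting below are assumptions, not from the post.

```python
# Minimal sketch: offline DeepSeek inference with vLLM >= 0.7.1, which
# applies the MLA / FP8 kernel optimizations for supported checkpoints.
# Model choice and tensor_parallel_size are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3",  # assumed checkpoint; adjust to your setup
    tensor_parallel_size=8,           # DeepSeek-V3 is large; size to your GPUs
    trust_remote_code=True,
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain multi-head latent attention briefly."], params)
print(outputs[0].outputs[0].text)
```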
.@RedHat AI Innovation team just dropped a new research paper on inference-time scaling! 🚨 All built on @vllm_project. Paper and code here: Cheers to paper authors @variational_i, @xukai92, @GX_NLP, Shivchander Sudalairaj, and @ishapuri101!
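For flavor only, a simplified best-of-N sketch of inference-time scaling on top of vLLM. This is not the paper's probabilistic (particle-based) method; it just shows the general pattern of sampling several candidates and keeping the best-scoring one. The model name and scoring function are placeholders.

```python
# Simplified best-of-N sketch of inference-time scaling with vLLM.
# NOT the paper's particle-based probabilistic inference method; this only
# illustrates spending extra inference compute on multiple samples.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct")  # assumed small model
params = SamplingParams(temperature=0.8, n=8, max_tokens=256)  # 8 candidates

def score(text: str) -> float:
    # Placeholder verifier: a real setup would call a reward model here.
    return float(len(text.split()))

result = llm.generate(["Solve: what is 17 * 23?"], params)[0]
best = max(result.outputs, key=lambda o: score(o.text))
print(best.text)
```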
@NVIDIAAIDev @_philschmid we'd love your opinion on our recent findings, given that you posted a detailed take on a similar post from October 2024:
RT @_EldarKurtic: How well do quantized models handle long-context tasks? When we released the "Give Me BF16 or Give Me Death?" paper, the…
RT @mgoin_: Come learn how optimal multimodal inference is achieved in @vllm_project with an architecture deep-dive this Thursday! https://…
How does vLLM v1 enhance multimodal LLM support? Join our office hours with @rogerw0108 (Sr. ML Engineer @Roblox) to learn about architectural changes, caching improvements, benchmarks, + more! @mgoin_ will also share a v1 update! 📅 Feb 6 | 2PM ET 🔗
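As background for the session, a minimal sketch of multimodal inference with vLLM's offline API. The model name, prompt template, and image path below are illustrative assumptions, not details from the talk.

```python
# Minimal sketch: image + text inference with vLLM's offline API, in the
# spirit of the v1 multimodal support discussed in these office hours.
# Model name, prompt template, and image path are illustrative assumptions.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-VL-2B-Instruct")  # assumed vision-language model
image = Image.open("example.jpg")             # any local image

prompt = (
    "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
)
outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```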
RT @vllm_project: We landed the 1st batch of enhancements to the @deepseek_ai models, starting MLA and cutlass fp8 kernels. Compared to v0.…
Want the full breakdown? Check out the blog post for all the details:
Try the models on @huggingface:
Join our upcoming vLLM office hours to learn more:
🙏 @shubhrapandit, Alex Marques, @markurtz_ 🙏