Amogh Mishra
@MishraAmogh
Followers
353
Following
5K
Statuses
2K
Shipping AI agents @ Stealth | Columbia Uni Grad | ex-BNP Paribas, ex-founding engineer @ startup (acq Google) | Biking Enthusiast 🚴
New York, NY
Joined January 2014
@loriberenberg They aren't wrong. The reliability of the tools right now is like that of a toddler driving your car.
1
0
1
@nealkhosla They will be exceptionally smart but still fail at simple multi-instruction tasks that humans complete easily.
0
0
0
@bryan_johnson It's interesting how international validation often amplifies issues that Indian voices have been raising for years. When are we gonna listen to our voices within our communities more seriously?
2
1
52
@paraschopra Finally, we're comparing AI to modern warfare, not just a productivity enhancer.
0
0
0
I used this to reassure a founder friend not to worry about overcrowding. We're in this for the long haul, and what truly matters is effective distribution and user experience. Let others obsess over LLMs; they got their insights from X, not directly from their customers.
@MishraAmogh I think it really depends on the fund strategy. YC often backs founders in the same space, but they are backing 100s of founders at a time and betting on outliers. If you're a more concentrated fund, it doesn't make sense.
0
1
1
@shl Any designer you recommend following? I am excited about changing interfaces from User Experience (UX) to Agent Experience (AX).
0
0
7
vLLM is all you need
How do you currently deploy open LLMs? With @vllm_project, with @kubernetesio? vLLM production-stack is a new open-source, batteries-included reference implementation from the vLLM project that extends vLLM to production use. 👀

TL;DR:
🔄 Simple cluster deployment with Helm charts, including @grafana Labs, Prometheus
📊 Real-time insight into system health with metrics like TTFT, TBT, and throughput in Grafana
🦙 Uses vLLM to easily deploy Llama, Qwen, Gemma, Mistral
🔌 Drop-in replacement for the @OpenAI API, with a router to support multiple models
⚡️ Up to 3-10x lower response latency and 2-5x higher throughput compared to alternatives
📈 KV cache sharing powered by LMCache
🤗 Part of the vLLM project and open source
🔜 Prefix-aware routing automatically sends queries to nodes with relevant context
🔜 Autoscaling based on vLLM-specific metrics, e.g. throughput
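The serving metrics named in the tweet, TTFT (time to first token), TBT (time between tokens), and throughput, can be computed directly from token arrival timestamps. A minimal sketch of those definitions; the function name is illustrative, not part of vLLM's or Grafana's API:

```python
def compute_metrics(request_start, token_times):
    """Compute TTFT, mean TBT, and throughput from token arrival times.

    request_start: time the request was sent (seconds)
    token_times: sorted arrival time of each generated token (seconds)
    """
    # TTFT: delay between sending the request and the first token arriving
    ttft = token_times[0] - request_start
    # TBT: average gap between consecutive tokens after the first
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    tbt = sum(gaps) / len(gaps) if gaps else 0.0
    # Throughput: tokens generated per second over the whole request
    throughput = len(token_times) / (token_times[-1] - request_start)
    return ttft, tbt, throughput

# Example: request sent at t=0.0, first token at 0.5s, then one every 0.1s
ttft, tbt, tput = compute_metrics(0.0, [0.5, 0.6, 0.7, 0.8])
# ttft = 0.5 s, tbt ≈ 0.1 s, throughput = 4 tokens / 0.8 s = 5.0 tok/s
```

A high TTFT with a low TBT typically points at queuing or prefill cost rather than decode speed, which is why dashboards track the two separately.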
0
0
1