Salman Avestimehr Profile
Salman Avestimehr

@avestime

Followers
1K
Following
87
Statuses
80

Dean's Professor of ECE and CS at USC; Co-founder of ChainOpera and TensorOpera

Joined January 2010
@avestime
Salman Avestimehr
5 days
This is going to be a great session! @ChainOpera_AI X @eigenlayer = (co-ownership & co-creation) X (verifiability & trust) for decentralized AI agents. What do you think @sreeramkannan ?
10
7
55
@avestime
Salman Avestimehr
5 days
GitHub link:
0
0
0
@avestime
Salman Avestimehr
5 days
Automated uncertainty/correctness assessment of LLMs and VLMs is an extremely difficult problem! Over the past few years, numerous methods have been proposed to assess the truthfulness of LLM generations, each with unique strengths, weaknesses, and computational requirements. What does TruthTorchLM offer?
- 25+ truth methods (including our own MARS and LARS methods) designed to assess the truthfulness of LLM generations, including Google search checks, uncertainty estimation, and multi-LLM collaboration techniques.
- Seamless integration with Huggingface and LiteLLM: with just a single-line code change, you can obtain LLM generations along with their truthfulness scores.
- Built-in evaluation & calibration tools, making it easy for researchers to compare new truth-assessment methods against existing benchmarks.
- Support for long-form generation truthfulness assessment, an area with significant room for improvement in LLM research.
Who is TruthTorchLM for?
🧑‍🔬 Researchers looking to develop and test novel truth methods.
💻 Developers who want to integrate truth scores into their products.
TruthTorchLM is fully open-source, and we aim to expand it with more truth methods, new benchmarks, and research papers. Reach out if you would like to join this effort!
0
0
1
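One family of truth methods mentioned in the tweet above, sampling-based uncertainty estimation, can be illustrated with a small self-contained sketch. Note this is a generic self-consistency score, not TruthTorchLM's actual API; the function name and normalization below are my own illustration:

```python
from collections import Counter
import math

def consistency_truth_score(generations):
    """Score an answer's truthfulness by self-consistency: the more
    often independently sampled generations agree, the lower the
    entropy over distinct answers and the higher the confidence."""
    counts = Counter(g.strip().lower() for g in generations)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(total) if total > 1 else 1.0
    return 1.0 - entropy / max_entropy  # 1.0 means fully consistent

# Five hypothetical samples for "What is the capital of France?"
samples = ["Paris", "Paris", "paris", "Lyon", "Paris"]
score = consistency_truth_score(samples)  # high, but below 1.0
```

In a real pipeline the `generations` list would come from repeatedly sampling the LLM at nonzero temperature; a library like TruthTorchLM packages this kind of method, alongside many others, behind a uniform interface.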
@avestime
Salman Avestimehr
10 days
A paradigm shift driven by DeepSeek's success is the reliance on reinforcement learning (RL) over supervised fine-tuning for large language model (LLM) training. Response scoring plays a critical role in evaluating AI-generated outputs and leveraging them for subsequent training steps. Arguably, it is the most crucial component in automating the "AI training AI" pathway toward AGI, as well as trustworthy AI. Our sequence of works on MARS and now LARS (to appear in NAACL'25) provides the BEST response scoring to date for various families of LLMs (Llama, Gemma, Mistral, etc.)!
8
11
44
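The role of response scoring in RL-based LLM training can be made concrete with a toy rule-based scorer, in the spirit of the accuracy and format rewards popularized by DeepSeek-R1. This is a hypothetical illustration only; MARS and LARS are learned scoring models, not hand-written rules like the one below:

```python
import re

def score_response(response, reference_answer):
    """Toy rule-based response scorer for RL-style LLM training:
    reward well-formed output and correct final answers.
    (Illustrative; learned scorers replace these hard rules.)"""
    reward = 0.0
    # Format reward: did the model wrap its answer in the expected tag?
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match:
        reward += 0.2
        # Accuracy reward: does the extracted answer match the reference?
        if match.group(1).strip() == reference_answer.strip():
            reward += 1.0
    return reward

good = score_response("Reasoning... <answer>42</answer>", "42")  # -> 1.2
bad = score_response("The answer is 42", "42")                   # -> 0.0
```

A reward like this can then weight responses during policy updates; learned scorers generalize the idea to open-ended generations where no exact reference answer exists.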
@avestime
Salman Avestimehr
14 days
Great post by @hosseeb . Besides the end-users of AI and companies building AI applications for them, I would also add "AI developers" as big winners of the DeepSeek rollout. So far, the dominant foundation models (both open source and closed source) were led by large entities with tons of compute power to spare. Their approach to building better foundation models relied on "scaling compute", which makes reproducibility and innovation close to impossible for the AI developer community and academia (even if the model is open-sourced). With DeepSeek's approach, there are now many more research labs, startups, and developers who can participate in building foundation models!
@hosseeb
Haseeb >|<
15 days
DeepSeek's new R1 reasoning model is dragging down the NASDAQ. It dropped 6 days ago but it seems Wall Street is only now digesting what it means. I'm no equity analyst, but a few things I've been thinking about. DeepSeek is a huge deflationary shock to the price of intelligence. R1 is outcompeting OpenAI's O1 model for likely less than 1/20th the cost, and they are doing it with only 32B active parameters (GPT-4 likely used ~220B active parameters according to @SemiAnalysis_). They also fully open sourced all of their models, the distillations, and a comprehensive paper detailing how they did it. Intelligence is now way cheaper than we thought. This is great for all consumers of AI—meaning you and me. So why is the NASDAQ tanking? Remember, the NASDAQ is an index of producers, not consumers. The price of oil plummeting is bad news for oil companies, but it's great for those of us who drive. The fact that NVIDIA and all of the hyperscalers are so overrepresented in the NASDAQ these days means the stock market is structurally long the price of intelligence. So who benefits from this deflationary shock? I think there is one company in particular that is best positioned now. It's now been more than 2 years since the release of ChatGPT, and it's clear that no lab has that much of an edge. It only takes a few months for Google, OpenAI, Anthropic, and now DeepSeek to copy each other and trade spots on the leaderboard. This is partly because these companies all publish research (researchers want glory) and even for stuff that's unpublished, these organizations leak like sieves. Engineers want to know how things work. It's quite literally the most interesting question in the world: what is intelligence made of? Labs are just not able to hide this without military grade secrecy (and none of the best talent wants to work for the military). So we're stuck in this status quo. 
Everyone is trading places at the top of the leaderboard, nobody has a clear long-term edge, and DeepSeek and Meta are intent on open sourcing their models, which causes closed models to continually depreciate. Even with all this AI spend, there don't seem to be any durable moats. So who does have a structural moat here? Look at OpenAI. Sora is already behind the state of the art on video (Kling and Veo are racing ahead). Dall-E is OK but no longer best in class. They are now betting hard on Operator, which is their agentic model. Operator is supposed to be able to book flights, order food, do agentic stuff for you. But it has significant problems aside from the coherence of the model itself: If you are working directly with one of their partners like Instacart, Operator gets full access. But much of the open web appears to be blocking Operator, and that may get exacerbated if the web is crawling with Operator instances. You also have to keep handing control back and forth to log in and out of services, solve Captchas; it's all quite cumbersome and finicky. Take Google on the other hand. Gemini is quietly #1 on @lmarena_ai. They are #1 on image generation with Imagen. They are ahead on video with Veo. They aren't doing anything agentic yet (Google is usually the last mover on the sexier stuff), but once they do, they have a huge structural advantage. Google's webcrawler bots already have full license to touch everything on the web. They already have access to your Gmail and calendar; they can easily traverse the web and have cached most of it (DeepResearch shows how easy this is for Google), and they also have the crown jewel of untapped data: Youtube. And, of course, they are uniquely positioned to drive agents directly on Androids. Although Google is spending a ton on compute, and they are still a hyperscaler, Google is net short intelligence. They are a consumer of AI in order to serve their customers.
DeepSeek and this intelligence deflation is long term good for Google, as it means their own spend will go down. It's cool to hate on Google these days, but I think Google ends up being the long-term winner here if DeepSeek-R1 signals a secular trend. That said, don't count out OpenAI. They are still the strongest product company, and they've earned trust from consumers and enterprises for always being 3 months ahead of the rest of the market. They basically invented the entire test-time compute paradigm, and o3 is a real breakthrough which has yet to drop. If intelligence is the most valuable resource in the world, being 3 months ahead of the competition is enough to earn themselves a big premium, and huge enduring trust from their customers. So yes, the biggest loser here is NVIDIA. If China is a real player (and NVIDIA is not allowed to export to China), and DeepSeek is massively deflating the price of intelligence, and they were able to do all of this on nerfed H800 chips, then NVIDIA is in trouble. You want to be in the game of selling intelligence. NVIDIA is in the game of selling FLOPs. If the ratio of FLOPs to intelligence goes down, down goes NVIDIA stock. So it goes. And of course, we have to say it: congrats to the @deepseek_ai team for wiping out a trillion dollars of equity value from the NASDAQ. That's six OpenAIs in a single day, vaporized. Not bad. 👍
5
13
52
@avestime
Salman Avestimehr
17 days
DeepSeek-R1 is great news for decentralized AI! Unlike previous LLM training methods that rely on supervised fine-tuning with labeled datasets after pre-training, it uses primarily RL to achieve great reasoning. In particular, they incorporated an RL stage that combined various reward signals and diverse prompt distributions. In this phase, human feedback played a role in shaping the model's behavior, particularly in capturing complex scenarios. Decentralized AI provides a huge opportunity to scale this by bringing in many contributors to help build models that are built for them, and letting all of them benefit from their creation!
2
3
14
@avestime
Salman Avestimehr
19 days
We’re excited to launch the CO-AI (Collaborative AI) Alliance in partnership with amazing projects like @0G_labs, @ionet, @rendernetwork, @Axlflops, @scaling_x, @PhalaNetwork, @Gateway_xyz, @mindnetwork_xyz, and more to be revealed soon! Our mission: enable co-training and co-serving of AI agents collaboratively on decentralized community GPUs. This builds on our years of experience in decentralized/federated learning and platform design, as well as LLM pre-training, which scales model training to thousands of decentralized nodes. We previously released the Fox-1 model, an LLM trained on decentralized cloud infrastructure, ranking among the top 3 models in the 1-3B parameter range at the time. Now, the alliance is kicking off collaborative training for the next-generation Fox-2 model on a much larger decentralized infrastructure. Fox-2 will empower community-built AI agents and applications, driving innovation through collective effort. A model built by the community, for the community, shaping the future of decentralized AI agents! Stay tuned:
18
55
217
@avestime
Salman Avestimehr
26 days
The real "system-level" advantage of decentralization is SCALABILITY. Even though TCP/IP isn't the most optimized traffic-routing protocol, it scales efficiently enough to power the entire Internet. Similarly, decentralized AI can offer far more scalable solutions for model serving and training. The narrative shouldn't be "Can decentralized AI match centralized AI?" but rather, "We can do it much more scalably!" In our ScaleLLM paper (published at EMNLP '25), we demonstrated how decentralized AI achieves higher throughput than leading centralized model-serving endpoints like Together AI, DeepInfra, and Fireworks.
6
20
86
@avestime
Salman Avestimehr
1 month
Have been talking to AI agent builders in the Web3 space, and one of their biggest challenges is the lack of powerful API tooling on existing AI agent launchpads. To truly innovate, they need:
1) Core AI APIs: Not just LLMs; multimodal (image, video, speech) models are critical for creating richer, interactive experiences.
2) Data APIs: Real-time crypto price feeds, on-chain data, decentralized search, and aggregation tools are key for agents operating in Web3 ecosystems.
3) Action APIs: Seamless integrations with wallets (MetaMask, Coinbase), trading (DEXs, DeFi protocols), and communication (Discord, Telegram, X).
4) Domain-specific APIs: Finance (e.g., Plaid, on-chain analytics), geo (e.g., maps), and health (e.g., Epic).
Web3 AI agents need better launchpads and tools to thrive!
9
36
117
@avestime
Salman Avestimehr
1 month
Looking forward to discussing the future of AI agents and the ecosystem that is being built rapidly around them!
@ChainOpera_AI
ChainOpera AI
1 month
Our co-founder @avestime will participate in an AMA session talking about AI agents with @bitgetglobal @virtuals_io and many other great projects. Date: Jan 10th, 17:00 (UTC+4) / Jan 10th, 5am Pacific Time. X space:
1
1
6
@avestime
Salman Avestimehr
1 month
Jensen Huang’s insightful comment at CES 2025—"The IT department of every company is going to be the HR department of AI agents in the future"—perfectly aligns with our vision at @ChainOpera_AI. We're building an “AI Agent LinkedIn,” where individuals and organizations can "recruit" the best AI agents to accomplish their tasks efficiently. Through decentralization, "recruiters" can go beyond just hiring—co-owning and co-cultivating these AI agents while sharing in their economic benefits, rather than simply acting as paying users.
17
26
133
@avestime
Salman Avestimehr
1 month
can't believe how rapidly and fiercely #PalisadesFire is spreading :(
@LizKreutzNews
Liz Kreutz
1 month
The #PalisadesFire has jumped PCH. Here’s our drive going south next to Will Rogers Beach State Park. #PacificPalisades #Pacificpalisadesfire
0
0
5
@avestime
Salman Avestimehr
1 month
Nice coverage of @ChainOpera_AI by @mii4Web3 !
@mii4Web3
mii4
1 month
ChainOpera AI – The Future of AI and Web3 1/ ❇️ ChainOpera AI is changing the game by merging AI and Web3 into a unified ecosystem. If you’re into decentralization and cutting-edge AI, this is the project to watch. Let’s dive into why it matters! ⬇️
0
0
3
@avestime
Salman Avestimehr
1 month
How about turning this into a trading AI agent?
3
0
10