![Todd Mostak Profile](https://pbs.twimg.com/profile_images/965684057249013760/LVU3PvR4_x96.jpg)
Todd Mostak
@ToddMostak
Followers
2K
Following
6K
Statuses
1K
Founder and CEO of @heavy_ai (formerly MapD/OmniSci). "In the beginner’s mind there are many possibilities, but in the expert’s there are few."
San Francisco, CA
Joined March 2011
@zied_houidi Wonder if you could programmatically generate the scenarios but then have an LLM turn them into word problems to add some entropy to the prompts?
0
0
0
@karpathy Why not also use these games for RL? I.e. imagine LLMs progressively getting smarter at a game like Taboo (and any other game) to generally improve reasoning in the model.
0
0
0
@tomshardware The article mentions this, but the title is clickbait given recent events and people are drawing the wrong conclusion.
0
0
1
@IanCutress 100%. Given PTX is proprietary to Nvidia and as you said effectively NV GPU assembly, are people actually saying this? Adding this to the dumb takes list for this week.
0
0
11
I think most users don’t really have first-hand use cases for reasoning models, and that most queries to ChatGPT or DeepSeek could easily be handled by a non-reasoning model. Note that there are tons of application use cases for strong reasoning models (i.e. a travel agent that doesn’t bungle bookings), just that most users don’t need to use such models directly unless doing code/math/problems requiring sophisticated logic. That said, I think people love seeing the model think, and seeing the reasoning chains is UX that makes people trust and anthropomorphize the models more, but you don’t get this anyway with o1 except for the very abstract summaries of the model’s CoT.
1
0
7
@aliastasis @DhaniSriram @DrJimFan I believe that was the number of samples they generated from R1 for distillation of the smaller Llama and Qwen models via supervised fine tuning, not the number of samples used for RL of R1 itself. The paper only refers to the 8,000 steps (policy updates) they used in RL.
1
0
3
@aliastasis @DhaniSriram @DrJimFan I don't believe they have disclosed the cost of RL on top of V3 to generate R1. The $5.6M was just GPU costs to train the base V3 model.
3
0
5
RT @tszzl: you don’t need to elegantly solve intelligence on paper when deep networks are just begging to learn
0
13
0
RT @heavy_ai: We are excited to announce the availability of the GPU-accelerated analytics platform on the Nvidia G…
0
1
0
I use GPT-4o Advanced Voice Mode for exploring ideas or concepts on walks or while doing work around the house when it’s not convenient to type, and also as a conversational partner for language learning. For these use cases it’s golden, although I agree with others here that the turn detection needs work, plus I would love to see the output as it’s spoken.
0
0
1
I've been diving deeper into the @Foursquare Places dataset that was released two weeks ago, and am finding it great not only for accessing individual POIs, but also for performing large-scale analysis of geographic trends. Here I'm using @heavy_ai to join the Places dataset to the NASA Gridded Population of the World (GPW) dataset, containing population estimates per square km for the entire world, using Uber H3 hexagonal spatial indexes as join keys. This allows for easy calculation and visualization of the number of POIs by category per capita. See below for two visualizations comparing the number of places of worship with the number of bars per 1,000 people. You can easily see the Bible belt light up for the former category, but why the high prevalence of both places of worship and drinking establishments in the Upper Midwest?
1
1
11
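The per-capita join described above can be sketched in a few lines. This is a minimal illustration, not the actual HeavyAI SQL: it assumes POIs and population are both keyed by an H3 hex cell ID (here opaque strings; in practice you would compute them with an H3 library, e.g. `h3.latlng_to_cell(lat, lng, resolution)`), and all sample data below is hypothetical.

```python
from collections import Counter

# Hypothetical sample POIs: (h3_cell, category) pairs, as if derived
# from the Foursquare Places dataset indexed to H3 cells.
pois = [
    ("8928308280fffff", "place_of_worship"),
    ("8928308280fffff", "bar"),
    ("8928308280fffff", "bar"),
    ("8928308281fffff", "place_of_worship"),
]

# Hypothetical population per H3 cell, as if aggregated from a gridded
# population raster like NASA GPW.
population = {"8928308280fffff": 4000, "8928308281fffff": 500}

def per_1000(pois, population, category):
    """POIs of `category` per 1,000 residents, keyed by H3 cell.

    This is the join: both sides share the H3 cell ID as the key,
    so a dict lookup stands in for the SQL join on the hex index.
    """
    counts = Counter(cell for cell, cat in pois if cat == category)
    return {
        cell: 1000 * counts.get(cell, 0) / pop
        for cell, pop in population.items()
        if pop > 0
    }

bars_per_1000 = per_1000(pois, population, "bar")
worship_per_1000 = per_1000(pois, population, "place_of_worship")
```

In a database like HeavyDB the same logic would be a GROUP BY on the H3 cell column joined to the population table, which scales the idea to hundreds of millions of POIs.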
@simonw I loaded the data into HeavyDB, looks like a quality dataset.
0
0
8
Was super excited this morning to see @Foursquare had open sourced a dataset of over 100M POIs. I immediately loaded it into @heavy_ai and am having a lot of fun exploring the data; it looks fantastic both in terms of coverage and accuracy.
1
3
22
#30DayMapChallenge Day 3: Polygons. New @ORNL US buildings dataset, complete with addresses and height data. Here are building footprints in the Philadelphia area, colored by height in @heavy_ai.
0
1
9
@ashleighlondon How many times does this need to be debunked? They aren’t done counting, California itself only has 56% in. Dems won’t hit the 81 million of 2020 but the final count will be well into the 70s.
0
0
1