![Jack Kelly (@jack-kelly.com on Bluesky) Profile](https://pbs.twimg.com/profile_images/1101483049827614722/TioU1lZ3_x96.png)
Jack Kelly (@jack-kelly.com on Bluesky)
@jack_kelly
Followers
5K
Following
13K
Statuses
7K
Trying to help mitigate climate change using machine learning. Co-founder of @OpenClimateFix. Father. Previously @DeepMind & consulted for @NationalGridESO.
Peckham, London SE15
Joined August 2008
RT @Brianna_R_Pagan: Folks impacted by the recent @planet layoffs please reach out, we have 2 upcoming software engineering positions openi…
0
21
0
RT @EnergySystemsIG: 🚨UPCOMING WEBINAR🚨 "How NOAA Open Data Dissemination Weather Forecasts Are Boosting Grid Operations" Featuring Adrie…
0
2
0
RT @OpenClimateFix: A really interesting #podcast by @CarbonCopy_Pod! Listen to the podcast to learn more about @jack_kelly's idea behind…
0
3
0
@Stu3b3 Thanks so much for describing your use-case, David! Yeah, these are exciting times! To set expectations: it will probably take a while (many months) before we see these performance improvements make their way into stable releases. But it is very exciting, nonetheless 😀. Thx again!
0
0
1
@thundercloudvol Yeah, great points. I'm expecting Python will get in the way a lot! So my current plan is to write a complete Zarr implementation in Rust (with Python bindings). Although the Rust implementation may actually involve tweaking some existing crates, rather than starting from scratch
0
0
2
@al_merose @martin_durant_ Yeah, me too! I've been chatting to Jeremy (TensorStore dev). One of my first tasks in Sept will be to benchmark TensorStore & Zarr-Python. My hunch is that TensorStore is a lot faster than Zarr-Python. But may start to struggle when we're trying to load close to 1M chunks/sec
0
0
1
@kylebarron2 Hey @kylebarron2, if you're interested: We're planning to start holding regular half-hour meetings to discuss speeding up Zarr (possibly using Rust). Your perspectives in the meetings would be super-useful😀 (but no pressure if you're too busy!) Details:
0
0
0
@martin_durant_ You know infinitely more than I do about async Python & Zarr & Rust. I'm still a few months away from being vaguely productive in Rust. So I'd hugely value your guidance! Although I appreciate you're super-busy, so even an occasional chat would be great!
0
0
1
@kylebarron2 @OpenClimateFix Yeah, our issue is that our two largest datasets don't exist in public cloud storage buckets. We FTP our NWP datasets from the Met Office! If these datasets existed in public cloud buckets, in performant data structures, then that might change the equation!
0
0
1
@kylebarron2 Hehe, yeah, I still have a lot of Rust to learn! So it'll probably take me a few months to really get productive with Rust! (I've been learning Rust on-and-off since January. But I'm not productive in Rust yet!)
0
0
0
@kylebarron2 Yeah, I'm curious about that too. I'm really focused on the use-case of trying to read and decompress on the order of a million tiny (~4kB) chunks of files per second from a fast SSD. And I'm pretty sure that's _only_ possible using io_uring. But I could be wrong!
0
0
0
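A rough back-of-envelope sketch of what the target in the tweet above implies, assuming "~4kB" means 4,000 bytes and "on the order of a million" means exactly 1,000,000 chunks per second (both are illustrative assumptions, not figures from a benchmark):

```python
# Illustrative arithmetic only; the constants are assumptions taken
# loosely from the tweet ("a million tiny (~4kB) chunks ... per second").
CHUNKS_PER_SEC = 1_000_000  # assumed: exactly 1M chunks per second
CHUNK_BYTES = 4_000         # assumed: "~4kB" read as 4,000 bytes

# Sustained read throughput the SSD would need to deliver, in GB/s.
throughput_gb_per_sec = CHUNKS_PER_SEC * CHUNK_BYTES / 1e9

# Average time budget per chunk (read + decompress), in microseconds,
# if chunks were handled strictly one at a time.
budget_us_per_chunk = 1e6 / CHUNKS_PER_SEC

print(f"{throughput_gb_per_sec} GB/s, {budget_us_per_chunk} us per chunk")
```

Under these assumptions the raw bandwidth is about 4 GB/s, while the per-chunk budget is about one microsecond, which suggests that per-request overhead, rather than bandwidth alone, is the likely bottleneck.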
@kylebarron2 ...But maybe the correct way forwards is to submit a PR to object_store to (optionally) enable the use of io_uring to submit many `gets` at once. Like `get_many(list_of_1_million_files)`. Does that sound possible? 😀
1
0
1
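A minimal sketch of the call shape suggested in the tweet above, written in Python purely for illustration. This `get_many` is hypothetical: it is not part of object_store, and it uses a thread pool as a stand-in because io_uring is not used here. It only shows the idea of handing over a whole list of reads at once instead of issuing them one by one.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, List


def _read_one(path: Path) -> bytes:
    # Read a single small file fully into memory.
    return path.read_bytes()


def get_many(paths: Iterable[Path], max_workers: int = 32) -> List[bytes]:
    """Hypothetical batch read: submit every path at once and return
    the file contents in the same order as the input paths.

    Assumption: a thread pool stands in for a real batched-I/O
    backend (such as io_uring), which this sketch does not use.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order in its results.
        return list(pool.map(_read_one, paths))
```

A caller would pass the full list of chunk paths in a single call, for example `get_many(list_of_paths)`, leaving the backend free to schedule the reads however it likes.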
@kylebarron2 In @OpenClimateFix, a lot of our data processing & ML training is done locally, on our own hardware. Storing a few hundred TB of data and running fast GPUs 24/7 gets *insanely* expensive in the cloud! Cloud costs for a few months are enough to buy & run your own hardware for yrs.
1
0
2