Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

Followers

5K

Following

13K

Statuses

7K

Trying to help mitigate climate change using machine learning. Co-founder of @OpenClimateFix. Father. Previously @DeepMind & consulted for @NationalGridESO.

Peckham, London SE15

Joined August 2008

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

3 months

I've moved to Bluesky:

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

8 months

RT @Brianna_R_Pagan: Folks impacted by the recent @planet layoffs please reach out, we have 2 upcoming software engineering positions openi…

0

21

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

8 months

RT @EnergySystemsIG: 🚨UPCOMING WEBINAR🚨 "How NOAA Open Data Dissemination Weather Forecasts Are Boosting Grid Operations" Featuring Adrie…

0

2

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

1 year

Micron recently presented their new "NVDRAM": byte-addressable, non-volatile, high-performance storage based on FeRAM. In short: I'm excited! It could allow for high-performance random access to multi-dim arrays. But it might never make it into a product.

0

9

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

1 year

RT @OpenClimateFix: A really interesting #podcast by @CarbonCopy_Pod! Listen to the podcast to learn more about @jack_kelly's idea behind…

0

3

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

1 year

Here's some detailed benchmarking I did yesterday of a fast SSD (Seagate FireCuda 530, PCIe Gen4). The focus of the benchmarking is to look at how fast io_uring can go for small, random reads. The ultimate aim is to help build a faster Zarr reader in Rust

0

2

9
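
For readers who want a feel for what a "small, random reads" benchmark measures, here is a minimal sketch. It is not the benchmark from the post above and it does not use io_uring: it issues plain synchronous `os.pread` calls (Unix only), so it only gives a single-threaded baseline. The function name `random_read_benchmark` and its parameters are hypothetical. Without `O_DIRECT`, repeated runs will mostly hit the OS page cache rather than the SSD.

```python
import os
import random
import time


def random_read_benchmark(path, n_reads, chunk_size=4096, seed=0):
    """Time n_reads chunk-aligned random reads; return (bytes_read, seconds).

    Hypothetical helper: a synchronous pread baseline, not io_uring.
    """
    file_size = os.path.getsize(path)
    n_chunks = file_size // chunk_size
    if n_chunks == 0:
        raise ValueError("file is smaller than one chunk")

    # Pick the offsets up front so random-number generation is not timed.
    rng = random.Random(seed)
    offsets = [rng.randrange(n_chunks) * chunk_size for _ in range(n_reads)]

    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        bytes_read = 0
        for offset in offsets:
            # pread reads at an absolute offset without moving the file cursor.
            bytes_read += len(os.pread(fd, chunk_size, offset))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return bytes_read, elapsed
```

Dividing `n_reads` by the elapsed seconds gives reads per second, which can then be compared against an asynchronous reader.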

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@Stu3b3 Thanks so much for describing your use-case, David! Yeah, it is exciting times! To set expectations: it will probably take a while (many months) before we see these performance improvements make their way into stable releases. But it is very exciting, nonetheless 😀. Thx again!

0

1

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@thundercloudvol Yeah, great points. I'm expecting Python will get in the way a lot! So my current plan is to write a complete Zarr implementation in Rust (with Python bindings). Although the Rust implementation may actually involve tweaking some existing crates, rather than starting from scratch

0

2

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@al_merose @martin_durant_ Yeah, me too! I've been chatting to Jeremy (TensorStore dev). One of my first tasks in Sept will be to benchmark TensorStore & Zarr-Python. My hunch is that TensorStore is a lot faster than Zarr-Python. But may start to struggle when we're trying to load close to 1M chunks/sec

0

1

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 Hey @kylebarron2, if you're interested: We're planning to start holding regular half-hour meetings to discuss speeding up Zarr (possibly using Rust). Your perspectives in the meetings would be super-useful😀 (but no pressure if you're too busy!) Details:

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@martin_durant_ You know infinitely more than I do about async Python & Zarr & Rust. I'm still a few months away from being vaguely productive in Rust. So I'd hugely value your guidance! Although I appreciate you're super-busy, so even an occasional chat would be great!

0

1

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 @OpenClimateFix Yeah, our issue is that our two largest datasets don't exist in public cloud storage buckets. We FTP our NWP datasets from the Met Office! If these datasets existed in public cloud buckets, in performant data structures, then that might change the equation!

0

1

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 Hehe, yeah, I still have a lot of Rust to learn! So it'll probably take me a few months to really get productive with Rust! (I've been learning Rust on-and-off since January. But I'm not productive in Rust yet!)

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 Yeah, I'm curious about that too. I'm really focused on the use-case of trying to read and decompress on the order of a million tiny (~4kB) chunks of files per second from a fast SSD. And I'm pretty sure that's _only_ possible using io_uring. But I could be wrong!

0

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 BTW, thanks so much for all the super-useful suggestions! I've learnt a lot!

0

1

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 ...But maybe the correct way forwards is to submit a PR to object_store to (optionally) enable the use of io_uring to submit many `gets` at once. Like `get_many(list_of_1_million_files)`. Does that sound possible? 😀

1

0

1
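
To make the shape of the `get_many` idea concrete, here is a rough sketch of a batched read call. This is an assumption-laden stand-in, not the `object_store` API and not io_uring: it fans the reads out over a Python thread pool, and the `max_workers` parameter is hypothetical. The point is only the interface: one call takes the whole list of files, so the implementation is free to submit many reads at once.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def get_many(paths, max_workers=32):
    """Read every file in paths and return the contents in input order.

    Hypothetical sketch: a thread pool stands in for io_uring submission.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        return list(pool.map(lambda p: Path(p).read_bytes(), paths))
```

A caller would pass the full list of chunk files in a single call, for example `get_many(chunk_paths)`, instead of looping over individual reads.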

Jack Kelly (@jack-kelly.com on Bluesky)

@jack_kelly

2 years

@kylebarron2 In @OpenClimateFix, a lot of our data processing & ML training is done locally, on our own hardware. Storing a few hundred TB of data and running fast GPUs 24/7 gets *insanely* expensive in the cloud! Cloud costs for a few months are enough to buy & run your own hardware for yrs.

1

0

2