We at @modal_labs just released QuiLLMan:
An open-source chat app that lets you interface with Vicuna-13B using your voice. All serverless.
Fork the repo to deploy and start building your own LLM-based app in less than a minute!
The first time I tried Devin, it:
- navigated to the @modal_labs docs page I gave it
- learned how to install
- handed control to me to authenticate
- spun up a ComfyUI deployment
- interacted with it in the browser to run stuff
🤯
Today we're excited to introduce Devin, the first AI software engineer.
Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork.
Spend all your time writing Slack messages?
We're releasing DoppelBot, a Slack app that lets you fine-tune an LLM to answer messages like yourself.
Install the app now, or fork and host yourself:
100% serverless, running on @modal_labs.
1/ Meet devlooper.
🏖️ smol developer (by @swyx) with access to a @modal_labs sandbox so it can debug itself!
🔧 fixes code *and* adds layers to the container image
📦 pre-made templates for React, Python, Rust
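The debug loop is roughly: run the tests, show the failures to the LLM, apply its fix, repeat. A minimal sketch of that loop (`run_tests` and `ask_llm_for_fix` are hypothetical stubs, not devlooper's actual interfaces):

```python
def debug_loop(code, run_tests, ask_llm_for_fix, max_iters=5):
    """Iteratively repair `code` until `run_tests` reports success."""
    for _ in range(max_iters):
        ok, output = run_tests(code)
        if ok:
            return code
        # The LLM sees the failing test output and proposes a new version
        # (and, in devlooper's case, can also add packages to the image).
        code = ask_llm_for_fix(code, output)
    raise RuntimeError("tests still failing after max_iters attempts")
```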
1/ We are releasing Playground v2.5, our latest foundation model to create images.
We tested our model across 20K+ users in a rigorous benchmark that went beyond anything we've seen to date.
This model is open weights. More information in the tweets below. 👇
Hooked up smol developer by @swyx to the new @modal_labs sandbox primitive.
The LLM can now see test output to recursively edit code *and* install deps in the image. Super satisfying to see it fix both code and tests until everything is ✅
The ability to connect to running Modal containers is game-changing.
You don't have to pick between great ergonomics and being able to look under the hood. You can have both!
You can now run interactive commands inside any running Modal container!
Just run `modal container list` and then `modal container exec [container-id] [command...]`
Docs:
Fun addition in the latest release of @modal_labs: tracebacks are preserved across remote function calls!
Small details like this are crucial to making cloud workers feel like an extension of your laptop, and sadly very few tools get them right
ByteDance presents SDXL-Lightning, a lightning-fast 1024px text-to-image generation model. Trained with progressive adversarial diffusion distillation, the model achieves new SOTA on one-step/few-step generation.
3 people built stable diffusion slackbots on top of @modal_labs within 24 hours of the model's release. So, by popular demand, here's a tutorial on how you can do the same in <60 lines of code:
Please reach out if you would like an invite!
Not kidding. For the finals they’re running Minecraft-like agents in @modal_labs sandboxes, controlled by LLMs on @modal_labs H100s.
Would totally watch prompt olympics as an eSport.
As part of our effort to replicate LLaMA in an open-source manner, we are pleased to announce the release of a preview of the 7B OpenLLaMA model, trained on 200 billion tokens of the RedPajama dataset.
We (mostly @jonobelotti_IO) built a podcast transcriber on @modal_labs that spins up >100 containers running Whisper in parallel, and transcribes whatever you want in a couple mins:
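Modal's own API handles the fan-out, but the pattern itself is just a parallel map over audio chunks. A local sketch with `concurrent.futures` (`transcribe_chunk` here is a stand-in for a real Whisper call, not the transcriber's code):

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe_chunk(chunk):
    # Stand-in for running Whisper on one audio segment.
    return f"transcript of {chunk}"

def transcribe_episode(chunks, workers=8):
    # Fan out one task per chunk; map() preserves input order, so the
    # stitched transcript comes back in the right sequence.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe_chunk, chunks))
```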
Awesome tutorial by @charles_irl on training Dreambooth on your pet photos and deploying the model as an app on @modal_labs: .
Takes <10 minutes! (mostly training time)
My cat Puru graciously donated his latent space representation for some samples:
If you use Slack Connect for customer support and you're not using @usepylon yet, you're letting your customers down.
(Not affiliated with the company, just couldn't gatekeep it anymore)
From another lens, the crypto to AI migration isn't really that surprising. For the first time in history, the world has surplus compute, and we're still trying to figure out the best way to allocate it to produce value.
Today on the blog: how we fight cryptominers abusing our GPU cloud using seccheck, a new security system that automatically detects and bans them. Here's a quick rundown of how it works:
@modal_labs is amazing. i wrote a crawler. it fed into a pytorch transformer. added a modal decorator to an outer for-loop. now it farms to 30x machines. instantly. feels like a REPL; i'm in the zone. and i never wrote cloudformation crap. no clusters. no infra.
Part of the magic (and absurdity) of being at a fast-growing company is watching random variable names someone came up with in a hurry grow into essential company vocabulary, and in some cases concepts that entire teams are organized around
.@dream_3_d is awesome, and runs on @modal_labs.
Personally very excited about this because our first GPU examples (c. 2021) were generative art with VQGAN and rendering with Blender. Things have come a long way since then :)
Not talked about enough, but reliability and tail latencies are a big practical reason to move to self-hosted LLMs.
Turns out it's hard to build a production system around a black box API that just gives you tokens back eventually.
There’s no point in using GPT-3.5 in production — the API has frequent downtime, plus high response latencies lead to subpar end user experiences.
Instead, I’ve found a Mistral 7B finetune deployed with vLLM to be 10x faster(!), way cheaper, and *stupidly* simple to self-host.
Closely related is the joy of seeing a humble helper function grow into a fully featured service with its own engineering team and roadmap.
Watching people grow is great, but watching abstractions grow is beautiful in its own right.
Last weekend, @jasshikoo and I built : it’s like Substack, but for RSS feeds! Now we can follow all those great HN blogs that aren’t on Substack.
(Also, all of the backend is Clojure running Datomic Ions. More on that here: )
There's a fun bug sleuthing story here that involves wrangling strace output, reading through the CPython source for loading pyc files, and ultimately being betrayed by implicit type conversion. Will have to wait for a blog post some day.
Half the time people tell you something in tech is super hard, I think it’s just weird lore? We (well, mostly @akshat_b) built our own file system despite everyone saying it’s an insane idea, and it was pretty straightforward and works great.
At some point, every technical-sounding term will be repurposed as a startup name; the overall namespace becoming a post-modern pastiche of empty signifiers.
Concretely, this means someone should probably go out and squat domain names for the top N words mentioned in ML papers.
Cornered resource for ML infra companies in 2022: "we found the one conference talk from 2019 that explains how to install xyz NVIDIA feature correctly"
@modal_labs Ever since we deployed erik-bot internally, all our metrics have been going up and to the right.
However, we decided it would be irresponsible of us to open source its weights—it’s way too powerful! So, we’re releasing the next best thing.
It’s time for office hours at the University of Modal 🤓!
Bring your code, your questions, or just good AI banter to our NYC office next Thurs 6/13 from 4:30-6:30pm. We’ll have snacks, drinks, and Modal swag!
Finally read Snow Crash, and it's interesting how the 1992 "metaverse" is different from where we're headed today. E.g. in this world Google Earth and Wikipedia are stupid expensive. Guess it wasn't obvious that hyper-capitalism could make a bunch of nice things completely free?
Data labeling at scale is a complicated endeavor involving operations, machine learning, and game theory. Join us for a tech talk on Thursday (Aug 27) where we'll dive deep on the game theory of optimizing human-in-the-loop labeling. Register today!
TIL that visual hallucinations usually fall under four "form constants": lattices, cobwebs, tunnels and spirals.
Explanation seems to be that neural activity undergoes a polar transformation before becoming vision (so noise ends up having radial symmetry)
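A sketch of the standard argument (Ermentrout–Cowan style; the details here are my gloss, not from the tweet): the retino-cortical map is approximately log-polar, so simple plane waves of cortical activity pull back to exactly the four form constants.

```latex
% Visual-field point z = r e^{i\theta}; cortical point w = u + iv.
% The retino-cortical map is approximately the complex logarithm:
w = \log z = \log r + i\theta .
% A plane wave of cortical activity
A(u, v) = \cos(k_u u + k_v v)
% then appears in the visual field as:
%   k_v = 0:            rings   r = \text{const}       (tunnels)
%   k_u = 0:            spokes  \theta = \text{const}
%   k_u, k_v \neq 0:    logarithmic spirals  r = e^{c\theta}
% and superpositions of such waves give lattices and cobwebs.
```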
Super excited that this is out! We have 7 open source datasets up to play around with already.
It turns out MS COCO has a picture of a banquet where every fork, spoon and wineglass has been individually boxed.
@modal_labs docs are searchable now! The crawler that updates our @algolia index itself runs on Modal, and takes <10 lines of code to set up and deploy:
When choosing between knock off brands turns into a deeper question about which defining anthropomorphic quality you want your paper towels to have. There are no right answers here.
Announcing Twitter '95, an AI simulation of Twitter, if it had existed in 1995.
- LLaMA 3.1 405B + @FastAPI on @modal_labs ✅
- @nextjs app on @vercel ✅
- @PostgreSQL on @supabase ✅
- Jordan dunking on Clinton ✅
- MVP written in 26 hours at crossover hackathon/marathon ✅
We also have a script to generate terminal recordings that runs asciinema and feeds it input via a PTY.
Makes it easy to maintain recordings like this one:
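The PTY trick can be sketched with stdlib Python alone. This is a stand-in, not our actual script: it drives plain `cat` instead of asciinema, and the EIO handling is Linux-specific.

```python
import os
import pty
import subprocess

def run_in_pty(cmd, keystrokes):
    """Run `cmd` under a pseudo-terminal, feed it scripted keystrokes,
    and return everything the terminal would have displayed."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(cmd, stdin=slave, stdout=slave, stderr=slave)
    os.close(slave)  # only the child holds the slave end now
    os.write(master, keystrokes.encode())
    chunks = []
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:  # EIO on Linux once the child hangs up
            break
        if not data:
            break
        chunks.append(data)
    proc.wait()
    os.close(master)
    return b"".join(chunks).decode(errors="replace")
```

Usage: `run_in_pty(["cat"], "hello\n\x04")` types a line and then Ctrl-D (`\x04`), so `cat` sees EOF and exits; the returned text contains both the echoed input and the program's output.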
We treat docs as code at @modal_labs:
* All code samples in docs are unit tested
* Autogenerate most tutorials from code
* All examples run against prod 24/7 and monitored
* We execute all library docstrings as unit tests
* We treat deprecation warnings as exceptions
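Two of these are nearly one-liners with the stdlib. A hedged sketch (not our actual CI setup) of running docstring examples as tests while promoting deprecation warnings to hard errors:

```python
import doctest
import warnings

# Promote DeprecationWarning to a hard error: stale API usage in doc
# examples then fails the test run instead of rotting silently.
warnings.simplefilter("error", DeprecationWarning)

def add(a, b):
    """Add two numbers.

    >>> add(2, 3)
    5
    """
    return a + b

# Execute the docstring examples as unit tests.
runner = doctest.DocTestRunner()
for test in doctest.DocTestFinder().find(add, "add"):
    runner.run(test)
assert runner.failures == 0
```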
@debarghya_das Looking at the code, that doesn't seem right? The rest of the code constrains the output to legal tokens at each point in the JSON structure.
With just a prompt it wouldn't work 100% of the time.
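The idea, in a toy sketch of my own (not the code under discussion): at each decoding step, mask the vocabulary down to tokens that keep the partial output completable to valid JSON. Real constrained decoders track a grammar state machine; brute-forcing closer suffixes like this is purely illustrative.

```python
import json

# A small set of plausible "closers" to try appending; enough for this toy.
CLOSERS = ("", '"', '"}', "}", "]", '"]', '"}]', "]}")

def can_complete(prefix):
    """Crude check: can `prefix` still be completed to valid JSON?"""
    for suffix in CLOSERS:
        try:
            json.loads(prefix + suffix)
            return True
        except json.JSONDecodeError:
            continue
    return False

def legal_next_tokens(prefix, vocab):
    """Mask the vocabulary down to tokens that keep the output valid JSON."""
    return [tok for tok in vocab if can_complete(prefix + tok)]
```

For the prefix `{"name": `, this admits `"Ada"` or `42` as next tokens but rejects a bare `}` or garbage, which is why constrained decoding works every time while a prompt alone doesn't.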
Saw a demo for a product that has Slack conversations on behalf of the user, and now I need to ask:
what's your go-to shibboleth to catch bots? Should I prefix every message with `SolidGoldMagikarp`?
This week in Python land: @jonobelotti_IO found a package that goes into an infinite loop on `pip install` because it uses floating point math on version numbers 🙃
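I haven't seen the package's source, but here's a hypothetical reconstruction of how float version math hangs: 0.1 has no exact binary representation, so repeated addition never hits the target exactly (and comparing versions as floats misorders "3.10" vs "3.9", too).

```python
# Hypothetical reconstruction, not the actual package's code.
version, target, steps = 0.1, 1.0, 0
while version != target and steps < 1000:  # guard added; the real loop had none
    version += 0.1
    steps += 1

# After nine additions, version is 0.9999999999999999, so the unguarded
# loop sails right past 1.0 and never terminates.
assert steps == 1000

# The related classic: as floats, "3.10" sorts *before* "3.9".
assert float("3.10") < float("3.9")
```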
Also... there's a term in archaeology for studying prehistoric art that could have arisen from altered states of consciousness:
This has been a fun rabbit hole.
Puru woke us up in the middle of the night because his feeder’s servers went down. What a world.
Imagine getting paged for an on-call ticket where innocent pets will literally die unless you fix some AWS configs...
System Update: Our team is working closely with our third-party service provider in regards to the outage affecting the SmartFeeder (2nd Gen). We hope to release more information as we learn more. We apologize for this inconvenience.
I wonder if it's a good thing that Gödel's theorem is so widely misunderstood. If more people saw it as the specific result about formal systems it actually is, would there be fewer new ideas in the world on net?
Nietzsche points out that "every concept arises from the equation of unequal things" — the result being that equality (and hence truth) in language is not like mathematical truth, and all our notions of truth are built upon what are fundamentally lies
Need to print a shipping label for an Amazon return, but don't have a printer. Current plan is to use the @lob API to deliver myself a single printout, so I can then send more mail.
@danlovesproofs @AirplaneDev Problem intimacy is a great way to put it—also why so much of the "moat" is actually the knowledge and experience of the team itself
@swyx @transitive_bs It seems like smol-dev should probably not modify its own runtime environment, right? (i.e. the code it produces should run in a separate sandbox?)
So what it could do is output Python code + dependencies, and run those as a separate Modal function.
@aman_kishore_ @jiajihml @modal_labs @raydistributed Good qn @jiajihml! We're focusing on providing a magical dev exp out of the box. No configs or maintenance overhead. We built our own container engine in Rust for fast cold-start, so iterating really feels like it's local.
A few more things that I'd love to show you—DMing :)
"# 1 Best Seller in Budget Travel Guides"
Can't tell if bug, marketing tactic or dastardly plot to introduce budget travelers to hipster views on metaphysics.
@levelsio Run it on @modal_labs:
Cold start time is 10s for this model, even if you've made modifications to the code. In practice, you can adjust the idle timeout and send a "noop" request a few seconds ahead to avoid hitting a cold start.
@modal_labs @algolia Having to actively hold ourselves back from running more of Modal on Modal itself. Pretty much the opposite of the problem some companies have with not enough product dogfooding.