A bunch of top ML folks from Google, DeepMind, OpenAI, etc have come together to build Adept! It’s a pleasure to be working with this kind and extremely talent-dense crew, including folks who invented the Transformer.
We’re doing something a bit different… (thread)
This is tech’s “let them eat cake” moment: I only see talk of machine learning papers, WFH setups, and the escapist fantasy of space.
Wake up, everybody. AI doesn’t matter if we can’t even treat everyone in this country like a real human being.
After some super restorative time off (both work-wise and twitter-wise), I'm excited to join Google Research! I'm starting a new group focused on large, multiyear DL projects with fundamental research goals... very cool to get to work with folks like @JeffDean!
Excited to share the news that we’ve raised $350M to build a natural language interface to your computer!
Having a strong coalition of strategic partners for Adept (Atlassian, Microsoft, NVIDIA, and Workday) is going to be awesome!
We’re also excited to partner with Addition, Greylock, Atlassian Ventures, Microsoft, NVIDIA, Workday Ventures, Caterina Fake, Frontiers Capital, PSP Growth, SV Angel and A_Capital, and others, who supported the round.
More from @Forbes below: (2/4)
People have invested a ton of time and expertise to create software tools that help them get their work done. Rather than replacing these tools, we want to build a natural language interface to all of them — an NL frontend to your computer.
This is also the reverse of most AGI work out there. Rather than automating economically valuable tasks, we want to keep humans in the driver’s seat, by building AI tools that people can work with to do things together.
It’s been just three months since we got started and we’ve already built a ton. If the idea of building a foundational general AI product – and using it to solve general intelligence – excites you, please reach out :)
I had a wonderful 1.25 years at Google Research! It was a real privilege to get to lead the large models effort there and work with folks like @RandomlyWalking, @achowdhery, @elicollins, and @JeffDean on PaLM etc. After OpenAI and Google, I’ll be doing something totally different!
Finally out!
AI is as much about engineering as it is about research. PaLM required solving hard problems across all levels of the stack—networking, XLA, distributed training infra, optimizers, model architecture, data.
Our group’s model scaling effort did whatever it took.
Introducing the 540 billion parameter Pathways Language Model. Trained on two Cloud #TPU v4 pods, it achieves state-of-the-art performance on benchmarks and shows exciting capabilities like mathematical reasoning, code writing, and even explaining jokes.
Solving Rubik's Cube with a humanoid hand shows my favorite part of @OpenAI's research philosophy: choose a hard task that we don't think is doable with today's techniques, then use or invent whatever techniques are needed to solve it. This is the transpose of how research is often done.
We’re all used to robots that fail when their environment changes unpredictably. Our robotic system is adaptable enough to handle unexpected situations not seen during training, such as being prodded by a stuffed giraffe:
In the future, we’ll be able to ask our computers to do increasingly abstract and complex things in natural language—and it’ll be the default way people use their machines.
Excited to share our first step in this direction! Some thoughts on why I think this is really cool:
1/7 We built a new model! It’s called Action Transformer (ACT-1) and we taught it to use a bunch of software tools. In this first video, the user simply types a high-level request and ACT-1 does the rest. Read on to see more examples ⬇️
Give me an artist, genre, and lyrics (or not), and this neural network will generate you a song! It can even rap.
This is one of the coolest results from OpenAI’s Algorithms and Language team. Loved being involved in it as a bureaucrat.
Next up, 24/7 lo-fi vaporwave.
Introducing Jukebox, a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles. We're releasing a tool for everyone to explore the generated samples, as well as the model and code:
Making AI agents reliable has been a big challenge for the community, in part because they can’t see. GPT-4V/Gemini are not yet generally available.
Adept’s releasing an open 8B multimodal model! Fast, good at regular photos, but also handles unstructured knowledge worker data.
It actually has an extremely simple architecture. Fuyu-8B doesn’t have an image encoder. This allows easy interleaving of text and images and handling arbitrary image resolutions! And it’s super fast for copilot use cases where latency really matters.
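The "no image encoder" idea above can be sketched in a few lines: image patches are linearly projected straight into the decoder's token-embedding space and interleaved with text tokens, so arbitrary resolutions just change sequence length. This is a toy numpy illustration of that idea; every dimension and name here is illustrative, not Fuyu-8B's actual configuration.

```python
import numpy as np

# Hedged sketch of the architecture idea described above: no image encoder.
# Raw image patches go through a single linear projection into the decoder's
# token-embedding space. All sizes below are toy values, not Fuyu-8B's.

rng = np.random.default_rng(0)
D_MODEL, PATCH = 64, 16                                  # toy hidden size / patch side

W_patch = rng.normal(size=(PATCH * PATCH * 3, D_MODEL))  # the single linear layer
text_embeddings = rng.normal(size=(1000, D_MODEL))       # toy text vocabulary

def image_to_tokens(img):
    """Turn an image of any patch-aligned resolution into transformer tokens."""
    h, w, c = img.shape
    rows, cols = h // PATCH, w // PATCH
    patches = (img.reshape(rows, PATCH, cols, PATCH, c)
                  .transpose(0, 2, 1, 3, 4)
                  .reshape(rows * cols, PATCH * PATCH * c))
    return patches @ W_patch                             # (num_patches, D_MODEL)

# Arbitrary resolutions just change the sequence length:
small = image_to_tokens(rng.normal(size=(32, 32, 3)))    # 4 patch tokens
large = image_to_tokens(rng.normal(size=(64, 96, 3)))    # 24 patch tokens

# Interleaving with text is plain concatenation in the same embedding space:
text = text_embeddings[[1, 2, 3]]
sequence = np.concatenate([text, small, text])
print(small.shape, large.shape, sequence.shape)
```

Because there is no separate vision tower, text and image tokens live in one sequence from the start, which is what makes interleaving and variable resolutions cheap.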
Congrats to the OpenAI, Microsoft, and GitHub teams on this pretty sweet set of results! Program synthesis is going to get really lit in the next few years, thanks in large part to scale.
Today, @GitHub, @OpenAI and @Microsoft launched a technical preview of GitHub Copilot. It’s a great example of how advancements in #AI are producing powerful new tools to help developers write better code - and spur more creativity and innovation.
Adept's Fuyu architecture scales really well! Fuyu-Heavy is the best model in its weight class and outperforms Gemini Pro :)
In particular, it's super good at being an AI agent on your computer...
Introducing Fuyu-Heavy, our new multimodal model. Fuyu-Heavy is the world’s third-most-capable multimodal model, behind only GPT-4V and Gemini Ultra, which are 10-20 times larger. In particular, it outperforms Gemini Pro at both MMLU and MMMU...
We’ve been training giant neural networks that do stuff for you on your computer!
In the first three months at Adept, we taught it to query databases, make visualizations, and fetch data from the web, but we want to teach it how to use every software tool in the world.
We made a fun video of some of the earliest things our system can do! If you want to help us build useful general intelligence, please reach out -- we are hiring.
The improvement in the quality of images sampled from generative models over the last few years has been astounding.
See in particular 5:35. This GAN has learned 3d structure of cars and bedrooms through only 2d images, and interpolates smoothly!
My lovely motorcycle (and main commuting tool) was stolen in front of my house at 6:30AM today. It’s a 2018 KTM 500 EXC with crazy graphics that the previous owner put on. SF friends, keep an eye out for me! Living in this city has not been a good time. Nest shots attached.
Excited that Overcooked-like environments are being used to study cooperative multiagent RL!
I’m happy that the growth of indie gaming has led to new creative gameplay mechanics, which makes our jobs as researchers easier in that useful environments sometimes come to us :)
Excited to share our work: collaboration requires understanding! In Overcooked, self-play doesn't gel with humans: it expects them to play like itself. (1/4)
Demo:
Blog:
Paper:
Code:
I know this headline "OpenAI Wants to Move Slow and Not Break Anything" is written tongue-in-cheek, but I could not agree more.
ML research has real downstream consequences. The Silicon Valley cowboy mentality is irresponsible for ML.
Turns out training high-capacity language models on chunks of the internet produces a flexible, general tool for language understanding tasks! I was surprised by how many tasks it had learned to do with no task-specific data, setting SOTAs on some and solid performance on others.
We've trained an unsupervised language model that can generate coherent paragraphs and perform rudimentary reading comprehension, machine translation, question answering, and summarization — all without task-specific training:
Returning from a long hiatus to say -- this was an exceptionally fun podcast thanks to @HarryStebbings!
Give it a watch for a collection of hot takes on path to AGI, limitations, hardware-model vertical integration, and the crucial role that interaction design now plays...
3. Why Every Cloud Provider Must Have a Model Play
As models become smarter, they’ll become the base computing primitive.
The logic of software will be handled by LLMs in the future.
Whoever controls the model layer controls all of the underlying compute.
Belated commentary:
AI research is extremely expensive, and this partnership provides a level of stability where OpenAI can comfortably take an even longer-term view on the research problems we pursue and the different ways our policy work engages with society.
Microsoft is investing $1 billion in OpenAI, the research lab overseen by startup guru Sam Altman that says (with all seriousness) that it wants to build "artificial general intelligence, or AGI, a machine that can do anything the human brain can do":
Congrats to @GreylockVC on their new early-stage fund! I've been grateful for their support since day 0 for Adept... they've done a bunch for us, including helping us hire some of our strongest folks on the team.
1/ With a long history of partnering w/ founders from idea to IPO, we @GreylockVC are excited to announce 2 new updates to advance this mission:
1/ Fund 17, our new $1B early-stage fund
2/ Edge, a bespoke program to help founders initiate new companies
I think it’s time for more evals for multimodal models that capture what we actually care about downstream… not sure there’s much more to gain by hillclimbing what’s out there right now!
Finally, it’s still the early days, but we think the future of these interfaces will be less like an assistant (you tell the model to do stuff), and more like a collaborator (you and the model work together to solve a problem).
A true bicycle for the mind!
OtterHD: A High-Resolution Multi-modality Model
Presents OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision
CALL FOR TASKS CAPTURING LIMITATIONS OF LARGE LANGUAGE MODELS
We are soliciting contributions of tasks to a *collaborative* benchmark designed to measure and extrapolate the capabilities and limitations of large language models. Submit tasks at
#BIGbench
I’ve been working with @AdeptAILabs and we’ve made FlashAttention even faster for long sequences! For seqlen 8K, FlashAttention is now up to 2.7x faster than a standard PyTorch implementation even at small batch, making it easier to train better LMs with longer context 1/7
ML is increasingly as much about big engineering investments as it is about new ideas. Standardizing frameworks lets us share a lot more across teams. Excited to be leading this transition with our head of infra, Chris Berner! Keep an eye out for blocksparse wrappers for PyTorch.
We're standardizing OpenAI's deep learning framework on PyTorch to increase our research productivity at scale on GPUs (and have just released a PyTorch version of Spinning Up in Deep RL):
bleep bloop!
We’re hiring our first designer at Adept. We’re building a teammate anyone can work with to get stuff done in front of a computer.
If you’re interested in defining how people and increasingly capable AI systems interact, we’d love to chat!
We’re hiring Adept’s first designer! If you want to shape the new era of human/machine interaction, apply at . Bonus points if you have experience with interactive and/or multimodal ML products.
At NAACL last week we built a new side project, Write With Transformer.
It lets you trigger GPT-2 completions multiple times, in a Google Doc-like interface.
🦄 It's like having a unicorn friend that completes your thoughts 🦄 cc @gdb @AlecRad
Try it:
It’s irresponsible for Amazon to push their facial recognition product onto law enforcement, especially with how much higher the error rates are for POC and women. The price of mistakes is paid not by Amazon, but by minorities and the over-policed.
Interesting analysis on some of the upcoming research challenges we’ll have to solve at Adept together with the research community! There’s so much still to do.
This is the dream: having a system whose action space is universal (at least in the world of bits). And with foundation models, it is actually possible now to produce sane predictions in that huge action space. Some interesting challenges:
@kane wow. wish i had your naming skills when these were being worked on!
I tried to name my new group at Google “Galaxy Brain” and then “Big Brain,” but neither stuck.
Every enterprise service I use regularly now has a logo indistinguishable from a Google Cloud product. It is so confusing.
Now, in a fancy SoHo store, tshirts that could pass as swag for a Google Cloud product. Have high fashion and high tech converged at last?!
I'm thrilled to announce that I will be joining the superb team at @OpenAI in June, where I will be starting a group (and indeed hiring) focused on achieving open-endedness in machine learning. Looking forward to exploring a novel path!
Jeff's vision for the emergent complexity that comes from generating environments is incredible, and in my opinion, a necessary piece towards advancing intelligence. I'm so happy to welcome him to the team!
I am extremely excited to announce (1) I've joined OpenAI to lead a large-scale effort into AI-generating Algorithms research, & (2) I'll be an Associate CS Professor at U. British Columbia in 2021, where I will continue to lead the OpenAI project. Both are dreams come true! 1/2
Neat survey from @CadeMetz of the next frontier of progress toward useful AGI—agents that can do more than just talk, but actually use software to do stuff on your computer.
Featuring some of Adept’s early work and some great directions from my former colleague @jeffclune!
Today our research is on the homepage of the New York Times, covering VPT and implications for AI like it (e.g. @jluan's @AdeptAILabs), & great work by @DrJimFan & @AnimaAnandkumar. As a fun bonus, they also included my children's doodles! Thanks @CadeMetz!
I'm very hopeful that we can add Activation Atlases to the arsenal of tools people use to assess for bias and spurious correlations in machine learning.
For the last few years, one of the staples of my research has been visualizing individual neurons in vision models. But that's only a partial picture -- neurons work together.
Activation Atlases are a way to explore the space neurons jointly represent.
L👈: "A Koala bear in a suit standing at a podium to teach. Variational bayesian methods is written on the chalkboard. There are lot of confused cats in the crowd"
R 👉:"Variational bayesian methods is all you need is written on the chalkboard."
🐨🙀
#imagen #googleai #brain
As I was working on making the partnership happen, the thing that most struck me was the degree to which MSFT leadership cared about OpenAI's independence and the importance of our nonprofit board.
Really surprised and happy to receive an Honorable Mention for Outstanding Paper on "Generative Pretraining from Pixels" ()! Thanks so much to the ICML awards committee!
Went to the end of Dreamforce in a ghillie suit and camouflaged by the various fake nature displays, pretending to be an artificial shrub after the event ended.
I hid easily, but security took notice when I stood near the line for champagne.
#DF19
Folks at OpenAI gathered around to try basic facts, like "Q: what's the tallest mountain on Earth?" which it answered correctly. And it would produce surprisingly strong samples, like an essay on recycling.
And then we tried English to French translation on a whim. 😮
ACT-1 is also surprisingly sample efficient when it comes to human feedback. A single demonstration or piece of documentation can be all that’s needed to do something new.
We think human feedback is by far the best path to improve capabilities and alignment.
6/7 ACT-1 doesn’t know how to do everything, but it’s highly coachable. With 1 piece of human feedback, it can correct mistakes, becoming more useful with each interaction.
My friend has been making #MachineLearning books for *5 year olds* and they are freaking adorable - - - and actually educational. Asian parents would go crazy with these books #rocketbabyclub
@dennybritz We do at OpenAI--we have toy tasks that are meant to encourage advances in planning, etc. At the same time, we also work on solving complex environments that require the composition of problems to make sure we're not overfitting on the toy ones.
ACT-1 maps natural language to actions. All of us already know that natural language input is extremely flexible—we’ve seen it with language models. The surprise is that our model, ACT-1, is similarly flexible with its outputs, aka what software tools it can use.
3/7 Working in-depth in tools like spreadsheets, ACT-1 demonstrates real-world knowledge, infers what we mean from context, and can help us do things we may not even know how to do.
Want to improve accuracy and robustness of your model? Use unlabeled data!
Our new work uses self-training on unlabeled data to achieve 87.4% top-1 on ImageNet, 1% better than SOTA. Huge gains are seen on harder benchmarks (ImageNet-A, C and P).
Link:
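The self-training recipe described above reduces to three steps: fit a teacher on the labeled data, pseudo-label the unlabeled data with it, then train a (typically larger, noised) student on both. This is a toy sketch of that loop; the nearest-centroid "model" and synthetic data are illustrative stand-ins, not the paper's actual setup.

```python
import numpy as np

# Toy sketch of self-training on unlabeled data:
# 1) fit a teacher on labeled data, 2) pseudo-label unlabeled data,
# 3) retrain a student on labeled + pseudo-labeled data combined.

rng = np.random.default_rng(0)

def fit_centroids(x, y):
    """A stand-in 'model': one centroid per class."""
    return np.stack([x[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, x):
    d = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Two well-separated Gaussian classes; only a handful of points are labeled.
x_lab = np.vstack([rng.normal(0, 1, (5, 2)), rng.normal(6, 1, (5, 2))])
y_lab = np.array([0] * 5 + [1] * 5)
x_unlab = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])

teacher = fit_centroids(x_lab, y_lab)          # step 1: teacher on labels
pseudo = predict(teacher, x_unlab)             # step 2: pseudo-label unlabeled
student = fit_centroids(                       # step 3: student on the union
    np.vstack([x_lab, x_unlab]),
    np.concatenate([y_lab, pseudo]),
)

x_test = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(6, 1, (100, 2))])
y_test = np.array([0] * 100 + [1] * 100)
acc = (predict(student, x_test) == y_test).mean()
print(f"student accuracy: {acc:.2f}")
```

The student sees 40x more (pseudo-labeled) data than the teacher did, which is the mechanism behind the robustness gains the tweet reports.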
👋 I'm excited to unveil @airstreet’s second fund of $121,212,121 as we accelerate our mission to back ambitious AI-first companies in North America and Europe!
🧵 My reflections on the journey, opportunity and what this means for our founders and community:
@mer__edith at yale, there were nice dorms and financial aid dorms. financial aid dorms used to not have dining halls b/c their students worked as servants in the halls of the rich kid dorms. most full ride kids incl me were drafted to the financial aid dorms bc we don’t benefit from legacy.
On top of the great research progress, these incredible photorealistic samples point to the need for society to quickly adapt to fake imagery and videos.
If your loss curves look sus, join the club! Giant LLM training runs are full of pitfalls. We learned the hard way. We wrote a deep dive for the community on silent data corruptions (SDCs).
Problem and mitigations here:
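One mitigation commonly used against silent data corruptions is redundant recomputation: run the same deterministic step twice (e.g., on different replicas or hardware) and compare checksums, flagging any mismatch. This is an illustrative sketch of that pattern only, not Adept's actual pipeline; the function names are hypothetical.

```python
import hashlib
import numpy as np

# Illustrative SDC check: a deterministic computation run twice must produce
# bit-identical results, so differing checksums flag a silent corruption.

def checksum(arr: np.ndarray) -> str:
    """Bit-exact fingerprint of a tensor's contents."""
    return hashlib.sha256(np.ascontiguousarray(arr).tobytes()).hexdigest()

def grad_step(weights: np.ndarray, grad: np.ndarray, lr: float = 0.1) -> np.ndarray:
    """Stand-in for one deterministic optimizer step."""
    return weights - lr * grad

w = np.ones(4, dtype=np.float64)
g = np.full(4, 0.5)

a = grad_step(w.copy(), g)
b = grad_step(w.copy(), g)            # redundant recomputation
assert checksum(a) == checksum(b)     # healthy hardware: results agree exactly

# Simulate a hardware bit flip corrupting one replica's result by the
# smallest representable amount -- the checksum still catches it:
corrupted = a.copy()
corrupted[2] = np.nextafter(corrupted[2], 1.0)
print(checksum(a) == checksum(corrupted))  # False -> corruption detected
```

The point of hashing rather than eyeballing loss curves is that even a single flipped mantissa bit, far too small to show up in metrics, produces a different digest.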
It was the first time I'd seen a model resemble a competent generalist. As initial results rolled in, we started trying things we thought there would be no way it would have learned zero-shot: article summarization, question answering. The model gave reasonable answers!
OpenAI doubled down on its no-muppet strategy with a follow up in February, GPT-2. It generates impressively fluid text. (See: ). China's Tsinghua University chose to sustain the trend, with Enhanced Language Representation with Informative Entities: ERNIE.