Hi I’m Eric. I’m 19.
I immigrated to the US 14 years ago. Lived in the NYC projects for 11 years.
This summer, I’ll be going to Silicon Valley to work at
@zoox
on driving tools for autonomous vehicles.
Today was my first day :)
Today I generated 3d from text.
- Trained a diffusion transformer on a subset of ModelNet10
- Got tensorboard working
Kinda blows my mind this even works.
I’ll add an lr scheduler (cosine annealing with warmup) and send it on the full dataset
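In case it's useful, here's a minimal sketch of what I mean by cosine annealing with warmup (the optimizer, step counts, and lr below are placeholders, not my actual training config):

```python
import math
import torch

# Minimal sketch: linear warmup followed by cosine decay (numbers are made up).
def make_scheduler(optimizer, warmup_steps=500, total_steps=100_000):
    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)                  # warmup: 0 -> 1
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))       # cosine: 1 -> 0
    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# usage (model/optimizer are placeholders)
model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = make_scheduler(optimizer)
# call scheduler.step() once per training step
```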
Doing a deep dive into diffusion models. I’ll start by diffusing sine and cosine waves, then work my way up to building a text-to-3d model from scratch.
Documenting everything in a series of jupyter notebooks. Let’s go.
Inspired by
@MajmudarAdam
Implemented EMA for VQVAE
This improved training stability a ton. For the version w/o EMA, the loss would spike in the beginning and gradually increase in the middle of the run. That doesn't happen anymore.
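For context, this is roughly the EMA-style codebook update I mean (a sketch following the standard VQ-VAE EMA formulation; variable names are my own, not the exact code):

```python
import torch

# z_e: encoder outputs flattened to (N, D); codebook: (K, D) embeddings.
# encodings_onehot: (N, K) one-hot nearest-code assignments.
def ema_update(codebook, ema_cluster_size, ema_embed_sum, z_e, encodings_onehot,
               decay=0.99, eps=1e-5):
    with torch.no_grad():
        # per-code counts and summed vectors for this batch
        cluster_size = encodings_onehot.sum(0)            # (K,)
        embed_sum = encodings_onehot.t() @ z_e            # (K, D)

        ema_cluster_size.mul_(decay).add_(cluster_size, alpha=1 - decay)
        ema_embed_sum.mul_(decay).add_(embed_sum, alpha=1 - decay)

        # Laplace smoothing so rarely-used codes don't divide by zero
        n = ema_cluster_size.sum()
        smoothed = (ema_cluster_size + eps) / (n + codebook.size(0) * eps) * n
        codebook.copy_(ema_embed_sum / smoothed.unsqueeze(1))
```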
i learned about RNNs and implemented one from scratch in numpy.
forward pass, cross entropy loss, backprop through time, gradient clipping, gradient descent, and sampling.
repo:
more details below
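A rough picture of what the numpy version looks like (illustrative sketch with made-up sizes, not the repo code):

```python
import numpy as np

V, H = 27, 64                       # vocab size, hidden size (made up)
Wxh = np.random.randn(H, V) * 0.01
Whh = np.random.randn(H, H) * 0.01
Why = np.random.randn(V, H) * 0.01
bh, by = np.zeros(H), np.zeros(V)

def step(x_idx, h):
    x = np.zeros(V); x[x_idx] = 1.0                     # one-hot input
    h = np.tanh(Wxh @ x + Whh @ h + bh)                 # hidden state update
    logits = Why @ h + by
    p = np.exp(logits - logits.max()); p /= p.sum()     # softmax
    return p, h

def sample(seed_idx, n):
    h, idx, out = np.zeros(H), seed_idx, []
    for _ in range(n):
        p, h = step(idx, h)
        idx = np.random.choice(V, p=p)                  # sample next token
        out.append(idx)
    return out
```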
My FIRST time training a text-conditional model.
This is a tiny diffusion transformer that generates sine / cosine waves from text.
Kinda magical that this even works. Will get this working on 3d soon.
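The toy setup, roughly: the "text" is just a label picking sine vs cosine, and the wave is the target the model learns to denoise into (shapes here are made up):

```python
import numpy as np

# Hypothetical data generator for the toy task: label 0 -> sine, label 1 -> cosine.
def make_batch(batch_size=32, length=64):
    t = np.linspace(0, 2 * np.pi, length)
    labels = np.random.randint(0, 2, size=batch_size)
    waves = np.where(labels[:, None] == 0, np.sin(t), np.cos(t))
    return labels, waves.astype(np.float32)
```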
Here’s what I’ve been hacking on — a tiny text-to-3d diffusion model. This was a challenge to learn diffusion and build something cool.
Some notes and some demos :)
Prompt: "chair"
Kicked off VQVAE run
- loss is unstable in the beginning, probs cuz it's trying to best use the codebook
- perplexity going up
- recon on test set looks pretty good
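Quick note on what I mean by perplexity here: exp of the entropy of the average code-usage distribution over a batch, so higher means more of the codebook is actually being used. A sketch, assuming one-hot code assignments:

```python
import torch

def codebook_perplexity(encodings_onehot):               # (N, K) one-hot assignments
    avg_probs = encodings_onehot.float().mean(dim=0)      # (K,) average usage
    entropy = -(avg_probs * torch.log(avg_probs + 1e-10)).sum()
    return torch.exp(entropy)                             # in [1, K]
```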
Just ran into a segmentation fault in C++. Ended up being an issue w/ dangling pointers.
How in the world do u guys even debug this? No stack trace. No nothing. Feels like driving blind.
1. Starting my journey of learning AI. Reading a lot of papers, writing and training models, and one day working at the cutting edge. From scratch and first principles.
Will I suck? Most definitely. I just hope to suck less with every passing day.
Inspired by
@Suhail
and
@karpathy
.
Rendering some cylinders rn.
Once again, high school math (solving quadratics with the quadratic formula) comes in handy. Computer graphics just makes so much sense intuitively.
1. Hollow cylinders
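The quadratic-formula bit, roughly: for a ray o + t·d hitting an infinite cylinder of radius r around the y-axis, you solve a*t^2 + b*t + c = 0 for t. A sketch (not my raytracer code, and it skips the end caps and the hollow case):

```python
import math

def hit_cylinder(o, d, r):
    # o, d are (x, y, z) tuples: ray origin and direction
    a = d[0]**2 + d[2]**2
    b = 2.0 * (o[0]*d[0] + o[2]*d[2])
    c = o[0]**2 + o[2]**2 - r*r
    if a == 0:
        return None                      # ray parallel to the cylinder axis
    disc = b*b - 4*a*c
    if disc < 0:
        return None                      # ray misses the cylinder
    t = (-b - math.sqrt(disc)) / (2*a)   # nearer root
    return t if t > 0 else None
```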
Found a bug in the VQVAE where the decoder was not expressive enough. I had passed the wrong channel number.
Anyhow, the quality of reconstructed images improved a ton after the fix, and interpolations look pretty sick.
ok I figured out why my diffusion model was doing so badly during the big training run
during validation, the guidance scale was set to 1, which isn't enough conditioning strength. I upped it to 2.5 and it works.
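What the guidance scale is doing, as I understand it (classifier-free guidance; the model call here is a placeholder, not my code):

```python
def guided_noise_pred(model, x_t, t, cond, guidance_scale=2.5):
    eps_cond = model(x_t, t, cond)        # conditioned noise prediction
    eps_uncond = model(x_t, t, None)      # unconditioned (null prompt) prediction
    # scale = 1.0 collapses to the conditional prediction alone;
    # > 1 pushes the sample harder toward the condition.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```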
Teaching myself AI by reading the original papers and training neural nets from scratch.
It's painstakingly difficult. But on the few days when I get something to work, it's absolutely magical. This rather "trivial" result by today's standards still blows me away.
This is the year I become a consistent reader.
Smart people have written their life learnings and experiences in a form absorbable in a couple hours. Why haven’t I gotten started absorbing?
Here’s to reading my first book of 2024.
@0xluffyb
For the longest time, I associated research with academia which I did not want to do.
Then I found out it's just reading, implementing stuff, and making changes to what you've built.
This is fun and not scary.
today i moved to SF. people come here to build the future. i guess i’m just one of them.
ai is what i will work on. it’s unbelievably exciting. more than ever it’s time to build
i hope to make lifelong friends here. anybody want to meet :)
My learnings from ImageNet Classification with Deep Convolutional Neural Networks.
1. Architecture: 5 convolutional layers followed by 3 feedforward layers and a softmax
2. CNNs are faster to train b/c of fewer params and connections (compared to similarly sized feedforward nets)
(Cont.)
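For reference, here's roughly that 5-conv / 3-FC shape as a scaled-down sketch (channel counts and input size are made up, not the paper's):

```python
import torch.nn as nn

# Hypothetical mini AlexNet-style net for small 3x32x32 images.
class MiniAlexNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, num_classes),   # softmax is applied via cross-entropy loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```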
Kicking off text-to-3d training on modelnet40.
153,838,596-param DiT (most of this is CLIP lol)
100k steps
lr=1e-3
cosine annealing w/ 500 warmup steps
batch_size=32
cosine noise schedule
Giving it some pretty out of domain prompts to validate against.
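The cosine noise schedule, as I'm computing it (a sketch following the improved-DDPM formulation as I understand it; s is the small offset from that paper):

```python
import math
import torch

def cosine_alpha_bar(T=1000, s=0.008):
    t = torch.linspace(0, T, T + 1)
    f = torch.cos(((t / T) + s) / (1 + s) * math.pi / 2) ** 2
    alpha_bar = f / f[0]                              # cumulative signal fraction
    betas = 1 - alpha_bar[1:] / alpha_bar[:-1]        # per-step noise
    return alpha_bar[1:], betas.clamp(max=0.999)
```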
Training VQVAE2, a hierarchy of VQVAEs. The top level latent captures global info and the bottom level latent captures local details.
Couple steps into training and the recon quality is amazing.
There's a big difference b/w what I know and what I think I should know. Especially in AI.
Gonna religiously dedicate time to becoming good.
Focus this week. Transformers.
4. Reviewed backpropagation with this awesome video from
@3blue1brown
The high-level idea is how a tiny nudge in a neuron's weights and biases affects the network's output. We calculate this with the chain rule.
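The "tiny nudge" idea, made concrete with a toy one-weight example (comparing the chain-rule gradient against a finite-difference nudge; numbers are made up):

```python
# Toy loss: y = (w*x - target)^2
x, target, w = 2.0, 10.0, 3.0

def loss(w):
    return (w * x - target) ** 2

analytic = 2 * (w * x - target) * x                 # chain rule: dL/dw = 2(wx - t) * x
h = 1e-5
numeric = (loss(w + h) - loss(w - h)) / (2 * h)     # nudge w a tiny bit each way
print(analytic, numeric)                            # the two should agree closely
```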
@sam_postelnik
Sam marketed the hell out of this. Big reason why we got to 5,000+ users.
Anybody looking for a product person who can code,
@sam_postelnik
is the person :)
Sitting down for an extended period of time to deeply understand a concept is sooo invaluable.
Been too accustomed to FAST and not struggling.
A sign I need to embrace doing hard things more often.
After banging my head for a week, I finally got mini AlexNet (what I called my scaled-down version) to output something “intelligent” on the test set.
Moving on to creating a web interface to further test how this model performs on unseen data.
Listening to
@naval
podcast on “How to Get Rich” every morning on my commute.
Learning foundations (ai, math) and becoming the type of person to attract luck (twitter, making friends with interesting ppl … etc) is gonna be the core focus of my summer.
@KrishivThakuria
Had a conversation about this today. Once you realize u can do anything, it’s quite empowering.
Takes a bit of rewiring to see this. Time to build :)
@sam_postelnik
@JosephKChoi
Love this. Andreessen’s perspective on career as a portfolio of jobs/opportunities is gold.
Makes u think more long term and willing to take risks
Hey
@scale_AI
I applied to the July 15 GenAI hackathon and this is why you shouldn’t accept me.
I used Gen AI for evil. Created a lil ai product w/
@sam_postelnik
called FGenEds (short for “fuck gen eds”) to help college students study less.
The entire field of probability stems from 3 axioms. A shit load of math from JUST 3 axioms.
What if the key to building a shit load of intelligence into LLM agents is simply just finding the right set of primitives (tools)?
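The three axioms I'm referring to (Kolmogorov's), for reference:

```latex
% Kolmogorov's axioms, for a probability measure P on a sample space \Omega:
\begin{align}
& P(A) \ge 0 \quad \text{for every event } A \\
& P(\Omega) = 1 \\
& P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
  \quad \text{for pairwise disjoint } A_1, A_2, \dots
\end{align}
```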
Created my first "hello-world" agent from scratch to answer questions about the weather. Such a simple and trivial use case but I'm fucking blown away 🤯.
Excited to continue building
Lots of help from the ReAct paper
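A rough sketch of the loop (everything here is hypothetical: `llm` stands in for whatever completion function you use, and the weather tool is faked):

```python
def get_weather(city: str) -> str:
    return f"72F and sunny in {city}"        # stand-in for a real weather API call

TOOLS = {"get_weather": get_weather}

def run_agent(llm, question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask the model to reason, then either call a tool or give a final answer.
        reply = llm(transcript + "Thought/Action/Final Answer:")
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        if "Action:" in reply:
            # expected format: "Action: get_weather(New York)"
            name, arg = reply.split("Action:")[-1].strip().rstrip(")").split("(", 1)
            observation = TOOLS[name.strip()](arg.strip())
            transcript += f"{reply}\nObservation: {observation}\n"
    return "gave up"
```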
Implemented YOLOv1's loss function in Pytorch.
Recurring lesson for me: everything's fucking hard until you do it :)
Moving on to writing the model, training it, and figuring out whatever non-maximum suppression, mAP, and blah blah is
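For future me, here's roughly what non-maximum suppression does (a sketch assuming boxes as [x1, y1, x2, y2] with scores; the threshold is a guess):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    order = np.argsort(scores)[::-1]          # highest-scoring boxes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)                        # keep the best remaining box
        if order.size == 1:
            break
        rest = order[1:]
        # IoU between box i and all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou < iou_thresh]        # drop boxes that overlap too much
    return keep
```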
Reviewing every god damn thing I learnt in ML
Will also need to go back to studying foundations - stats, probability, linear algebra
All in on AI. Day 1
1. Denoising Diffusion Probabilistic Models
Noise an image and train a network to "denoise" it.
Questions after reading this
- where did the math come from?
- where did the math come from?
- where did the math come from?
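The core forward-process math, as far as I can tell, boils down to one line: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. A sketch of that noising step (standard DDPM form, not my exact code):

```python
import torch

def forward_noise(x0, t, alpha_bar):
    # x0: clean images (B, C, H, W); t: per-sample timestep indices (B,)
    eps = torch.randn_like(x0)
    a = alpha_bar[t].view(-1, *([1] * (x0.dim() - 1)))   # broadcast over image dims
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps
    return x_t, eps        # the network is trained to predict eps from (x_t, t)
```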
#include is pretty much like an import statement.
Need to figure out what a header file is.
Will report back after I get my first Raytraced scene.
Onto implementing matrix operations.