jack morris

@jxmnop

Followers 12,072 · Following 825 · Media 388 · Statuses 3,353

getting my phd in nlp @cornell_tech 🚠 // academic optimist // tweeting from the snack aisle at trader joes

San Francisco, CA
Joined October 2016
Pinned Tweet
@jxmnop
jack morris
2 months
yearly reminder everything looks exponential from the middle of a sigmoid
Tweet media one
69
378
5K
@jxmnop
jack morris
7 months
now seems as good a time as ever to remind people that the biggest breakthroughs at OpenAI came from a previously unknown researcher with a bachelors degree from olin college of engineering
Tweet media one
Tweet media two
Tweet media three
Tweet media four
63
530
6K
@jxmnop
jack morris
2 years
apparently deep learning models can tell male from female eyeballs with 87% accuracy but no one knows why: "Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task"
Tweet media one
101
700
5K
@jxmnop
jack morris
5 months
today i found out that this one australian guy has been toiling away making incredibly detailed Neural Circuit Diagrams with the vibe of a 1950s issue of Popular Mechanics, but content fit for the 2020s. behold: the Transformer
Tweet media one
24
645
5K
@jxmnop
jack morris
11 days
harvey dent Was right
Tweet media one
@aaronpholmes
aaron holmes
12 days
New: Sam Altman has told shareholders that OpenAI is considering becoming a for-profit company that would no longer be controlled by a nonprofit board
352
369
3K
140
579
5K
@jxmnop
jack morris
4 months
2022: - oh nooo!!! you can't run language models on cpu! you need an expensive nvidia GPU and special CUDA kernels and– - *one bulgarian alpha chad sits down and writes some c++ code to run LLMs on cpu* - code works fine (don't need a GPU), becomes llama.cpp 2023: - oh noo!!
71
332
4K
@jxmnop
jack morris
6 months
how I got my first job at google > be me > go to college > great state school, smallish CS program > google does not recruit here > that’s ok > no worries > build personal website > start working on cool blog post for site > write some python code for blog > confused about
68
101
4K
@jxmnop
jack morris
17 days
*breathes deeply* ahhhh yes... the boundary of all human knowledge
Tweet media one
19
277
4K
@jxmnop
jack morris
3 months
navigated to a file to write a helper function, but the code was already there, written by me, exactly where i was about to write it
Tweet media one
9
126
3K
@jxmnop
jack morris
2 months
one of the most important things I know about deep learning I learned from this paper: "Pretraining Without Attention" this is what I found so surprising: these people developed an architecture very different from Transformers called BiGS, spent months and months optimizing it and
Tweet media one
94
420
3K
@jxmnop
jack morris
3 months
what software was this made with? i don't think you can draw arrows that curve like that w/ Google Drawings
Tweet media one
212
69
3K
@jxmnop
jack morris
3 months
okay what 99.99% chance there's a bug in my code, but 0.01% chance i just solved text retrieval
Tweet media one
65
34
2K
@jxmnop
jack morris
16 days
here is my meticulously curated (and highly biased) summer paper reading list 📚: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020) ╰╴ LoRA: Low-Rank Adaptation of Large Language Models (2021) ╰╴ Ring
@jxmnop
jack morris
17 days
*breathes deeply* ahhhh yes... the boundary of all human knowledge
Tweet media one
19
277
4K
28
269
2K
@jxmnop
jack morris
10 months
An amazing mystery of machine learning right now is that state-of-the-art vision models are ~2B parameters (8 gigabytes) while our best text models are ~200B parameters (800 gb) why could this be? philosophically, are images inherently less complicated than text? (no right?)
372
120
2K
@jxmnop
jack morris
4 months
> be google researcher > exec says we need to beat mistral 7b > “reclaim the narrative" > train 7b model > run eval. check plots > cant do it. maybe with 8b > exec says no dice > “we'll look like chumps" > ok > maybe we can tie embeddings and linear_f, save some params > not
Tweet media one
24
93
2K
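The "tie embeddings and linear_f" idea in the greentext above is standard weight tying: share one matrix between the input embedding and the output projection so it is only stored once. A minimal generic sketch, not anyone's actual model code:

```python
# weight tying in miniature: the LM head reuses the embedding matrix,
# so a (vocab_size x d_model) block of parameters is stored only once
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size: int = 32000, d_model: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)
        self.lm_head.weight = self.embed.weight  # tied: both layers share one matrix

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        h = self.embed(token_ids)  # ...transformer blocks would sit here...
        return self.lm_head(h)     # logits over the vocabulary

model = TinyLM()
print(sum(p.numel() for p in model.parameters()))  # the shared matrix is counted once
```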
@jxmnop
jack morris
3 months
New Research: a lot of talk today about "what happens" inside a language model, since they spend the exact same amount of compute on each token, regardless of difficulty. we touch on this question in our new theory paper, Do Language Models Plan for Future Tokens?
Tweet media one
21
142
1K
@jxmnop
jack morris
9 months
finally published our latest research on text embeddings! TLDR: Vector databases are NOT safe. 😳 Text embeddings can be inverted. We can do this exactly for sentence-length inputs and get very close with paragraphs...
Tweet media one
44
181
1K
@jxmnop
jack morris
8 months
tired of paying OpenAI for GPT-4 API? the NYC Department of Small Business Services has your back! the NYC small business chatbot is powered by GPT-4, so it's equally capable; you just have to ask it for information about operating a business in New York City first
Tweet media one
Tweet media two
Tweet media three
31
105
1K
@jxmnop
jack morris
5 months
when my code is getting too slow, i just run it a couple times and ctrl-C on the slow part, see where in my code it stops me. i call it the poor man's profiler
45
46
1K
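The "poor man's profiler" above, as a minimal sketch: the workload here is a made-up stand-in; run it, hit ctrl-C during the slow part, and the traceback tells you where the time is going.

```python
import traceback

def slow_work():
    # hypothetical stand-in for whatever code is actually slow
    total = 0
    for i in range(10**9):
        total += i * i
    return total

if __name__ == "__main__":
    try:
        slow_work()
    except KeyboardInterrupt:
        # the traceback points at whatever line was executing when ctrl-C landed,
        # which (probabilistically) is where the program spends most of its time
        traceback.print_exc()
```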
@jxmnop
jack morris
7 months
say what you will about mistral, tweeting exclusively download links to new models with no context is unbelievably cool
Tweet media one
20
61
1K
@jxmnop
jack morris
7 months
Honestly the Gemini release made me really sad. “Gemini beats GPT-4 at 32-shot COT” (read: not straight up) is exactly what a PhD student would say if their model wasn’t as good as they’d hoped what’s so special about GPT-4? is it people? systems? data? some kind of blind luck?
76
30
1K
@jxmnop
jack morris
10 months
curious if anyone knows where Google went wrong with TensorFlow? it's bad software, fundamentally broken. when I was an AI resident I found ~5 bugs within tensorflow core in around a year. but how does a failure like this happen? so many smart people work there.
Tweet media one
140
58
1K
@jxmnop
jack morris
10 months
unpopular opinion: training open-source LLMs is a losing battle. a complete dead end the gap between closed models like GPT-4 and open models like LLAMA will only continue to widen as models grow bigger and require more resources no one is building particle colliders at home
257
55
950
@jxmnop
jack morris
6 months
ten years of deep learning research, a summary
Tweet media one
10
83
941
@jxmnop
jack morris
7 months
fewer than 100 people deeply understand both (i) transformers and (ii) the GPU programming model want to learn machine learning? gain some esoteric systems knowledge; spend some time really learning CUDA
60
26
923
@jxmnop
jack morris
5 months
class tonight. send me your most scrutable visualizations of Transformers
Tweet media one
Tweet media two
Tweet media three
Tweet media four
40
64
885
@jxmnop
jack morris
4 years
Introducing TextAttack: a Python framework for adversarial attacks, data augmentation, and model training in NLP. Train ***any @huggingface transformer*** (BERT, RoBERTa, etc) on ***any @huggingface nlp classification/regression dataset*** in a single command.
Tweet media one
Tweet media two
12
213
875
@jxmnop
jack morris
1 year
things always look exponential when you’re standing in the middle of a sigmoid
Tweet media one
30
73
848
@jxmnop
jack morris
2 years
one day the history books will write about the role this man played in the development of AGI
Tweet media one
13
64
845
@jxmnop
jack morris
6 months
Seen a lot of evidence that GPT-4 crushes Gemini on all the head-to-head LM benchmarks. here's my theory about what went wrong: - chatGPT released - google execs freak out - google consolidates (integrates Deepmind, Brain) - google assembles a giant team to build a single giant
Tweet media one
45
41
804
@jxmnop
jack morris
8 months
I turned down a job at google research to do a PhD at Cornell right before chatGPT came out and I don’t regret it at all. I see it like this. Do you want to work with a large group on building the fastest & fanciest system in the world, or in a small group testing crazy theories
@sshkhr16
Shashank Shekhar
8 months
As PhD applications season draws closer, I have an alternative suggestion for people starting their careers in artificial intelligence/machine learning: Don't Do A PhD in Machine Learning ❌ (or, at least, not right now) 1/4 🧵
36
55
514
13
26
796
@jxmnop
jack morris
6 months
fun research idea: Latent chain-of-thought / Latent scratchpad it's well-known that language models perform better when they generate intermediate reasoning tokens through some sort of 'scratchpad'. but there's no reason scratchpad tokens need to be human-readable. in fact,
Tweet media one
52
73
797
@jxmnop
jack morris
2 months
move over meta, the true biggest benefactor of open source machine learning is CHANEL
@aj_kourabi
AJ
2 months
TIL scikit-learn, an open-source ML library, has only one Platinum sponsor and it is ... Chanel?
Tweet media one
41
457
6K
5
41
792
@jxmnop
jack morris
4 months
if i won the lottery, i wouldn't tell anybody but there would be signs
Tweet media one
Tweet media two
19
49
771
@jxmnop
jack morris
5 months
you’re a real NLP person if you remember that Hugging Face started as an AI girlfriend chatbot company and huggingface/transformers used to be called pytorch-pretrained-bert
22
35
739
@jxmnop
jack morris
5 months
if you find yourself spending too much time deliberating what to name your company, just remember there's a $4.5 BILLION dollar startup with the name "Hugging Face"
20
45
718
@jxmnop
jack morris
4 months
people spent years optimizing GANs before realizing that diffusion models were simpler and better people spent years developing RLHF before realizing that DPO is simpler and better what are we working on rn? i want to find the simpler and better version and work on that instead
37
50
710
@jxmnop
jack morris
7 months
the saga of a phd student > last week, have ground breaking research idea (so i think, anyway) > two days ago, write sloppy prototype code for idea > yesterday, run experiments, collect results > wow, results surprisingly good > this is so cool, that was easy > i love grad
17
6
686
@jxmnop
jack morris
2 months
you're telling me an 8B param model was trained on fifteen trillion tokens? i didn't even know there was that much text in the world really interesting to see how scaling laws have changed best practices; GPT-3 was 175 billion params and trained on a paltry 300 billion tokens
Tweet media one
41
36
682
@jxmnop
jack morris
4 months
implemented a fast, GPU-enabled BM25 in pytorch! BM25 is a simple search algorithm from the 70s that works as well as neural networks for most search problems; for all the advances we've made in neural text retrieval, it's still around got near SOTA on stanford LoCO benchmark
Tweet media one
9
62
649
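Not the implementation mentioned in the tweet, just a toy sketch of why BM25 maps well to a GPU: the scoring collapses into dense tensor operations.

```python
# toy BM25 scoring as dense PyTorch ops (a sketch, not the tweeted implementation)
import torch

def bm25_scores(tf: torch.Tensor, query_mask: torch.Tensor,
                k1: float = 1.5, b: float = 0.75) -> torch.Tensor:
    """tf: (n_docs, vocab) term counts; query_mask: (vocab,) 1.0 for query terms."""
    n_docs = tf.shape[0]
    doc_len = tf.sum(dim=1, keepdim=True)                 # (n_docs, 1)
    avgdl = doc_len.mean()
    df = (tf > 0).sum(dim=0)                              # document frequency per term
    idf = torch.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    denom = tf + k1 * (1 - b + b * doc_len / avgdl)
    per_term = idf * (tf * (k1 + 1)) / denom              # (n_docs, vocab)
    return (per_term * query_mask).sum(dim=1)             # one score per document

# 3 toy documents over a 5-word vocabulary; the query uses terms 0 and 3
tf = torch.tensor([[2., 0., 1., 0., 0.],
                   [0., 1., 0., 3., 0.],
                   [1., 1., 1., 1., 1.]])
query = torch.tensor([1., 0., 0., 1., 0.])
print(bm25_scores(tf, query))
```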
@jxmnop
jack morris
6 months
people keep saying AI is moving so fast. some days I agree, but some days I'm not sure – so many papers published, but I don't feel like we're making that many fundamental breakthroughs. to cap off 2023, here's a list of things we still don't know about language models: - how
26
65
628
@jxmnop
jack morris
6 months
fun research story about how we jailbroke the chatGPT API: so every time you run inference with a language model like GPT-whatever, the model outputs a full probability distribution over its entire vocabulary (~50,000 tokens) but when you use their API, OpenAI hides all this info from
17
65
610
@jxmnop
jack morris
7 months
cool research idea for someone: text diffusion in embedding space solve any sequence-to-sequence task in three steps: 1. embed source sentence text 2. build diffusion model that maps input text embedding to target text embedding 3. invert to produce target text
Tweet media one
27
57
610
@jxmnop
jack morris
3 months
since i got into ML in 2017 or so there have been exactly three major advancements: 1. web-scale language model pretraining (OpenAI, 2020) 2. diffusion models (OpenAI, 2020) 3. language model refinement via human feedback (OpenAI, 2022) what’s next? it’s been a while
68
23
607
@jxmnop
jack morris
4 months
achievement unlocked: OpenAI changes its API because of your research ✅ you can still get logits from GPT-4 by using the argmax bisection method detailed in Language Model Inversion. it's pretty expensive, though we should have stockpiled logits while we had the chance...
Tweet media one
11
36
603
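A hedged sketch of the argmax-bisection idea: binary-search the smallest logit_bias that makes a target token win the argmax; that bias approximates how far the token's logit sits below the current top logit. `query_argmax` is a hypothetical wrapper around a constrained API, not a real client call.

```python
# sketch of argmax bisection; query_argmax(prompt, bias_dict) -> argmax token id
# is a hypothetical helper for an API that accepts a logit_bias but hides logits
def estimate_logit_gap(query_argmax, prompt: str, target_token: int,
                       lo: float = 0.0, hi: float = 100.0, iters: int = 20) -> float:
    """Approximate (top_logit - target_logit) by bisecting on the added bias."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if query_argmax(prompt, {target_token: mid}) == target_token:
            hi = mid   # this much bias is enough; try less
        else:
            lo = mid   # not enough bias yet; try more
    return (lo + hi) / 2
```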
@jxmnop
jack morris
5 months
I'm teaching a little workshop this semester at cornell tech called Practical Deep Learning! it's supposed to be an intro to training neural networks and debugging them, in a modern context this is the schedule for the course. will post content here:
Tweet media one
16
68
600
@jxmnop
jack morris
2 years
Diffusion is just an easy-to-optimize way to give neural networks adaptive computation time. Makes sense then that diffusion models beat GANs, which only get one forward pass to generate an image. have to wonder what other ways there are to integrate for loops into NNs...
24
49
602
@jxmnop
jack morris
4 months
github lets you have multiple READMEs now. actually a really nice new feature
4
35
557
@jxmnop
jack morris
4 months
daily deep learning workout: train two CNNs and three RNNs. perform ten minutes of quantized LLM transformer inference. write CUDA kernels until failure
19
40
550
@jxmnop
jack morris
5 months
hyperbolic embeddings are insane
Tweet media one
19
51
551
@jxmnop
jack morris
1 year
Just uploaded a dataset to huggingface of every line from every episode of "The Office" (US). There are ~60k lines of text, each annotated with season/episode/scene as well as the name of the speaker. Excited to see what people do with it!! ☕️ 👔🗒️
Tweet media one
Tweet media two
26
39
535
@jxmnop
jack morris
3 months
this guy works for Boeing
@Yampeleg
Yam Peleg
3 months
@jxmnop Unpopular opinion: lower quality but faster code is actually much better code
8
0
25
12
12
518
@jxmnop
jack morris
10 months
if floating-point math was associative we would have had AGI back in 2016
Tweet media one
27
31
501
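The non-associativity in question, in two lines of Python:

```python
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0 -- the 1.0 vanishes: it is below the float spacing near 1e16
```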
@jxmnop
jack morris
7 months
my first ever outstanding paper award 😁😁
Tweet media one
Tweet media two
40
16
514
@jxmnop
jack morris
6 months
i used to find this kind of thing intimidating - ML is so popular, competitive, etc but I realized how different every paper is. each presenter here traveled down one single esoteric rabbit hole high-dimensional spaces are so sparse; there's still plenty of room for your ideas
@BramGrooten
Bram Grooten
7 months
Poster sessions @NeurIPSConf
17
108
897
11
20
505
@jxmnop
jack morris
5 months
visiting some mathematician friends who work on language models and i'm struck by how different the lifestyle is. their workspaces are mountains of papers surrounded by endless blackboards. they talk of mutual info, topological theory, PAC bounds. i am completely covered in chalk
14
19
488
@jxmnop
jack morris
3 months
Diffusion Lens is a pretty neat new paper, you can see a text-to-image encoder's representation of a giraffe getting less and less abstract with every layer 🦒
Tweet media one
5
55
493
@jxmnop
jack morris
5 months
if you're doing research on text embeddings, you know that there are lots of tricks required to train a good model, and no open datasets. we trained Nomic Embed, a great text embedding model, and actually released the data! certainly will make my research easier – check it out!
@nomic_ai
Nomic AI
5 months
Introducing Nomic Embed - the first fully open long context text embedder to beat OpenAI - Open source, open weights, open data - Beats OpenAI text-embedding-3-small and Ada on short and long context benchmarks - Day 1 integrations with @langchain , @llama -index, @MongoDB
38
274
2K
4
39
463
@jxmnop
jack morris
5 months
I think I found the first evidence of a collision in text embeddings! i.e. for an embedding function f, i found a text pair x, y where x ≠ y but f(x) = f(y) ok granted it isn't thatttt interesting but what I did was repeat the word purple from 1 to 8191 times. 'purple ' is
Tweet media one
Tweet media two
Tweet media three
Tweet media four
22
34
463
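A sketch of how one might reproduce the check described above. It assumes an OpenAI API key in the environment; the model name and repeat counts simply mirror the tweet's 8191-token setup.

```python
# sketch only: compare embeddings of "purple " repeated different numbers of times
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

reference = embed("purple " * 8191)  # right at the model's context limit
for k in (1024, 2048, 4096, 8190):
    print(k, cosine(embed("purple " * k), reference))
```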
@jxmnop
jack morris
3 years
just remembered that a neural network (CLIP) actually made this mistake lol
Tweet media one
16
48
446
@jxmnop
jack morris
3 months
ok before one of you tries to assassinate me for building AGI, i figured out the bug. at least it's an interesting one 🥲☔️ so we're doing contrastive learning, which is a matching task between (query, document) pairs query 1 matches to document 1, query 2 matches to document
Tweet media one
@jxmnop
jack morris
3 months
okay what 99.99% chance there's a bug in my code, but 0.01% chance i just solved text retrieval
Tweet media one
65
34
2K
29
19
453
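For context, the in-batch contrastive setup the tweet describes usually looks like this: query i should match document i, and every other document in the batch acts as a negative. A generic sketch, not the code that had the bug.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q: torch.Tensor, d: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """q, d: (batch, dim) L2-normalized query / document embeddings."""
    logits = q @ d.T / temperature                        # (batch, batch) similarities
    labels = torch.arange(q.shape[0], device=q.device)    # query i matches document i
    return F.cross_entropy(logits, labels)

q = F.normalize(torch.randn(8, 128), dim=-1)
d = F.normalize(torch.randn(8, 128), dim=-1)
print(in_batch_contrastive_loss(q, d))
```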
@jxmnop
jack morris
4 months
random research idea: Latent Text Transformer (LTT) in a nutshell: replace the sequence of *vectors* that makes up the transformer's hidden states with a sequence of *tokens*, so we can read the model's "thoughts" directly 🤖 then train a transformer that uses longer sequences of *discrete
Tweet media one
20
57
449
@jxmnop
jack morris
2 months
i think people have generally taken this blog post (The Bitter Lesson by Richard Sutton) far too seriously. some reminders: - you CAN come up with clever ideas and show that they work without a bazillion gpus - you CAN build useful systems at the 100M parameter scale - if you
@polynoamial
Noam Brown
2 months
I wish every grad student in AI would read The Bitter Lesson
23
53
476
13
32
449
@jxmnop
jack morris
1 year
• hurt my leg last week • got an MRI • received email with radiologist's report • too much medical terminology, totally inscrutable, could not read • fed radiology report to chatGPT • asked chatGPT to explain report in plain English • found out i broke my leg from chatGPT
16
8
443
@jxmnop
jack morris
7 months
fun idea I tested out this morning: Language model fine-tuning in embedding space here's the idea: learn a model of *embeddings* of a certain text distribution; then, to generate text, sample embedding and map back to text with vec2text this lets us generate language without
Tweet media one
Tweet media two
24
36
431
@jxmnop
jack morris
4 months
class tonight – send me your best visualizations of multimodal embeddings & architectures (especially CLIP)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
11
32
424
@jxmnop
jack morris
3 months
is anyone still using encoder-decoder models (T5, BART, etc.)? if so -- for what?
94
18
424
@jxmnop
jack morris
4 months
project request: eigenfaces, but for text do PCA on text embeddings from a particular domain and find the "eigentext" embeddings that represent that domain then project back to text with vec2text. what would the eigentexts look like? surprised no one has done this yet
Tweet media one
34
30
419
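A minimal sketch of the PCA half of the idea; the documents and model name are stand-ins, and the invert-back-to-text step (e.g. with vec2text) is only noted in a comment.

```python
# sketch: PCA over in-domain sentence embeddings to get candidate "eigentext" vectors
import numpy as np
from sklearn.decomposition import PCA
from sentence_transformers import SentenceTransformer

docs = [  # stand-in documents from one domain
    "the fed raised interest rates again this quarter",
    "treasury yields climbed after the inflation report",
    "the central bank signaled a pause in rate hikes",
]

X = SentenceTransformer("all-MiniLM-L6-v2").encode(docs)  # (n_docs, dim)
pca = PCA(n_components=2).fit(X)
eigentexts = pca.components_  # (2, dim): the "eigentext" embeddings for this domain

# to read them, each component would be inverted back to text with an embedding
# inverter such as vec2text (matched to the embedding model used above)
print(eigentexts.shape, pca.explained_variance_ratio_)
```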
@jxmnop
jack morris
6 months
there's something sinister about these two sharing a stage—crossover episode featuring two of the most prominent pseudointellectual grifters. they manage to emulate the structure of intelligent conversation while really saying nothing at all. hoping for better role models in 2024
@lexfridman
Lex Fridman
6 months
Here's my conversation with Guillaume Verdon ( @GillVerd ) aka Beff Jezos ( @BasedBeffJezos ), a physicist, quantum computing researcher, and founder of e/acc (effective accelerationism) movement that advocates for rapid technological progress, physics-based reasoning, and memes.
383
549
4K
30
21
402
@jxmnop
jack morris
5 months
wrote some faster dataloading code for HuggingFace datasets – sped up datasets.load_from_disk() from 17 minutes to around 30s the problem I was running into is that our virtual disk at school is really slow. HF loads datasets in a single thread and doesn't always memory-map.
Tweet media one
8
26
403
@jxmnop
jack morris
10 months
Little things I learned when implementing the LLAMA forward pass from scratch with @lionellevine : - RoPE positional embeddings are tricky and weird and I'm convinced fewer than 100 people really understand them - the residual is added *twice* per layer (after self-attention and
15
31
403
@jxmnop
jack morris
3 months
free startup idea: Reverse Matplotlib, software that inputs figures and outputs code I often find myself in a position where I have a graph like this one, and want to make minor changes, but the raw data is hard to regenerate need a tool that can recreate the code & data for me
Tweet media one
31
12
393
@jxmnop
jack morris
4 months
today i'm making voronoi diagrams of text embedding spaces what can we do with these?
Tweet media one
Tweet media two
33
16
391
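One way to draw such a diagram, as a sketch: project the embeddings to 2D and hand the points to scipy's Voronoi. The model name and example texts here are stand-ins.

```python
# sketch: Voronoi cells over a 2D projection of text embeddings
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from scipy.spatial import Voronoi, voronoi_plot_2d
from sentence_transformers import SentenceTransformer

texts = ["a cat sat on the mat", "dogs love long walks", "the stock market fell",
         "quantum computing is hard", "i baked sourdough bread", "rain is forecast today"]

emb = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
points = PCA(n_components=2).fit_transform(emb)   # 2D projection of the embeddings

vor = Voronoi(points)
voronoi_plot_2d(vor)
for (x, y), t in zip(points, texts):
    plt.annotate(t, (x, y), fontsize=7)
plt.show()
```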
@jxmnop
jack morris
4 months
this is a great and really unique description of how SVD works. never heard this before (from )
Tweet media one
10
41
384
@jxmnop
jack morris
7 months
language models encode information in their *weights* while embeddings encode information in their *activations* this distinction is important, possibly somewhat profound
24
20
373
@jxmnop
jack morris
8 months
prediction: every vector database will eventually be replaced with a single transformer a row for every individual datapoint is so old-school, feels outdated what would be better: a single differentiable blob, something that “knows” about all your data and can chat about it
37
7
371
@jxmnop
jack morris
1 year
Request: “Reverse Copilot”: an AI program that reads my code and writes comments for functions, classes, etc. I’ve written so much code that I was too lazy to document, and probably will never go back to do so Bonus points if the bot just makes pull requests to my Github repos!
43
20
367
@jxmnop
jack morris
7 months
i can see it now: 2024 will be remembered as the year BPE died tokenization is by far the clunkiest part of a transformer; one last remaining bit of inelegance in an otherwise hyperoptimized model architecture time for it to go
19
12
368
@jxmnop
jack morris
7 months
machine learning research question: what’s an idea that you think would catch on, if only someone spent the money to test it at scale? i’ll go first: tokenization-free transformers
51
14
357
@jxmnop
jack morris
1 month
passed my A exam and got my master's degree; officially halfway through my phd! cheers everyone
Tweet media one
18
0
341
@jxmnop
jack morris
5 months
As an exercise in open science, gonna tweet the research problem I’m stuck on: i want to align two text embedding spaces in an unsupervised way. The motivation is that in my previous vec2text work, we have to know the embedding model and be able to query it. this is fine in
51
25
335
@jxmnop
jack morris
1 month
"What I cannot create, I do not understand" –Richard Feynman "What I can create, I do not understand" –Yoshua Bengio, Geoffrey Hinton, and Yann Lecun
11
22
330
@jxmnop
jack morris
6 months
0
0
325
@jxmnop
jack morris
3 months
a common shared experience of the research process seems to be "getting stuck" what am i supposed to do when i'm stuck on a problem? read papers? write unit tests? meditate, go for a walk? or should i just think harder? (please help)
127
4
324
@jxmnop
jack morris
2 years
I've retweeted a lot of cool art pieces recently that were generated with AI. over the last couple months I put together a blog post explaining where these techniques originated, who's behind it all, and how AI-generated art suddenly got so good:
7
74
320
@jxmnop
jack morris
6 months
i’m curious about effective altruism: how do so many smart people with the goal “do good for the world” wind up with the subgoal “analyze the neurons of GPT-2 small” or something similar?
44
11
318
@jxmnop
jack morris
8 months
from last night: me with my embeddings
Tweet media one
8
6
303
@jxmnop
jack morris
5 months
one exciting observation about transformers (and most modern deep learning) is that you can understand them using high school math. really just multiplication, division, sums, and exponentiation, many times, and in a strange and initially hard-to-grok order
13
13
299
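To make that concrete, here is a single attention head in plain numpy; it really is just multiplies, sums, exponentials, and divisions (a generic sketch, not any particular model):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))   # exponentiation
    return e / e.sum(axis=-1, keepdims=True)        # division

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # multiplications and sums
    return softmax(scores) @ V                      # weighted sum of values

Q, K, V = (np.random.randn(4, 8) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```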
@jxmnop
jack morris
9 days
an underdiscussed gotcha behind the “search + LLM = AGI” narrative is that search is only valuable when state-wise improvements are *quantifiable* this is the case in Go, and coding problems w/ tests, and this ARC benchmark. we can explore the (LLM-generated) state space and leverage
@bshlgrs
Buck Shlegeris
9 days
ARC-AGI’s been hyped over the last week as a benchmark that LLMs can’t solve. This claim triggered my dear coworker Ryan Greenblatt so he spent the last week trying to solve it with LLMs. Ryan gets 71% accuracy on a set of examples where humans get 85%; this is SOTA.
Tweet media one
46
178
1K
49
23
312
@jxmnop
jack morris
7 months
Sasha gave this amazing talk on our language model inversion research! 1. text embedding inversion () 2. language model output inversion (coming soon...)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
@srush_nlp
Sasha Rush
7 months
Talk: Inverting Language Models (by @jxmnop ) Techniques for extracting text from vector databases and prompts from LLM APIs.
3
23
205
2
35
293
@jxmnop
jack morris
8 months
is there anything novel or technically interesting about this model besides the fact that it outputs swear words?
@xai
xAI
8 months
Announcing Grok! Grok is an AI modeled after the Hitchhiker’s Guide to the Galaxy, so intended to answer almost anything and, far harder, even suggest what questions to ask! Grok is designed to answer questions with a bit of wit and has a rebellious streak, so please don’t use
8K
8K
50K
38
0
277
@jxmnop
jack morris
5 months
carbon dating ML papers by which open-source LM they use GPT-2 → the before times GPT-J → summer 2022 LLAMA-1 → spring 2023 LLAMA-2 → summer 2023 mistral 7b → fall 2023 mixtral 8x7b → the current era
5
16
275
@jxmnop
jack morris
10 months
the truth about machine learning software engineering in academia is that everyone is running commands like this: ``` TOKENIZERS_PARALLELISM=false CUDA_VISIBLE_DEVICES=0 stdbuf -oL -eL ^Cthon train_phase2_patchmlp.py --train_path data/${FOLDER}/src1_train.txt --val_path
15
12
276
@jxmnop
jack morris
2 years
"Punctuation restoration" is a neat NLP task that I hadn't heard of before today model from shoutout @huggingface , I often stumble across cool models like this on the model hub!!
Tweet media one
12
43
270
@jxmnop
jack morris
4 months
@quickdwarf i think extrapolating that 2024 will bring at least one advancement in AI is a pretty reasonable bet
2
0
270
@jxmnop
jack morris
3 years
life update: I'm getting a PhD! I'll be joining @srush_nlp 's group next year to do machine learning & NLP research at Cornell Tech!
16
6
266
@jxmnop
jack morris
2 months
in my opinion, the Phi approach to training language models is just wrong • i'm not convinced that training on less (albeit "higher-quality") data is better than training on as much data as possible • i'm not convinced that training on synthetic data ever works better than
33
5
262
@jxmnop
jack morris
6 months
on podcasts: lex fridman brings on a wonderful selection of guests (mostly) but doesn’t research them beforehand and resorts to asking them things like “do u think aliens are real” for people asking for a replacement, I think @dwarkesh_sp is great and has spectacular guests too
26
6
258