I recently demonstrated GPT4 to my spouse's 101-year-old grandfather, who remains in excellent health and has a sharp mind.
Following my demonstration, he paused thoughtfully and then said something I will remember — “This technology instills hope for our future. It's high time
Stanford's DSPy is the best high-level LLM programming framework I have seen thus far.
Langchain never resonated with me; despite being an early LLM framework, its design and abstractions felt overly complex. DSPy, on the other hand, is a huge step in the right direction.
An interesting new Nature paper compares fMRI recordings with activations across layers in a language model, and finds evidence of correlations.
The study seems to suggest that brain regions located at the top of the language hierarchy, responsible for
Six years ago, Geoffrey Hinton asserted that AI would take over radiology within five years, suggesting we cease training radiologists.
Was he correct?
The situation is more complex than simply being right or wrong. While AI has surpassed radiologists in certain diagnostic
Deep learning is typically bottlenecked by memory, not compute.
⚡️Flash Attention ⚡️ optimizes transformers, like GPT, to minimize costly GPU memory fetches, achieving impressive 2-4x speedups and 5-20x lower memory use, and enabling scaling to longer
Self-consistency is underrated for improving accuracy for LLMs in a range of reasoning and arithmetic tasks.
It works with any off-the-shelf LLM, eg GPT3 variants, and also provides estimates of how certain the LLM is of the provided answer.
Takeaways👇
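A minimal sketch of self-consistency, assuming a `sample_answer` callable that stands in for one temperature-sampled LLM call returning just the final answer string (a hypothetical stub, not a real API):

```python
from collections import Counter

def self_consistency(sample_answer, n_samples=10):
    """Sample n reasoning paths and majority-vote on the final answer.

    `sample_answer` is a stand-in for one temperature-sampled LLM call.
    The agreement rate doubles as a rough certainty estimate.
    """
    answers = [sample_answer() for _ in range(n_samples)]
    best, votes = Counter(answers).most_common(1)[0]
    return best, votes / n_samples

# Deterministic stand-in for 10 sampled chains of thought:
samples = iter(["42", "41", "42", "42", "43", "42", "42", "40", "42", "42"])
answer, confidence = self_consistency(lambda: next(samples))  # → ("42", 0.7)
```

The returned agreement rate is what gives you the "how certain is the LLM" estimate mentioned above.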
A simple trick to make LLMs “calibrated” — ie “to know when it doesn’t know something” — is to reformulate the answers as a single word or a short phrase, and look at the predicted logprobs of the word.
As LLMs are trained to predict the probability of the next token, they are
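A sketch of the trick, assuming you get back per-token logprobs shaped like what completion APIs typically return for the chosen tokens (the numbers here are made up):

```python
import math

def answer_confidence(token_logprobs):
    """Turn per-token logprobs of a short answer into a probability.

    `token_logprobs` is hypothetical data: one logprob per token of the
    single-word / short-phrase answer. Summing logprobs corresponds to
    multiplying the token probabilities of the whole phrase.
    """
    return math.exp(sum(token_logprobs))

# Single-token answer with logprob -0.05 → roughly 95% confidence.
p = answer_confidence([-0.05])
```

Short answers matter because the probability mass isn't smeared across many equivalent phrasings.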
🤖️LLM can self-improve 🧠
1) Self-consistency boosts reasoning skills by sampling multiple paths & finding the most consistent answer
But more samples = more comp. requirements. 💻
2) We can then train a better LLM with the self-generated solutions from 1)
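A rough sketch of step 2: keep only the chains of thought whose answer agrees with the majority vote, and turn them into fine-tuning examples (the prompt/completion format here is illustrative, not the paper's exact recipe):

```python
from collections import Counter

def build_self_training_set(question, sampled_paths):
    """Filter self-sampled (reasoning, answer) pairs to the majority answer.

    `sampled_paths` stands in for chains of thought sampled from the model
    itself; the surviving pairs become fine-tuning data.
    """
    majority, _ = Counter(a for _, a in sampled_paths).most_common(1)[0]
    return [
        {"prompt": question, "completion": f"{reasoning} The answer is {answer}."}
        for reasoning, answer in sampled_paths
        if answer == majority
    ]

paths = [("3*4=12, 12+1=13.", "13"), ("3*4=12, plus 1 is 13.", "13"), ("3+4=7.", "7")]
data = build_self_training_set("What is 3*4+1?", paths)  # keeps the two "13" paths
```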
What if you had trained a model to play legal moves in Othello by predicting the next move, and found that it had spontaneously learned to compute/represent the full board state in its weights - an emergent world representation?
That's just what this
Insightful paper that succinctly covers essential high-level knowledge to keep in mind regarding LLMs:
- Large language models (LLMs) predictably improve with increasing investment, but many key behaviors emerge unpredictably.
- LLMs often learn and use representations of the
✨Neat LLM trick for 📈 math & logical abilities ✨
Improves on Chain of Thought (CoT) prompting by
1) Replace the natural-language, step-by-step instructions in CoT examples with commented, stepwise Python code.
2) Run the code
Several recent papers on this (see refs below⬇️)
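The execute-the-code step can be sketched like this; the code string stands in for an LLM completion (real use needs sandboxing, since the model's output is untrusted):

```python
def run_program_of_thought(generated_code):
    """Execute model-generated Python and read back the `answer` variable.

    Sketch only: in practice, sandbox execution of untrusted LLM output.
    """
    scope = {}
    exec(generated_code, scope)
    return scope["answer"]

# Hypothetical completion for: "Roger has 5 balls and buys 2 cans of 3 balls."
generated = """
balls = 5            # Roger starts with 5 tennis balls
balls += 2 * 3       # two cans of 3 balls each
answer = balls
"""
result = run_program_of_thought(generated)  # → 11
```

Offloading the arithmetic to the interpreter is what sidesteps the LLM's unreliable mental math.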
Want to know a simple trick that lets LLMs generate more plausible long documents, break out of repetition better, and truncate low-probability tokens more sensibly?
Learn about LLM truncation sampling!
Some takeaways from 👇🧵
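As one concrete instance of truncation sampling, here is a minimal nucleus (top-p) filter over a toy next-token distribution; the token probabilities are made up:

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability ≥ p.

    `probs` maps token → probability. This is plain nucleus sampling,
    one member of the truncation-sampling family.
    """
    kept, total = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = pr
        total += pr
        if total >= p:
            break
    z = sum(kept.values())
    return {tok: pr / z for tok, pr in kept.items()}  # renormalize

dist = {"the": 0.5, "a": 0.3, "cat": 0.15, "zzz": 0.05}
filtered = top_p_filter(dist, p=0.9)  # drops the 0.05 tail token "zzz"
```

Cutting the low-probability tail is exactly what stops the sampler from occasionally wandering into implausible continuations.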
LLMs suffer from overconfidence and poorly calibrated uncertainty estimates
However, self-consistency, where one samples multiple paths & finds the most consistent answer, seems to offer a practical solution.
Interesting figure from page 4 in "LLMs can self improve" paper
A fun trick for zero-shot retrieval tasks with great results! First use an off-the-shelf LLM to generate a set of hypothetical candidate documents, then use a standard embedding model + standard search to find the best-matching documents in a DB/on the Web.
Details 👇
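A toy end-to-end sketch of the idea: the `fake_llm` lambda stands in for the off-the-shelf LLM, and the bag-of-words `embed` is a placeholder for a real sentence-embedding model:

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words embedding; a real setup uses a dense embedding model."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv)

def hyde_search(query, generate_hypothetical, corpus, k=1):
    """Embed an LLM-written *hypothetical* answer document instead of the
    raw query, then do ordinary similarity search over the corpus."""
    q_vec = embed(generate_hypothetical(query))
    ranked = sorted(corpus, key=lambda doc: cosine(q_vec, embed(doc)), reverse=True)
    return ranked[:k]

corpus = ["paris is the capital of france", "bread recipe with flour and yeast"]
fake_llm = lambda q: "the capital of france is paris a large city"
top = hyde_search("what city is france's capital?", fake_llm, corpus)
```

The hypothetical document doesn't need to be factually correct; it just needs to live near the right documents in embedding space.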
Distill step by step!
A new research paper from Google presents a straightforward concept that lets them train a 770M T5 model that surpasses the 540B PaLM model, using just 80% of the available data on a benchmark task.
Essentially distills (trains) a smaller model from the
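The data side of the idea can be sketched as a multi-task formatting step: the large teacher supplies both a label and a rationale, and the small student trains on two tasks per input (the `[label]`/`[rationale]` prefixes are illustrative, not the paper's exact format):

```python
def build_distillation_examples(question, teacher_label, teacher_rationale):
    """Distilling step-by-step sketch: emit two training examples per input,
    one for label prediction and one for rationale generation."""
    return [
        {"input": f"[label] {question}", "target": teacher_label},
        {"input": f"[rationale] {question}", "target": teacher_rationale},
    ]

ex = build_distillation_examples(
    "Is 17 prime?", "yes", "17 has no divisors other than 1 and itself."
)
```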
Bad future: A single AI, RLHFd into “alignment” to a very narrow set of values determined by a very small set of people.
Good future: Democratized, multiple AIs that are reasonably regulated, inducing diversity of thought, applied towards medicine, science and wisdom. A society
@BillAckman
@PeterHotez
Mr. Ackman, @BillAckman, I appreciate your lengthier posts and believe you can be a thoughtful individual. In this specific instance, however, sharing out-of-context clips with voiceovers seems an incredibly ineffective method of uncovering the truth.
While I don't
The benefits of AGI are often associated with accomplishments such as "curing cancer" or other medical breakthroughs. However, it appears that relatively few people are actively working on AI specifically for medical applications.
Instead, researchers in leading labs seem to be
It's intriguing to observe how several alt accounts, like @tszzl, @BasedBeffJezos, @AISafetyMemes etc, gain traction and influence a significant portion of prominent figures in the AI/tech industry, often shaping the direction of discussions.
Shapers of collective consciousness.
Isn't it quite mind-boggling that the majority of humanity's collective thoughts and reasoning, in broad strokes, seem to be compressible down to just a few hundred gigabytes?
@BillAckman
Are you helping by posting this compilation of out of context clips from various interviews?
Why is it bad that scientists update their beliefs and advice when facts come in?
I don’t know much about this particular person, but the personal attacks on him for trying to navigate,
Wonderful short survey of Graph Neural Networks (GNN).
Three principal task types: node classification, graph classification, and link prediction.
Deep sets and Transformers as GNNs, geometric graphs and more!
@karpathy
I remember taking my first graduate level machine learning course back in 2009 — and I got completely obsessed.
Bishop's book on ML was my bible for a time; still a good book!
Re-reading a few chapters from my favorite ML/stats book! Beautifully written, peppered with deep insights, and it doesn't shy away from the math, but doesn't complicate things unnecessarily.
Also you can get a free pdf here!
Hard to predict exactly when, but it seems likely text2video with Stable Diffusion-like quality will happen sometime in the next 3 years. Could be in 5 months, could take a bit longer, but it's likely going to happen relatively soon. Let's make sure our defenses vs misinformation are
In the paper they showed that a 2-layer, non-linear probe is needed to extract, and modify, the board state.
But brilliant follow-up work seems to indicate you can actually just use a linear probe; also a great read!
ChatGPT + stable diffusion is a pretty great condensed representation of humanity.
Good candidate to send on the next deep-space Voyager probe as a greeting to any alien races out there.
Aliens will learn that we are very self confident and often wrong!
Uses simple but clever tricks like blocking/tiling and CUDA kernel fusion. It also recomputes the attention matrix dynamically in the backward pass instead of fetching it from memory.
Beautiful example of impressive gains from clever engineering.
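The core blockwise trick can be sketched in pure Python for a single query over scalar keys/values; toy 1-D numbers stand in for the GPU tiles, and the point is only that the streamed result matches a full-softmax reference:

```python
import math

def attention_naive(q, ks, vs):
    """Reference: softmax(q*k)-weighted sum, score row materialized in full."""
    m = max(q * k for k in ks)
    ws = [math.exp(q * k - m) for k in ks]
    return sum(w * v for w, v in zip(ws, vs)) / sum(ws)

def attention_tiled(q, ks, vs, block=2):
    """Flash-Attention-style pass: walk the keys block by block, keeping a
    running max, denominator, and output, so the full score row is never
    stored (the online-softmax trick)."""
    m, z, out = float("-inf"), 0.0, 0.0
    for i in range(0, len(ks), block):
        scores = [q * k for k in ks[i:i + block]]
        m_new = max(m, max(scores))
        scale = math.exp(m - m_new) if m > float("-inf") else 0.0
        z = z * scale + sum(math.exp(s - m_new) for s in scores)
        out = out * scale + sum(
            math.exp(s - m_new) * v for s, v in zip(scores, vs[i:i + block])
        )
        m = m_new
    return out / z

q, ks, vs = 1.0, [0.1, 0.7, -0.3, 0.9], [1.0, 2.0, 3.0, 4.0]
full = attention_naive(q, ks, vs)
blocked = attention_tiled(q, ks, vs)  # agrees with `full` to float precision
```

Rescaling the running sums by `exp(m - m_new)` whenever a new block raises the max is what lets the denominator be built incrementally without revisiting old scores.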
What's the best paper investigating the effect of the order of training data fed to an LLM during training?
Like keeping only high-quality content for later in training?
Obv works for finetuning. But looking for a more generalized form.
From the GPT4 'paper' there is an interesting figure on how the base model is initially well calibrated on MMLU, but then after RLHF becomes much less so.
Does anyone know of more studies of how RLHF affects model calibration on various tasks?
@natfriedman
Harder math & science problems. E.g. I find GPT-4 reliably solves exercises correctly, with no public answers, in most of my graduate textbooks in math, physics, ML, etc. It's astounding if you actually try it.
@ylecun
6 months is basically nothing in the grand scheme of things. It’s rather irrelevant for technological progress. Seems reasonable to let the public catch up before people like you decide what should be done without consulting them first.
@naval
Extremely risky bet from Google.
That statement doesn't even incorporate what will happen to the cost and capability of AI models in 1, 2, 3 years, etc. Traditional search won't change that much.
Google has an extremely difficult innovator's dilemma to navigate. In an organization
@jackythirdy
Someone who pays $10/month for Copilot?
Also vscode ships with a dark theme by default so it’s democratizing the mythical 10x engineer :-)
“It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers… They would be able to converse with each other to sharpen their wits. At some stage therefore, we should have to expect the machines to take control.”
-
The first person to use the concept of a "singularity" in the technological context was John von Neumann. Stanislaw Ulam reports a 1958 discussion with von Neumann "centered on the accelerating progress of technology and changes in the mode of human life, which gives the
Great example of democratizing AI!
A Stanford Alpaca type LLM tuned for Italian instruction following.
Go make one for the preferred language of your choice!
I'm excited to introduce you to Camoscio: an Italian instruction-tuned LLaMA, following Stanford Alpaca.
The model should provide output of similar quality to GPT text-davinci-003 and has been finetuned by translating the Alpaca dataset to Italian.
1/3