Christian Szegedy @ChrSzegedy profile

Christian Szegedy

@ChrSzegedy

Followers

34,398

Following

2,446

Media

235

Statuses

7,544

#deeplearning , #ai research scientist. Opinions are mine.

Sunnyvale, CA

Joined June 2015

Don't wanna be here? Send us removal request.

Explore tweets Explore followers Explore following

Explore trending content on Musk Viewer

Wizkid • 242906 Tweets

Davido • 220658 Tweets

FAYEYOKO BTS 1ST FANMEET • 113034 Tweets

からくりサーカス • 105539 Tweets

ケンタッキー • 104195 Tweets

BTS NUNEW 1ST CONCERT EP2 • 101002 Tweets

ニンテンドーミュージアム • 90217 Tweets

Supreme God Kabir • 78014 Tweets

WEDance WITH JARLETTE • 74814 Tweets

#リゼロ • 55533 Tweets

青木選手 • 51499 Tweets

引退試合 • 50763 Tweets

フォーエバーヤング • 50596 Tweets

ヤクルト • 40275 Tweets

青木さん • 30955 Tweets

Alcaraz • 28834 Tweets

#Aぇヤンタン • 24154 Tweets

Cassper • 21841 Tweets

Shana Tova • 21630 Tweets

ラムジェット • 17844 Tweets

שנה טובה • 16153 Tweets

ダービー • 14225 Tweets

ジャパンダートクラシック • 13763 Tweets

SPILL THE FEELS TRACK LIST • 10960 Tweets

やねるねるねー

バレンティン

あかりちゃん

大鶴肥満

軍貫新規

セレッソ

うしおととら

銀山温泉

Robbie Williams

Berlimpah HADIAH

石破内閣の支持率50.7

$BROCK

ミッキーファイト

LEGENDARY ARTIST BANG CHAN

顔の下半身

おつかなた

イチロー

ITZY IS THE MOMENT

エバヤン

引退セレモニー

全校集会

すわほー

レディクレ

#水曜日のダウンタウン

#المعتقد_الاسلامي

#سكليف_نصاب_0ち91271ち20

Last Seen Profiles

@crushCSGO

@studeraky

@ingyalaa97

@cktears

@esmeraldaayala7

@AndreaKinley545

@PkarakusAkalin

@Hoosier_Donn

@cathlab

@headlineclub

@Savascibirisi

@QuintonSharkey4

@frognotten

@Val_m_p

@EnDefensaC

@ArdenBegaj

@dema_amed

@SENG_IE

@AlexMoskov

Christian Szegedy

@ChrSzegedy

2 years

Deep learning is not just memorization, it also compresses. And really good compression amounts to intelligence. That's why we get few/zero-shot capabilities from LLMs, since they "discovered" a lot of the underlying patterns just by being trained to compress.

François Chollet

@fchollet

2 years

Deep learning takes data points and turns them into a query-able structure that enables retrieval and interpolation between the points. You could think of it as a continuous generalization of database technology.

39

467

3K

33

98

912

Christian Szegedy

@ChrSzegedy

11 months

"Not too bad from a bunch of monkeys"

Elon Musk

@elonmusk

11 months

For the first time, there is a rocket that can make all life multiplanetary. A fork in the road of human destiny.

18K

31K

356K

32

35

883

Christian Szegedy

@ChrSzegedy

2 years

New jobs in the 21st century: Model restart specialist Hyperparameter psychic Prompt engineer Model janitor Tensor shape mediator Quantum state observer Model footprint accountant

Awni Hannun

@awnihannun

2 years

The paper mentions 35 (!) manual restarts to train OPT-175B due to hardware failure (and 70+ automatic restarts).

2

11

86

20

97

715

Christian Szegedy

@ChrSzegedy

11 months

I talked to several non-ML people over the past half year (including my kids) about chat bots and what i feel is a general frustration about their being moralizing and not getting answers even to relatively innocent prompts. So being more unhinged will be a feature not a bug.

Shibetoshi Nakamoto

@BillyM2k

11 months

i predict after the xAI software grok comes out, there will be an excessive amount of articles about how sexist / racist / hate speech / blah blah blah it is, by people financially and socially motivated to make articles like that

206

76

1K

26

29

648

Christian Szegedy

@ChrSzegedy

2 years

User: "When was Einstein Born?" LLM: .. Let's revisit a compressed form of all existing knowledge on the web ... once for *every* single input token ... and each token generated, including the GDP of Armenia, the name of all kings ever lived and all famous chess games, etc ...

27

32

641

Christian Szegedy

@ChrSzegedy

3 years

Here is a thread in paper references demonstrating that current deep learning, especially transformers are amazingly powerful for symbol manipulation already: @GuillaumeLample , @f_charton Solving hard integrals using deep transformers 🧵1/n

Deep Learning for Symbolic Mathematics

Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data. In this paper, we show that they...

arxiv.org

Joscha Bach

@Plinz

3 years

"Deep Learning is hitting a Wall" — Gary Marcus

272

460

3K

11

127

570

Christian Szegedy

@ChrSzegedy

28 days

The most sincere from of flattery is when something seems so outrageous that most (knowledgable) people don't even believe it.

The Information

@theinformation

28 days

AI Agenda: Why Musk’s AI Rivals Are Alarmed by His New GPU Cluster Elon Musk claims to have finished a 100,000-strong H100 cluster in four months. How likely is that? From @anissagardizy8

12

20

91

15

37

562

Christian Szegedy

@ChrSzegedy

28 days

My view on this has not changed in the past eight years: I have given many talks and written position paper in 2019 (link below). Progress is faster than my past expectation. My target date used to be ~2029 back then. Now it is 2026 for a superhuman AI mathematician. While a

Gary Marcus

@GaryMarcus

28 days

@ChrSzegedy @colin_fraser @Lang__Leon I am less sanguine but would love to hear your thoughts in more detail if you can share. @ChrSzegedy

1

0

5

19

75

548

Christian Szegedy

@ChrSzegedy

11 months

I wonder what Sam's Grok thinks of ours... ;)

29

25

510

Christian Szegedy

@ChrSzegedy

11 months

I think Yann might underestimate the potential of AI if people have API access to strong generative AI. LLMs are capable of generating code which could be executed *automatically* by *anyone* without any human *oversight*, also in a loop and open-endedly. This is very hard to

Yann LeCun

@ylecun

11 months

@geoffreyhinton One thing we know is that if future AI systems are built on the same blueprint as current Auto-Regressive LLMs, they may become highly knowledgeable but they will still be dumb. They will still hallucinate, they will still be difficult to control, and they will still merely

168

292

2K

42

62

467

Christian Szegedy

@ChrSzegedy

11 months

Amazing what a very small team can do in a few months.

ibab

@ibab

11 months

We've released our first progress update at xAI.

76

104

1K

18

23

287

Christian Szegedy

@ChrSzegedy

9 months

Happy holidays!

14

30

229

Christian Szegedy

@ChrSzegedy

10 months

It is a bit ironic since both Yann and Geoff made their carreers by ignoring most of the "expert opinions" on the uselessness of deep learning for several decades.

Geoffrey Hinton

@geoffreyhinton

10 months

Yann LeCun thinks the risk of AI taking over is miniscule. This means he puts a big weight on his own opinion and a miniscule weight on the opinions of many other equally qualified experts.

652

505

4K

13

25

373

Christian Szegedy

@ChrSzegedy

11 months

Touche... :)

22

13

428

Christian Szegedy

@ChrSzegedy

8 months

In the past few years (since 2016) i got increasingly convinced that retrieval augmanted generation is the most central problem of strong general AI. Despite the amazing work done in the past 7 years, it is still the most central question. Just look at what science is about:

36

43

405

Christian Szegedy

@ChrSzegedy

1 year

2013: solving atari 2016: mastering go 2019: mastering starcraft 2022: vastly improved protein folding 2023: stating the obvious

16

26

378

Christian Szegedy

@ChrSzegedy

1 year

To the ...

30

12

366

Christian Szegedy

@ChrSzegedy

2 years

I would bet that a 500+ bn parameter transformer trained on multimodal web data can solve these problems even today. I offer a public bet to anyone that such a demonstration will happen within two years, using today's transformer architecture exploiting few-shot generalization.

Melanie Mitchell

@MelMitchell1

2 years

But it's worth remembering the Bongard problems, created by a Russian computer scientist in the 1960s as a challenge to AI. These problems require a rich and general understanding of basic concepts such as "same" vs. "different". (2/8)

11

51

382

18

39

362

Christian Szegedy

@ChrSzegedy

4 years

Based on GPT-3, my impression that the real danger of *current* AI is not that it will impose its will on us, but that it is like a chameleon: Telling whatever each of us wants to hear, regardless of its being crazy or meaningless. A perfect way to enhance our bubbles.

17

55

356

Christian Szegedy

@ChrSzegedy

10 months

Nah, it was Schmidhuber

Stephen McAleer

@McaleerStephen

10 months

We invented Q* first Glad openai is building on top of our idea

61

287

3K

8

23

318

Christian Szegedy

@ChrSzegedy

3 years

I really envy todays's kids and youth people for the fantastic learning opportunities. In the 1st grade the teacher asked the kids what was their favorite number. My six year old son's answer was aleph-null. I never taught that to him, he picked it op on youtube.

Sheldon Axler

@AxlerLinear

3 years

Today the videos that I made to accompany my book Linear Algebra Done Right surpassed two million minutes of total viewing on YouTube. Those videos are freely available from the links at . #LinearAlgebra

37

531

3K

9

22

317

Christian Szegedy

@ChrSzegedy

3 years

Our paper "Memorizing Transformers" was accepted at ICLR. This is a new twist on training retrieval augmented language models. This work presents gains even by using a relatively small memory that stores previous hidden states seen by the model.

Memorizing Transformers

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights. We instead envision language models that can simply read and...

openreview.net

2

61

321

Christian Szegedy

@ChrSzegedy

3 years

Many ML researchers assume and often state in abstracts that self attention *needs* quadratic memory. It does not.

Self-attention Does Not Need $O(n^2)$ Memory

We present a very simple algorithm for attention that requires $O(1)$ memory with respect to sequence length and an extension to self-attention that requires $O(\log n)$ memory. This is in...

arxiv.org

8

39

314

Christian Szegedy

@ChrSzegedy

11 months

Everybody gets what they need: * the Kenyans: water * the Kenyan gov: criticism * YouTube: engagement * MrBeast: free publicity and attention * Activists: outrage prn

Yahoo News

@YahooNews

11 months

While American YouTuber MrBeast’s goal was to provide clean drinking water for 500,000 people, activists say his actions shamed the Kenyan government and helped perpetuate the stereotype that Africa is "dependent on handouts."

5K

1K

9K

17

32

299

Christian Szegedy

@ChrSzegedy

3 years

There is an interesting subtle mathematical detail about the ratio of vaccinated covid hospitalizations that goes way beyond the application of Bayes' theorem. Also, the same issue makes it hard to evaluate the performance of deployed machine learning systems. 1/n

6

43

294

Christian Szegedy

@ChrSzegedy

10 months

I am not sure what Richard means here exactly. Even if we consider supervised training only, it could still achieve superhuman breadth and consistency. If we allow for RL and self-play, then AlphaZero is a clear counterexample.

Richard Socher

@RichardSocher

10 months

Proposed Theorem: it’s impossible to acquire super humanity skills when relying purely on data created by humanity. Why do I say super humanity and not super human? Because for example a translation algorithm is already better than any single human in terms of how many

64

30

345

24

9

254

Christian Szegedy

@ChrSzegedy

5 years

Approximate mathematical reasoning is possible in the latent space alone. We created semantic embedding of formulas and performed complicated multi-step reasoning on them, then we compared it with the symbolic results:

4

77

287

Christian Szegedy

@ChrSzegedy

1 month

Yesterday, i have spent 4 hours with my 13 year old son and grok to code up some multiplayer online game. We had a lot of fun and learned a lot. Grok has created skeletons, filled in todos, did code reviews, suggested libraries, etc.

Elon Musk

@elonmusk

1 month

Grok can code very well

2K

3K

22K

11

26

252

Christian Szegedy

@ChrSzegedy

27 days

Today, it would cost you about $900 worth of HDD storage to store your waking moments for a year as an iPhone 4K/30FPS HEVC video stream (~16 hrs/day).

19

262

Christian Szegedy

@ChrSzegedy

11 months

Inception used 1.5X less compute than AlexNet and 12X less than VGG, outperforming both. The trend continued with mobile net... etc. IMO, today's LLMs are insanely inefficient/compute. Regulations that impose limits on the amount of compute spent on AI training will just

14

21

260

Christian Szegedy

@ChrSzegedy

5 years

Mathematical Reasoning in Latent Space will be featured at #iclr2020 . Multiple step reasoning can be performed on embedding vectors of mathematical formulas.

Mathematical Reasoning in Latent Space

Learning to reason about higher order logic formulas in the latent space.

openreview.net

0

59

248

Christian Szegedy

@ChrSzegedy

5 years

The best explanation of transformer models I have ever seen.

Peter Bloem (@[email protected])

@pbloemesquire

5 years

New blogpost! Transformers from scratch. Modern transformers are super simple, so we can explain them in a really straightforward manner. Includes pytorch code.

17

453

2K

1

54

239

Christian Szegedy

@ChrSzegedy

2 months

Self-revision:

22

28

210

Christian Szegedy

@ChrSzegedy

3 years

Thanks a lot to @Yuhu_ai_ , @MarkusNRabe and Delesley Hutchins for their hard work of updating our ICLR paper on retrieval-augmented language modeling, aka "Memorizing Transformer"! Here is a short thread on why we think this is important. 🧵 1/n

Memorizing Transformers

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights. We instead envision language models that can simply read and...

openreview.net

4

41

233

Christian Szegedy

@ChrSzegedy

11 months

13

18

215

Christian Szegedy

@ChrSzegedy

2 years

Achievement unlocked: ML tweet liked by both @ylecun and @GaryMarcus :)

10

1

223

Christian Szegedy

@ChrSzegedy

11 months

A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies

Amir Yazdanbakhsh

@ayazdanb

11 months

@ChrSzegedy @DimitrisPapail Humans are good in connecting past knowledge/experiences with current situation. I think we need much work on learning abstractions, or maybe LLMs learn such abstractions already, but not sure about systematic evaluation of modles creating abstractions.

0

7

15

26

213

Christian Szegedy

@ChrSzegedy

5 months

Thanks a lot Ian, greatly appreciated! This is the first result of my 12 years of ML carrier and the one I am still most proud of, on the other hand I found this paper did not do justice to it. The "lessons learned" slide at the end of the presentation might be worth checking

Ian Goodfellow

@goodfellow_ian

5 months

Congratulations to @ChrSzegedy on the test of time award for discovering adversarial examples! Christian actually first told Yoshua, others at LISA lab, and me privately about adversarial examples at NeurIPS 2012

7

15

381

5

11

216

Christian Szegedy

@ChrSzegedy

2 years

It seems like an excellent proposal, but I think Yann did not perform a proper segmentation of the US market. After careful market research, I came to the conclusion that 90%:American market can be covered by two AGIs: Model C and Model L, depicted by the provided sketches.

Yann LeCun

@ylecun

2 years

My position/vision/proposal paper is finally available: "A Path Towards Autonomous Machine Intelligence" It is available on (not arXiv for now) so that people can post reviews, comments, and critiques: 1/N

80

943

4K

8

20

214

Christian Szegedy

@ChrSzegedy

1 year

I am not surprised at all. We have used the exact same loss function in our 2019 . We have tested softmax as well, but did not find much gains. Btw, instead of large batch sizes, we used semi-hard negative mining.

Graph Representations for Higher-Order Logic and Theorem Proving

This paper presents the first use of graph neural networks (GNNs) for higher-order proof search and demonstrates that GNNs can improve upon state-of-the-art results in this domain. Interactive,...

arxiv.org

Lucas Beyer (bl16)

@giffmana

1 year

What makes CLIP work? The contrast with negatives via softmax? The more negatives, the better -> large batch-size? We'll answer "no" to both in our ICCV oral🤓 By introducing SigLIP, a simpler CLIP that also works better and is more scalable, we can study the extremes. Hop in🧶

26

294

2K

4

15

209

Christian Szegedy

@ChrSzegedy

5 months

Relax people. We might see some AI generating full movies, being better at math than a Fields medalist, driving car safer than a professional driver, but I certainly don't expect an AI being able to do *everything* better than a human within four years.

AI Notkilleveryoneism Memes ⏸️

@AISafetyMemes

5 months

OpenAI co-founder: AGI could be 2-3 years away. Humanity may need to pause. DWARKESH PATEL: So what's the plan? If there's no other bottlenecks, AGI next year or something? JOHN SCHULMAN: So, first of all, I don't think this is going to happen next year, but it's still useful

47

71

399

27

9

180

Christian Szegedy

@ChrSzegedy

9 months

Moving at light speed

26

0

125

Christian Szegedy

@ChrSzegedy

8 months

This completely contradicts my experiences. It feels like Francois and I are living in alternate realities.

François Chollet

@fchollet

8 months

The "aha" moment when I realized that curve-fitting was the wrong paradigm for achieving generalizable modeling of problems spaces that involve symbolic reasoning was in early 2016. I was trying every possible way to get a LSTM/GRU based model to classify first-order logic

44

213

2K

27

12

162

Christian Szegedy

@ChrSzegedy

1 year

IMO deep learning is the exact opposite of alchemy: Alchemy was performed by people with magical mind set hoping (in vain) that it would work. Deep Learning was developed by people with scientific mind set who are still very reluctant to accept that it works this well.

Melanie Mitchell

@MelMitchell1

1 year

Tired: Pseudoscience Wired: Alchemy!

20

58

309

16

18

194

Christian Szegedy

@ChrSzegedy

5 years

@bloodsigns @JeremyKonyndyk I think I would be fine a with any random person as president, now... :(

9

1

193

Christian Szegedy

@ChrSzegedy

4 years

One amazing, underrated fact about transformers is they are capable of figuring out the spatial structure of data without a built in architectural inductive bias. I have not tried, but I'd bet that a transformer on *permutation-invariant* MNIST can beat a ConvNet using 2D input.

15

11

199

Christian Szegedy

@ChrSzegedy

7 months

I can't wait to see AlphaZero-0K, the AI that beats you with weakest possible moves in order to minimize the amount of training data one can extract from its moves.

30

14

172

Christian Szegedy

@ChrSzegedy

3 years

What are your favorite science/eng/ math/history etc infotainment youtube channels? Here are a few I enjoyed (in random order): 3blueonebrown, Cool Worlds , Anton Petrov, Veritasium, @MLStreetTalk , Atlas Pro, RealLifeLore, Invicta, ReligionForBreakfast, CaspianReport, SciShow

Alex Kontorovich

@AlexKontorovich

3 years

What the hell?! #9 on youtube? Guys, this is a *math* video. Pure math. ??! Way to go @veritasium and @SciencePetr !

10

11

390

25

20

196

Christian Szegedy

@ChrSzegedy

2 years

I am happy to have a long bet with anyone including @MelMitchell1 or @GaryMarcus on the formalization + theorem proving capabilities of AIs by 2029. I am fairly confident that we will have a system with comparable or stronger capabilities to/than strong human mathematicians.

Joscha Bach

@Plinz

2 years

I know less about the sota in modeling math problems, but natural language parsing of school and undergrad math problems into solvers is already beginning to work, and I don't really expect it to hit any walls before 2029.

2

1

28

24

22

194

Christian Szegedy

@ChrSzegedy

8 months

That's why i think (and have been saying for years) that software generation without verification is like building on sand. The more we rely on AI-generated code, the more sophisticated verification we will need. In the limit: formal guarantees.

New GitHub Copilot Research Finds 'Downward Pressure on Code Quality' -- Visual Studio Magazine

'We find disconcerting trends for maintainability.'

visualstudiomagazine.com

28

22

170

Christian Szegedy

@ChrSzegedy

11 months

IMO the most impactful next questions in ML for AI: - New ways to train keys for retrieval in a large data-base without supervision (by relevance only). - Diffusion for discrete data/in latent space - Synthetic data generation - Fully learned optimizers

davidad 🎇

@davidad

11 months

@geoffreyhinton @ChrSzegedy @dpkingma @ylecun It wouldn’t surprise me if, in one of those conversations, as an example of how embarrassingly simple the next insight might be, someone might have said “like you know maybe somehow logistic regression instead of linear” and then everyone just laughed.

2

3

79

10

31

190

Christian Szegedy

@ChrSzegedy

1 year

Alfredo Canziani

@alfcnz

1 year

Met one more Deep Learning celebrity, this time on the dance floor! Thanks, @ChrSzegedy , for the two dances! I had a blast! 🥳🥳🥳

1

3

60

4

5

189

Christian Szegedy

@ChrSzegedy

4 years

Slide from one of my first presentations on adversarial examples.

4

11

188

Christian Szegedy

@ChrSzegedy

2 years

I also found (back in 2018, unpublished) that one can share the weights of all layers in convnet if one does not share the BN params. The quality loss om ImageNet (by doing so) was marginal (<0.5% top-1), training speed about the same, but huge reduction of parameter count.

Dimitris Papailiopoulos

@DimitrisPapail

2 years

"The Expressive Power of Tuning Only the Norm Layers" lead by @AngelikiGiannou & @shashank_r12 We show that large frozen networks maintain expressivity even if we only fine-tune the norm & bias layers.

5

40

263

11

16

187

Christian Szegedy

@ChrSzegedy

1 year

Can you spot the transformer?

Elon Musk

@elonmusk

1 year

Practically invisible

24K

22K

387K

18

11

165

Christian Szegedy

@ChrSzegedy

3 years

My brother Balazs proved some theorem about the cyclotomic polynomial and asked several mathematicians whether it was known, he was suggested that he asks Erdos (who happened to be in Budapest for his 80th birthday conference). He got an appointment from Erdos at 6am. 1/2

Fermat's Library

@fermatslibrary

3 years

Erdös published more than 1,500 papers and did mathematics 19 hours a day, even at 83 - "The 1st sign of senility is when a man forgets his theorems. The 2ns sign is when he forgets to zip up. The 3rd sign is when he forgets to zip down"

21

419

3K

2

20

179

Christian Szegedy

@ChrSzegedy

10 months

2

8

41

Christian Szegedy

@ChrSzegedy

4 years

Took me a few years to learn the domain transfer.

Andrei Munteanu

@andrei_mntn

4 years

Can confirm this is accurate

27

163

2K

5

8

181

Christian Szegedy

@ChrSzegedy

4 years

The real power of transformers: all inductive biases are learned from data alone. In other words: architecture-learning is automated.

5

11

180

Christian Szegedy

@ChrSzegedy

3 years

🤔🤔🤔 I'd challenge anyone to even remotely match the efficiency of a few dozen lines of python JAX matrix-calculation code on modern accelerators with any fortran code.

Prof Roberto Trotta

@R_Trotta

4 years

"Educators may want to reconsider teaching Python to University students. There are plenty environmentally friendly alternatives." 🤔 Python is the more CO2-intensive and less efficient of the languages in astronomy, argues

109

195

509

11

16

173

Christian Szegedy

@ChrSzegedy

2 years

Soon on arxiv: Title: "Large Language Models Are Zero Shot Lovers" Abstract: "Prompt: Step by step, gonna get to you, girl."

3

8

168

Christian Szegedy

@ChrSzegedy

10 months

Looks like IBM already had transformers in 1966

5

6

70

Christian Szegedy

@ChrSzegedy

2 months

Column r us

xAI

@xai

2 months

2K

9K

9

13

145

Christian Szegedy

@ChrSzegedy

4 months

I think the arguments in this article only support that LLMs don't have a human-like sentience. It does not rule out that they have some different kind: maybe LLMs feel some "satistfaction" when it produces the sentence "I feel terrible", as it enjoys predicting text.

Fei-Fei Li

@drfeifei

4 months

Is AI sentient? My friend and colleague Prof. John Etchemendy, a renowned professor and co-Director of @StanfordHAI , just co-authored this piece to debunk the claim that today’s LLMs are sentient @TIME

140

226

869

43

6

135

Christian Szegedy

@ChrSzegedy

1 year

Have a go at it!

Yuhuai (Tony) Wu

@Yuhu_ai_

1 year

Solve math and understand the universe

159

153

1K

10

5

154

Christian Szegedy

@ChrSzegedy

5 months

Thanks Been. The thing I feel most privileged about is the amazing people I had the opportunity to work and talk with from day one of my ML journey. This also includes my first ML mentor and collaborator Hartmut Neven, who should have been on the paper as he encouraged this

Been Kim

@_beenkim

5 months

Congratulations for the runner up of the test of time award at #ICLR2024 ! @ChrSzegedy @woj_zaremba @ilyasut @joanbruna @doomie @goodfellow_ian @rob_fergus !

2

9

121

11

7

159

Christian Szegedy

@ChrSzegedy

8 months

"AI won't do X until we don't understand Y" This template has been used in many arguments and has failed in countless scenarios already.

Prof. Lee Cronin

@leecronin

8 months

The scientific method will not be automated until we understand how to build universal explainers.

36

25

178

27

6

136

Christian Szegedy

@ChrSzegedy

4 years

Interesting paper: 94% on cifar-10 with 80 labeled examples (8/class).

Li Junnan

@LiJunnan0409

4 years

Excited to introduce CoMatch, our new semi-supervised learning method! CoMatch jointly learns class probability and image representation with graph-based contrastive learning. @CaimingXiong @stevenhoi Blog: Paper:

4

88

349

1

30

156

Christian Szegedy

@ChrSzegedy

2 years

It is the rate of progress that matters. A few years ago, neural networks could generate some random crappy bedrooms and creepy looking faces. Now we have Dall-e, ImaGen and stable diffusion. Half year ago LLMs solving high school level math problems seemed impossible.

Kevin Buzzard

@XenaProject

2 years

@certifiablyrand I'm just saying what I personally believe. I am not an AI expert. I've just spent a week at an AI conference and I see language models answering questions which we give 16 year olds in the UK, but this is state of the art. People like @ChrSzegedy are more optimistic.

0

6

2

154

Christian Szegedy

@ChrSzegedy

4 years

@zacharylipton One thing is the a lot of people conflate BERT with transformers. Transformers have been the real idea. BERT is just a variation on denoising auto-encoding, ELMO, etc. BERT without transformer is ~ELMO. Transformers without BERT is GPT.

3

18

155

Christian Szegedy

@ChrSzegedy

3 months

My brother's latest preprint: generalizing higher order Fourier analysis to noncommutative groups:

A higher-order generalization of group theory

The goal of this paper is to show that fundamental concepts in higher-order Fourier analysis can be nauturally extended to the non-commutative setting. We generalize Gowers norms to arbitrary...

arxiv.org

3

19

108

Christian Szegedy

@ChrSzegedy

8 months

I am very skeptical that there is an easy solution to utilizing and learning long context. Of course, one can utilize a huge context with the right supervision, but it is not the same as evolving a metric at a large scale and discovering new, hidden connections. The latter is

Delip Rao e/σ

@deliprao

8 months

IDK. I am unreasonably excited about a *working* 10M context length than being able to generate a video short. In terms of broad product impact, an API for the former is incomparable.

26

20

285

6

7

104

Christian Szegedy

@ChrSzegedy

2 years

What I can envision that our kids in 5 years will able to direct whole movies by interactively working with AIs to create story-board, concept art, artificial actor profiles... using automated tools, within a few days, mostly by just running prompts and selecting what they liked.

hardmaru

@hardmaru

2 years

Soon, the world will be flooded with text-to-tiktok videos 🙃

10

37

311

11

21

150

Christian Szegedy

@ChrSzegedy

2 years

@fchollet Python is elegant, but its syntax, libraries, tooling and philosophy are the exceptions.

7

6

150

Christian Szegedy

@ChrSzegedy

11 months

Interpretability breakthrough

anton (𝔴𝔞𝔯𝔱𝔦𝔪𝔢)

@atroyn

11 months

if you’re confused about how large language models work this new animation from google deepmind should clear a lot of things up

195

339

4K

6

21

144

Christian Szegedy

@ChrSzegedy

7 months

Closed scientific publishing is a scam for multiple reasons.

Lawrence Paulson

@LawrPaulson

7 months

Study finds that we could lose science if publishers go bankrupt | Ars Technica

2

11

42

13

115

Christian Szegedy

@ChrSzegedy

11 months

Per popular request, Grok ME will have a "Gary mode" button that will make it spit out blank-faced corporate pablum and self-contradictory takes on AI. ;) /s

Gary Marcus

@GaryMarcus

11 months

Making a bug into a feature, with Spin™

0

12

9

3

139

Christian Szegedy

@ChrSzegedy

3 years

Hmmm.... I always thought that the purpose of memes is to get copied as much as possible and go viral.

4

2

145

Christian Szegedy

@ChrSzegedy

4 months

Absolutely. I am convinced that autoformalization (computer code/formal-math-synthesis from natural language) is the way ahead for AI.

François Chollet

@fchollet

4 months

I believe that program synthesis will solve reasoning. And I believe that deep learning will solve program synthesis (by guiding a discrete program search process). But I don't think you can go all that far with just prompting a LLM to generate end-to-end Python programs (even

48

83

862

11

130

Christian Szegedy

@ChrSzegedy

5 years

Anybody studying adversarial examples should read this document. It is not just a good read, but an interesting new format for serious scientific discourse.

Distill

@distillpub

5 years

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features' - Six comments from the community and responses from the original authors.

6

200

614

0

26

143

Christian Szegedy

@ChrSzegedy

6 months

A possible solution to "the great silence": any sufficiently compressed communication is indistinguishable from white noise.

27

9

121

Christian Szegedy

@ChrSzegedy

24 days

@soniajoseph_ AI might be shockingly simple at a conceptual level, but for many people it is extremely hard to acquire its mindset, which is quite indirect, not like traditional programming or even performing some complicated routine. Any tiny algorithmic/data processing change is super

3

1

141

Christian Szegedy

@ChrSzegedy

10 months

3

4

60

Christian Szegedy

@ChrSzegedy

4 years

It was my great pleasure talking about my vision of the near term future of AI, current research directions, personal perspectives on recent DL history and some of my views on potential societal impact of AI and related technologies.

Machine Learning Street Talk

@MLStreetTalk

4 years

We discuss formal reasoning, software synthesis and transformers with the one and only @ChrSzegedy from @GoogleAI - a pioneer in the DL field. Could we create a system which will act like a super human mathematician? @ykilcher @MSalvaris @ecsquendor

4

25

82

2

21

140

Christian Szegedy

@ChrSzegedy

5 years

Great paper by Andras Rozsa on networks robust to adversarial attacks using the "tent activation" with batchnorm: Similar or better robustness than adversarial training without the extra training cost, reducing the "open space risk".

6

39

138

Christian Szegedy

@ChrSzegedy

5 years

Interpreting neural networks. A thorough study. Very fascinating read. How complex structures emerge from SGD and data alone.

OpenAI

@OpenAI

5 years

We show how to read a human-interpretable algorithm from a neural network's weights, by studying the circuits formed by the connections between individual neurons:

20

447

1K

0

22

138

Christian Szegedy

@ChrSzegedy

10 months

Today's AI (credit: Balazs Szegedy using DALL-E-3)

5

14

65

Christian Szegedy

@ChrSzegedy

3 years

Crucifixion of Darth Vader, impressionist style

5

11

134

Christian Szegedy

@ChrSzegedy

10 months

While I agree with the main point, I don't think that AI will ever be as smart as a cat or dolphin. No cat or dolphin knows the internet by heart or can pass math exams using pure instinct. I still think that making superhuman AI is a collaborative/competitive gradual process.

Yann LeCun

@ylecun

10 months

The emergence of superhuman AI will not be an event. Progress is going to be progressive. It will start with systems that can learn how the world works, like baby animals. Then we'll have machines that are objective driven and that satisfy guardrails. Then, we'll have machines

379

489

3K

4

6

45

Christian Szegedy

@ChrSzegedy

4 years

IMO transformers have improved by a large margin.

Samarth Sinha

@_sam_sinha_

4 years

I miss the older days

14

210

2K

5

13

133