Christian Szegedy Profile
Christian Szegedy

@ChrSzegedy

Followers: 34,398
Following: 2,446
Media: 235
Statuses: 7,544

#deeplearning, #ai research scientist. Opinions are mine.

Sunnyvale, CA
Joined June 2015
@ChrSzegedy
Christian Szegedy
2 years
Deep learning is not just memorization, it also compresses. And really good compression amounts to intelligence. That's why we get few/zero-shot capabilities from LLMs, since they "discovered" a lot of the underlying patterns just by being trained to compress.
@fchollet
François Chollet
2 years
Deep learning takes data points and turns them into a query-able structure that enables retrieval and interpolation between the points. You could think of it as a continuous generalization of database technology.
39
467
3K
33
98
912
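A toy illustration of the compression-prediction link in the tweet above: under arithmetic coding, a model that predicts the next symbol with probability p can encode it in about -log2 p bits, so better prediction is literally better compression. The bigram model and text here are purely illustrative.

import math
from collections import defaultdict

def bigram_bits(text):
    # Count how often each character follows each context character.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    # Ideal code length under this model: sum of -log2 p(next | prev).
    # (Counts come from the same text, so this is a best-case illustration.)
    bits = 0.0
    for a, b in zip(text, text[1:]):
        total = sum(counts[a].values())
        bits += -math.log2(counts[a][b] / total)
    return bits

text = "the cat sat on the mat. the cat sat on the hat. " * 20
print(f"raw size: {8 * len(text)} bits")
print(f"bigram-model code length: {bigram_bits(text):.0f} bits")

A stronger predictor would shrink the second number further; that is the sense in which training to compress "discovers" the underlying patterns.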
@ChrSzegedy
Christian Szegedy
11 months
"Not too bad from a bunch of monkeys"
@elonmusk
Elon Musk
11 months
For the first time, there is a rocket that can make all life multiplanetary. A fork in the road of human destiny.
Tweet media one
18K
31K
356K
32
35
883
@ChrSzegedy
Christian Szegedy
2 years
New jobs in the 21st century:
Model restart specialist
Hyperparameter psychic
Prompt engineer
Model janitor
Tensor shape mediator
Quantum state observer
Model footprint accountant
@awnihannun
Awni Hannun
2 years
The paper mentions 35 (!) manual restarts to train OPT-175B due to hardware failure (and 70+ automatic restarts).
2
11
86
20
97
715
@ChrSzegedy
Christian Szegedy
11 months
I talked to several non-ML people over the past half year (including my kids) about chatbots, and what I sense is a general frustration about their moralizing and not answering even relatively innocent prompts. So being more unhinged will be a feature, not a bug.
@BillyM2k
Shibetoshi Nakamoto
11 months
i predict after the xAI software grok comes out, there will be an excessive amount of articles about how sexist / racist / hate speech / blah blah blah it is, by people financially and socially motivated to make articles like that
206
76
1K
26
29
648
@ChrSzegedy
Christian Szegedy
2 years
User: "When was Einstein born?" LLM: .. Let's revisit a compressed form of all existing knowledge on the web ... once for *every* single input token ... and each token generated, including the GDP of Armenia, the names of all kings who ever lived and all famous chess games, etc ...
27
32
641
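The joke rests on real arithmetic: a dense decoder-only LLM spends roughly 2 FLOPs per parameter for every generated token, whether the token completes a birthday or a theorem. The numbers below are illustrative assumptions, not any specific model's.

params = 175e9                            # hypothetical GPT-3-sized model
flops_per_token = 2 * params              # rough dense forward-pass cost
answer_tokens = 10                        # "Einstein was born on March 14, 1879."
total = flops_per_token * answer_tokens
print(f"~{total:.1e} FLOPs to state a birthday")   # ~3.5e+12 FLOPs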
@ChrSzegedy
Christian Szegedy
3 years
Here is a thread of paper references demonstrating that current deep learning, especially transformers, is already amazingly powerful for symbol manipulation: @GuillaumeLample , @f_charton Solving hard integrals using deep transformers 🧵1/n
@Plinz
Joscha Bach
3 years
"Deep Learning is hitting a Wall" — Gary Marcus
Tweet media one
272
460
3K
11
127
570
@ChrSzegedy
Christian Szegedy
28 days
The most sincere form of flattery is when something seems so outrageous that most (knowledgeable) people don't even believe it.
@theinformation
The Information
28 days
AI Agenda: Why Musk’s AI Rivals Are Alarmed by His New GPU Cluster Elon Musk claims to have finished a 100,000-strong H100 cluster in four months. How likely is that? From @anissagardizy8
12
20
91
15
37
562
@ChrSzegedy
Christian Szegedy
28 days
My view on this has not changed in the past eight years: I have given many talks and written a position paper in 2019 (link below). Progress has been faster than my past expectations. My target date used to be ~2029 back then. Now it is 2026 for a superhuman AI mathematician. While a
@GaryMarcus
Gary Marcus
28 days
@ChrSzegedy @colin_fraser @Lang__Leon I am less sanguine but would love to hear your thoughts in more detail if you can share. @ChrSzegedy
1
0
5
19
75
548
@ChrSzegedy
Christian Szegedy
11 months
I wonder what Sam's Grok thinks of ours... ;)
Tweet media one
29
25
510
@ChrSzegedy
Christian Szegedy
11 months
I think Yann might underestimate the potential of AI if people have API access to strong generative AI. LLMs are capable of generating code which could be executed *automatically* by *anyone* without any human *oversight*, also in a loop and open-endedly. This is very hard to
@ylecun
Yann LeCun
11 months
@geoffreyhinton One thing we know is that if future AI systems are built on the same blueprint as current Auto-Regressive LLMs, they may become highly knowledgeable but they will still be dumb. They will still hallucinate, they will still be difficult to control, and they will still merely
168
292
2K
42
62
467
@ChrSzegedy
Christian Szegedy
11 months
Amazing what a very small team can do in a few months.
@ibab
ibab
11 months
We've released our first progress update at xAI.
76
104
1K
18
23
287
@ChrSzegedy
Christian Szegedy
9 months
Happy holidays!
Tweet media one
14
30
229
@ChrSzegedy
Christian Szegedy
10 months
It is a bit ironic, since both Yann and Geoff made their careers by ignoring most of the "expert opinions" on the uselessness of deep learning for several decades.
@geoffreyhinton
Geoffrey Hinton
10 months
Yann LeCun thinks the risk of AI taking over is minuscule. This means he puts a big weight on his own opinion and a minuscule weight on the opinions of many other equally qualified experts.
652
505
4K
13
25
373
@ChrSzegedy
Christian Szegedy
11 months
Touché... :)
Tweet media one
22
13
428
@ChrSzegedy
Christian Szegedy
8 months
Over the past few years (since 2016) I have become increasingly convinced that retrieval-augmented generation is the most central problem of strong general AI. Despite the amazing work done in the past 7 years, it is still the most central question. Just look at what science is about:
36
43
405
@ChrSzegedy
Christian Szegedy
1 year
2013: solving Atari
2016: mastering Go
2019: mastering StarCraft
2022: vastly improved protein folding
2023: stating the obvious
16
26
378
@ChrSzegedy
Christian Szegedy
1 year
To the ...
Tweet media one
30
12
366
@ChrSzegedy
Christian Szegedy
2 years
I would bet that a 500+ bn parameter transformer trained on multimodal web data can solve these problems even today. I offer a public bet to anyone that such a demonstration will happen within two years, using today's transformer architecture exploiting few-shot generalization.
@MelMitchell1
Melanie Mitchell
2 years
But it's worth remembering the Bongard problems, created by a Russian computer scientist in the 1960s as a challenge to AI. These problems require a rich and general understanding of basic concepts such as "same" vs. "different". (2/8)
Tweet media one
11
51
382
18
39
362
@ChrSzegedy
Christian Szegedy
4 years
Based on GPT-3, my impression is that the real danger of *current* AI is not that it will impose its will on us, but that it is like a chameleon: telling each of us whatever we want to hear, regardless of whether it is crazy or meaningless. A perfect way to enhance our bubbles.
17
55
356
@ChrSzegedy
Christian Szegedy
10 months
Nah, it was Schmidhuber
@McaleerStephen
Stephen McAleer
10 months
We invented Q* first. Glad OpenAI is building on top of our idea
Tweet media one
61
287
3K
8
23
318
@ChrSzegedy
Christian Szegedy
3 years
I really envy today's kids and young people for the fantastic learning opportunities. In 1st grade, the teacher asked the kids what their favorite number was. My six-year-old son's answer was aleph-null. I never taught him that; he picked it up on YouTube.
@AxlerLinear
Sheldon Axler
3 years
Today the videos that I made to accompany my book Linear Algebra Done Right surpassed two million minutes of total viewing on YouTube. Those videos are freely available from the links at . #LinearAlgebra
Tweet media one
37
531
3K
9
22
317
@ChrSzegedy
Christian Szegedy
3 years
Our paper "Memorizing Transformers" was accepted at ICLR. This is a new twist on training retrieval-augmented language models. This work shows gains even with a relatively small memory that stores previous hidden states seen by the model.
2
61
321
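A minimal numpy sketch of the mechanism the tweet describes, not the paper's code: previous hidden states are stored as a memory, and each query attends over its top-k nearest stored states. All shapes and names are illustrative.

import numpy as np

def knn_memory_attend(query, mem_keys, mem_values, k=4):
    # Retrieve the k stored hidden states closest to the query (dot-product)
    # and attend over them: retrieval-augmented attention in miniature.
    scores = mem_keys @ query                      # (M,) similarity scores
    top = np.argsort(scores)[-k:]                  # indices of top-k matches
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                                   # softmax over retrieved slots
    return w @ mem_values[top]                     # weighted readout, (d,)

d, M = 64, 10_000                                  # hidden size, memory slots
mem_keys = np.random.randn(M, d).astype(np.float32)
mem_values = np.random.randn(M, d).astype(np.float32)
query = np.random.randn(d).astype(np.float32)
print(knn_memory_attend(query, mem_keys, mem_values).shape)   # (64,)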
@ChrSzegedy
Christian Szegedy
11 months
Everybody gets what they need:
* the Kenyans: water
* the Kenyan gov: criticism
* YouTube: engagement
* MrBeast: free publicity and attention
* Activists: outrage prn
@YahooNews
Yahoo News
11 months
While American YouTuber MrBeast’s goal was to provide clean drinking water for 500,000 people, activists say his actions shamed the Kenyan government and helped perpetuate the stereotype that Africa is "dependent on handouts."
Tweet media one
5K
1K
9K
17
32
299
@ChrSzegedy
Christian Szegedy
3 years
There is an interesting subtle mathematical detail about the ratio of vaccinated covid hospitalizations that goes way beyond the application of Bayes' theorem. Also, the same issue makes it hard to evaluate the performance of deployed machine learning systems. 1/n
6
43
294
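The thread itself is not preserved here, but the base-rate step it builds on (the plain Bayes part, before the subtler issue the tweet alludes to) can be sketched with illustrative numbers: with high coverage, most hospitalized patients can be vaccinated even when the vaccine works well.

# Base-rate arithmetic; all numbers are illustrative assumptions.
coverage = 0.90          # fraction of population vaccinated
efficacy = 0.90          # reduction in hospitalization risk if vaccinated
base_risk = 0.01         # unvaccinated hospitalization risk

hosp_vax = coverage * base_risk * (1 - efficacy)
hosp_unvax = (1 - coverage) * base_risk
share_vax = hosp_vax / (hosp_vax + hosp_unvax)
print(f"{share_vax:.0%} of hospitalizations are vaccinated")  # ~47%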
@ChrSzegedy
Christian Szegedy
10 months
I am not sure what Richard means here exactly. Even if we consider supervised training only, it could still achieve superhuman breadth and consistency. If we allow for RL and self-play, then AlphaZero is a clear counterexample.
@RichardSocher
Richard Socher
10 months
Proposed Theorem: it’s impossible to acquire super humanity skills when relying purely on data created by humanity. Why do I say super humanity and not super human? Because for example a translation algorithm is already better than any single human in terms of how many
64
30
345
24
9
254
@ChrSzegedy
Christian Szegedy
5 years
Approximate mathematical reasoning is possible in the latent space alone. We created semantic embeddings of formulas and performed complicated multi-step reasoning on them, then compared the result with the symbolic computation:
4
77
287
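A hedged sketch of the evaluation described above; the embedder and the step network are untrained stand-ins here (in the actual work they are learned): apply a learned rewrite step several times purely in embedding space, then compare against the embedding of the symbolically rewritten formula.

import numpy as np

d = 128
rng = np.random.default_rng(0)
W_step = rng.normal(size=(d, d)) / np.sqrt(d)      # stand-in "step" network

def latent_step(z):
    return np.tanh(z @ W_step)                     # one latent reasoning step

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

z_pred = rng.normal(size=d)                        # stand-in for embed(formula)
for _ in range(4):                                 # four rewrite steps in latent space
    z_pred = latent_step(z_pred)
z_true = rng.normal(size=d)                        # stand-in for embed(rewrite^4(formula))
print(f"cosine(prediction, target) = {cosine(z_pred, z_true):.3f}")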
@ChrSzegedy
Christian Szegedy
1 month
Yesterday I spent 4 hours with my 13-year-old son and Grok coding up a multiplayer online game. We had a lot of fun and learned a lot. Grok created skeletons, filled in TODOs, did code reviews, suggested libraries, etc.
@elonmusk
Elon Musk
1 month
Grok can code very well
2K
3K
22K
11
26
252
@ChrSzegedy
Christian Szegedy
27 days
Today, it would cost you about $900 worth of HDD storage to store your waking moments for a year as an iPhone 4K/30FPS HEVC video stream (~16 hrs/day).
19
19
262
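The arithmetic behind the estimate; the bitrate and disk price are assumptions in the same spirit as the tweet.

mbps = 25                                  # ~iPhone 4K/30 HEVC bitrate, Mbit/s
seconds = 16 * 3600 * 365                  # 16 waking hours/day for a year
terabytes = mbps / 8 * seconds / 1e6       # MB/s * s -> MB -> TB
usd_per_tb = 14                            # rough consumer HDD price
print(f"{terabytes:.0f} TB, ~${terabytes * usd_per_tb:.0f}")   # ~66 TB, ~$920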
@ChrSzegedy
Christian Szegedy
11 months
Inception used 1.5X less compute than AlexNet and 12X less than VGG, outperforming both. The trend continued with MobileNet... etc. IMO, today's LLMs are insanely inefficient per unit of compute. Regulations that impose limits on the amount of compute spent on AI training will just
14
21
260
@ChrSzegedy
Christian Szegedy
5 years
Mathematical Reasoning in Latent Space will be featured at #iclr2020 . Multiple step reasoning can be performed on embedding vectors of mathematical formulas.
0
59
248
@ChrSzegedy
Christian Szegedy
5 years
The best explanation of transformer models I have ever seen.
New blogpost! Transformers from scratch. Modern transformers are super simple, so we can explain them in a really straightforward manner. Includes pytorch code.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
17
453
2K
1
54
239
@ChrSzegedy
Christian Szegedy
2 months
Self-revision:
22
28
210
@ChrSzegedy
Christian Szegedy
3 years
Thanks a lot to @Yuhu_ai_ , @MarkusNRabe and Delesley Hutchins for their hard work updating our ICLR paper on retrieval-augmented language modeling, aka "Memorizing Transformer"! Here is a short thread on why we think this is important. 🧵 1/n
4
41
233
@ChrSzegedy
Christian Szegedy
11 months
Tweet media one
13
18
215
@ChrSzegedy
Christian Szegedy
2 years
Achievement unlocked: ML tweet liked by both @ylecun and @GaryMarcus :)
10
1
223
@ChrSzegedy
Christian Szegedy
11 months
A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies
@ayazdanb
Amir Yazdanbakhsh
11 months
@ChrSzegedy @DimitrisPapail Humans are good at connecting past knowledge/experiences with the current situation. I think we need much more work on learning abstractions; or maybe LLMs learn such abstractions already, but I am not sure about systematic evaluation of models creating abstractions.
0
0
7
15
26
213
@ChrSzegedy
Christian Szegedy
5 months
Thanks a lot Ian, greatly appreciated! This was the first result of my 12-year ML career and the one I am still most proud of; on the other hand, I feel the paper did not do justice to it. The "lessons learned" slide at the end of the presentation might be worth checking
@goodfellow_ian
Ian Goodfellow
5 months
Congratulations to @ChrSzegedy on the test of time award for discovering adversarial examples! Christian actually first told Yoshua, others at LISA lab, and me privately about adversarial examples at NeurIPS 2012
7
15
381
5
11
216
@ChrSzegedy
Christian Szegedy
2 years
It seems like an excellent proposal, but I think Yann did not perform a proper segmentation of the US market. After careful market research, I came to the conclusion that 90% of the American market can be covered by two AGIs: Model C and Model L, depicted in the provided sketches.
Tweet media one
Tweet media two
@ylecun
Yann LeCun
2 years
My position/vision/proposal paper is finally available: "A Path Towards Autonomous Machine Intelligence" It is available on (not arXiv for now) so that people can post reviews, comments, and critiques: 1/N
Tweet media one
80
943
4K
8
20
214
@ChrSzegedy
Christian Szegedy
1 year
I am not surprised at all. We used the exact same loss function in our 2019 . We tested softmax as well, but did not find much gain. Btw, instead of large batch sizes, we used semi-hard negative mining.
@giffmana
Lucas Beyer (bl16)
1 year
What makes CLIP work? The contrast with negatives via softmax? The more negatives, the better -> large batch-size? We'll answer "no" to both in our ICCV oral🤓 By introducing SigLIP, a simpler CLIP that also works better and is more scalable, we can study the extremes. Hop in🧶
Tweet media one
26
294
2K
4
15
209
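A minimal numpy sketch of the sigmoid pairwise loss being discussed (the SigLIP idea: each image-text pair is an independent binary match/no-match problem, so there is no batch-wide softmax and no pressure for huge batches). The temperature and bias values are illustrative, and this is not the paper's code.

import numpy as np

def sigmoid_pairwise_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    # Every (image, text) pair gets an independent binary label:
    # +1 on the diagonal (matching pairs), -1 everywhere else.
    logits = t * img_emb @ txt_emb.T + b          # (B, B) similarity logits
    labels = 2 * np.eye(len(img_emb)) - 1
    return np.mean(np.log1p(np.exp(-labels * logits)))   # log-sigmoid loss

B, d = 8, 32
img = np.random.randn(B, d); img /= np.linalg.norm(img, axis=1, keepdims=True)
txt = np.random.randn(B, d); txt /= np.linalg.norm(txt, axis=1, keepdims=True)
print(sigmoid_pairwise_loss(img, txt))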
@ChrSzegedy
Christian Szegedy
5 months
Relax, people. We might see AI generating full movies, being better at math than a Fields medalist, driving a car more safely than a professional driver, but I certainly don't expect an AI able to do *everything* better than a human within four years.
@AISafetyMemes
AI Notkilleveryoneism Memes ⏸️
5 months
OpenAI co-founder: AGI could be 2-3 years away. Humanity may need to pause. DWARKESH PATEL: So what's the plan? If there's no other bottlenecks, AGI next year or something? JOHN SCHULMAN: So, first of all, I don't think this is going to happen next year, but it's still useful
47
71
399
27
9
180
@ChrSzegedy
Christian Szegedy
9 months
Moving at light speed
Tweet media one
26
0
125
@ChrSzegedy
Christian Szegedy
8 months
This completely contradicts my experiences. It feels like Francois and I are living in alternate realities.
@fchollet
François Chollet
8 months
The "aha" moment when I realized that curve-fitting was the wrong paradigm for achieving generalizable modeling of problem spaces that involve symbolic reasoning was in early 2016. I was trying every possible way to get an LSTM/GRU based model to classify first-order logic
44
213
2K
27
12
162
@ChrSzegedy
Christian Szegedy
1 year
IMO deep learning is the exact opposite of alchemy: alchemy was performed by people with a magical mindset hoping (in vain) that it would work. Deep learning was developed by people with a scientific mindset who are still very reluctant to accept that it works this well.
@MelMitchell1
Melanie Mitchell
1 year
Tired: Pseudoscience Wired: Alchemy!
20
58
309
16
18
194
@ChrSzegedy
Christian Szegedy
5 years
@bloodsigns @JeremyKonyndyk I think I would be fine with any random person as president, now... :(
9
1
193
@ChrSzegedy
Christian Szegedy
4 years
One amazing, underrated fact about transformers is that they are capable of figuring out the spatial structure of data without a built-in architectural inductive bias. I have not tried it, but I'd bet that a transformer on *permutation-invariant* MNIST can beat a ConvNet using 2D input.
15
11
199
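A hedged PyTorch sketch of the bet's setup (untrained, all hyperparameters illustrative): the 784 pixels arrive as a sequence with learned, order-agnostic position embeddings, so any spatial structure must be discovered from the data rather than built into the architecture.

import torch
import torch.nn as nn

class PermutedMNISTTransformer(nn.Module):
    def __init__(self, d=64, heads=4, layers=2):
        super().__init__()
        self.pix = nn.Linear(1, d)                 # embed each pixel value
        self.pos = nn.Embedding(784, d)            # learned, no 2D prior
        enc = nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, layers)
        self.head = nn.Linear(d, 10)

    def forward(self, x):                          # x: (B, 784) flat pixels
        h = self.pix(x.unsqueeze(-1)) + self.pos.weight
        return self.head(self.encoder(h).mean(dim=1))

model = PermutedMNISTTransformer()
print(model(torch.rand(2, 784)).shape)             # torch.Size([2, 10])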
@ChrSzegedy
Christian Szegedy
7 months
I can't wait to see AlphaZero-0K, the AI that beats you with the weakest possible moves in order to minimize the amount of training data one can extract from its moves.
30
14
172
@ChrSzegedy
Christian Szegedy
3 years
What are your favorite science/eng/math/history etc. infotainment YouTube channels? Here are a few I enjoyed (in random order): 3Blue1Brown, Cool Worlds, Anton Petrov, Veritasium, @MLStreetTalk , Atlas Pro, RealLifeLore, Invicta, ReligionForBreakfast, CaspianReport, SciShow
@AlexKontorovich
Alex Kontorovich
3 years
What the hell?! #9 on youtube? Guys, this is a *math* video. Pure math. ??! Way to go @veritasium and @SciencePetr !
Tweet media one
10
11
390
25
20
196
@ChrSzegedy
Christian Szegedy
2 years
I am happy to have a long bet with anyone, including @MelMitchell1 or @GaryMarcus , on the formalization + theorem proving capabilities of AIs by 2029. I am fairly confident that by then we will have a system with capabilities comparable to or stronger than those of strong human mathematicians.
@Plinz
Joscha Bach
2 years
I know less about the sota in modeling math problems, but natural language parsing of school and undergrad math problems into solvers is already beginning to work, and I don't really expect it to hit any walls before 2029.
2
1
28
24
22
194
@ChrSzegedy
Christian Szegedy
8 months
That's why I think (and have been saying for years) that software generation without verification is like building on sand. The more we rely on AI-generated code, the more sophisticated verification we will need. In the limit: formal guarantees.
28
22
170
@ChrSzegedy
Christian Szegedy
11 months
IMO the most impactful next questions in ML for AI:
- New ways to train keys for retrieval in a large database without supervision (by relevance only).
- Diffusion for discrete data / in latent space
- Synthetic data generation
- Fully learned optimizers
@davidad
davidad 🎇
11 months
@geoffreyhinton @ChrSzegedy @dpkingma @ylecun It wouldn’t surprise me if, in one of those conversations, as an example of how embarrassingly simple the next insight might be, someone might have said “like you know maybe somehow logistic regression instead of linear” and then everyone just laughed.
2
3
79
10
31
190
@ChrSzegedy
Christian Szegedy
1 year
@alfcnz
Alfredo Canziani
1 year
Met one more Deep Learning celebrity, this time on the dance floor! Thanks, @ChrSzegedy , for the two dances! I had a blast! 🥳🥳🥳
Tweet media one
1
3
60
4
5
189
@ChrSzegedy
Christian Szegedy
4 years
Slide from one of my first presentations on adversarial examples.
Tweet media one
4
11
188
@ChrSzegedy
Christian Szegedy
2 years
I also found (back in 2018, unpublished) that one can share the weights of all layers in a convnet if one does not share the BN params. The quality loss on ImageNet (by doing so) was marginal (<0.5% top-1), training speed about the same, but a huge reduction in parameter count.
@DimitrisPapail
Dimitris Papailiopoulos
2 years
"The Expressive Power of Tuning Only the Norm Layers" led by @AngelikiGiannou & @shashank_r12 We show that large frozen networks maintain expressivity even if we only fine-tune the norm & bias layers.
Tweet media one
5
40
263
11
16
187
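The 2018 experiment is unpublished, so every detail below is an assumption; this PyTorch sketch only shows the setup the tweet describes: one convolution's weights reused at every depth, with each depth keeping its own BatchNorm parameters.

import torch
import torch.nn as nn

class SharedWeightConvNet(nn.Module):
    def __init__(self, ch=64, depth=8):
        super().__init__()
        self.stem = nn.Conv2d(3, ch, 3, padding=1)
        self.shared = nn.Conv2d(ch, ch, 3, padding=1)   # one weight set for all depths
        self.bns = nn.ModuleList(nn.BatchNorm2d(ch) for _ in range(depth))
        self.head = nn.Linear(ch, 1000)

    def forward(self, x):
        h = self.stem(x)
        for bn in self.bns:                  # same conv, per-depth BN stats/params
            h = torch.relu(bn(self.shared(h)))
        return self.head(h.mean(dim=(2, 3)))

net = SharedWeightConvNet()
print(net(torch.rand(2, 3, 32, 32)).shape)               # torch.Size([2, 1000])
print(sum(p.numel() for p in net.parameters()))          # barely grows with depth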
@ChrSzegedy
Christian Szegedy
1 year
Can you spot the transformer?
@elonmusk
Elon Musk
1 year
Practically invisible
Tweet media one
24K
22K
387K
18
11
165
@ChrSzegedy
Christian Szegedy
3 years
My brother Balazs proved some theorem about the cyclotomic polynomial and asked several mathematicians whether it was known; it was suggested that he ask Erdős (who happened to be in Budapest for his 80th birthday conference). He got an appointment with Erdős at 6am. 1/2
@fermatslibrary
Fermat's Library
3 years
Erdös published more than 1,500 papers and did mathematics 19 hours a day, even at 83 - "The 1st sign of senility is when a man forgets his theorems. The 2nd sign is when he forgets to zip up. The 3rd sign is when he forgets to zip down"
Tweet media one
21
419
3K
2
20
179
@ChrSzegedy
Christian Szegedy
10 months
Tweet media one
2
8
41
@ChrSzegedy
Christian Szegedy
4 years
Took me a few years to learn the domain transfer.
@andrei_mntn
Andrei Munteanu
4 years
Can confirm this is accurate
Tweet media one
27
163
2K
5
8
181
@ChrSzegedy
Christian Szegedy
4 years
The real power of transformers: all inductive biases are learned from data alone. In other words: architecture-learning is automated.
5
11
180
@ChrSzegedy
Christian Szegedy
3 years
🤔🤔🤔 I'd challenge anyone to even remotely match the efficiency of a few dozen lines of Python JAX matrix-calculation code on modern accelerators with any Fortran code.
@R_Trotta
Prof Roberto Trotta
4 years
"Educators may want to reconsider teaching Python to University students. There are plenty of environmentally friendly alternatives." 🤔 Python is the most CO2-intensive and least efficient of the languages in astronomy, argues
Tweet media one
109
195
509
11
16
173
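For a sense of what the challenge refers to, a few lines of JAX that XLA JIT-compiles straight to whatever accelerator is available; the shapes are arbitrary.

import jax
import jax.numpy as jnp

@jax.jit
def step(w, x):
    # One fused matrix computation; XLA compiles this for GPU/TPU directly.
    return jnp.tanh(x @ w)

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
w = jax.random.normal(k1, (4096, 4096))
x = jax.random.normal(k2, (4096, 4096))
print(step(w, x).shape)   # (4096, 4096), on an accelerator if one is present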
@ChrSzegedy
Christian Szegedy
2 years
Soon on arxiv: Title: "Large Language Models Are Zero Shot Lovers" Abstract: "Prompt: Step by step, gonna get to you, girl."
3
8
168
@ChrSzegedy
Christian Szegedy
10 months
Looks like IBM already had transformers in 1966
Tweet media one
5
6
70
@ChrSzegedy
Christian Szegedy
2 months
Column r us
@xai
xAI
2 months
2K
2K
9K
9
13
145
@ChrSzegedy
Christian Szegedy
4 months
I think the arguments in this article only support that LLMs don't have a human-like sentience. It does not rule out that they have some different kind: maybe LLMs feel some "satisfaction" when they produce the sentence "I feel terrible", as they enjoy predicting text.
@drfeifei
Fei-Fei Li
4 months
Is AI sentient? My friend and colleague Prof. John Etchemendy, a renowned professor and co-Director of @StanfordHAI , just co-authored this piece to debunk the claim that today’s LLMs are sentient @TIME
140
226
869
43
6
135
@ChrSzegedy
Christian Szegedy
1 year
Have a go at it!
Tweet media one
@Yuhu_ai_
Yuhuai (Tony) Wu
1 year
Solve math and understand the universe
159
153
1K
10
5
154
@ChrSzegedy
Christian Szegedy
5 months
Thanks Been. The thing I feel most privileged about is the amazing people I had the opportunity to work and talk with from day one of my ML journey. This also includes my first ML mentor and collaborator Hartmut Neven, who should have been on the paper as he encouraged this
@_beenkim
Been Kim
5 months
Tweet media one
2
9
121
11
7
159
@ChrSzegedy
Christian Szegedy
8 months
"AI won't do X until we understand Y" This template has been used in many arguments and has already failed in countless scenarios.
@leecronin
Prof. Lee Cronin
8 months
The scientific method will not be automated until we understand how to build universal explainers.
36
25
178
27
6
136
@ChrSzegedy
Christian Szegedy
4 years
Interesting paper: 94% on CIFAR-10 with 80 labeled examples (8/class).
@LiJunnan0409
Li Junnan
4 years
Excited to introduce CoMatch, our new semi-supervised learning method! CoMatch jointly learns class probability and image representation with graph-based contrastive learning. @CaimingXiong @stevenhoi Blog: Paper:
4
88
349
1
30
156
@ChrSzegedy
Christian Szegedy
2 years
It is the rate of progress that matters. A few years ago, neural networks could generate some random crappy bedrooms and creepy-looking faces. Now we have DALL-E, Imagen and Stable Diffusion. Half a year ago, LLMs solving high-school-level math problems seemed impossible.
@XenaProject
Kevin Buzzard
2 years
@certifiablyrand I'm just saying what I personally believe. I am not an AI expert. I've just spent a week at an AI conference and I see language models answering questions which we give 16 year olds in the UK, but this is state of the art. People like @ChrSzegedy are more optimistic.
0
0
6
6
2
154
@ChrSzegedy
Christian Szegedy
4 years
@zacharylipton One thing is that a lot of people conflate BERT with transformers. Transformers were the real idea. BERT is just a variation on denoising auto-encoding, ELMo, etc. BERT without transformers is ~ELMo. Transformers without BERT is GPT.
3
18
155
@ChrSzegedy
Christian Szegedy
8 months
I am very skeptical that there is an easy solution to utilizing and learning long context. Of course, one can utilize a huge context with the right supervision, but it is not the same as evolving a metric at a large scale and discovering new, hidden connections. The latter is
@deliprao
Delip Rao e/σ
8 months
IDK. I am unreasonably more excited about a *working* 10M context length than about being able to generate a video short. In terms of broad product impact, an API for the former is incomparable.
26
20
285
6
7
104
@ChrSzegedy
Christian Szegedy
2 years
What I can envision is that our kids, in 5 years, will be able to direct whole movies by interactively working with AIs to create storyboards, concept art, artificial actor profiles... using automated tools, within a few days, mostly by just running prompts and selecting what they like.
@hardmaru
hardmaru
2 years
Soon, the world will be flooded with text-to-tiktok videos 🙃
10
37
311
11
21
150
@ChrSzegedy
Christian Szegedy
2 years
@fchollet Python is elegant, but its syntax, libraries, tooling and philosophy are the exceptions.
7
6
150
@ChrSzegedy
Christian Szegedy
11 months
Interpretability breakthrough
@atroyn
anton (𝔴𝔞𝔯𝔱𝔦𝔪𝔢)
11 months
if you’re confused about how large language models work this new animation from google deepmind should clear a lot of things up
195
339
4K
6
21
144
@ChrSzegedy
Christian Szegedy
7 months
Closed scientific publishing is a scam for multiple reasons.
@LawrPaulson
Lawrence Paulson
7 months
Study finds that we could lose science if publishers go bankrupt | Ars Technica
2
2
11
42
13
115
@ChrSzegedy
Christian Szegedy
11 months
Per popular request, Grok ME will have a "Gary mode" button that will make it spit out blank-faced corporate pablum and self-contradictory takes on AI. ;) /s
@GaryMarcus
Gary Marcus
11 months
Making a bug into a feature, with Spin™
0
0
12
9
3
139
@ChrSzegedy
Christian Szegedy
3 years
Hmmm.... I always thought that the purpose of memes is to get copied as much as possible and go viral.
4
2
145
@ChrSzegedy
Christian Szegedy
4 months
Absolutely. I am convinced that autoformalization (computer code/formal-math-synthesis from natural language) is the way ahead for AI.
@fchollet
François Chollet
4 months
I believe that program synthesis will solve reasoning. And I believe that deep learning will solve program synthesis (by guiding a discrete program search process). But I don't think you can go all that far with just prompting a LLM to generate end-to-end Python programs (even
48
83
862
11
11
130
@ChrSzegedy
Christian Szegedy
5 years
Anybody studying adversarial examples should read this document. It is not just a good read, but an interesting new format for serious scientific discourse.
@distillpub
Distill
5 years
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features' - Six comments from the community and responses from the original authors.
6
200
614
0
26
143
@ChrSzegedy
Christian Szegedy
6 months
A possible solution to "the great silence": any sufficiently compressed communication is indistinguishable from white noise.
27
9
121
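A quick empirical illustration, with zlib standing in for "sufficiently good" compression (the byte-entropy estimate is crude, but the direction is clear): structured data sits far below 8 bits/byte, while its compressed form approaches the 8 bits/byte of uniform noise.

import math
import zlib
from collections import Counter

def bits_per_byte(data):
    # Empirical byte entropy; 8.0 would be indistinguishable from uniform noise.
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

msg = " ".join(str(i) for i in range(30000)).encode()   # highly structured data
print(f"raw:        {bits_per_byte(msg):.2f} bits/byte")
print(f"compressed: {bits_per_byte(zlib.compress(msg, 9)):.2f} bits/byte")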
@ChrSzegedy
Christian Szegedy
24 days
@soniajoseph_ AI might be shockingly simple at a conceptual level, but for many people it is extremely hard to acquire its mindset, which is quite indirect, not like traditional programming or even performing some complicated routine. Any tiny algorithmic/data processing change is super
3
1
141
@ChrSzegedy
Christian Szegedy
10 months
Tweet media one
3
4
60
@ChrSzegedy
Christian Szegedy
4 years
It was my great pleasure talking about my vision of the near-term future of AI, current research directions, personal perspectives on recent DL history, and some of my views on the potential societal impact of AI and related technologies.
@MLStreetTalk
Machine Learning Street Talk
4 years
We discuss formal reasoning, software synthesis and transformers with the one and only @ChrSzegedy from @GoogleAI - a pioneer in the DL field. Could we create a system which will act like a super human mathematician? @ykilcher @MSalvaris @ecsquendor
Tweet media one
4
25
82
2
21
140
@ChrSzegedy
Christian Szegedy
5 years
Great paper by Andras Rozsa on networks robust to adversarial attacks using the "tent activation" with batchnorm: Similar or better robustness than adversarial training without the extra training cost, reducing the "open space risk".
Tweet media one
6
39
138
@ChrSzegedy
Christian Szegedy
5 years
Interpreting neural networks. A thorough study. Very fascinating read. How complex structures emerge from SGD and data alone.
@OpenAI
OpenAI
5 years
We show how to read a human-interpretable algorithm from a neural network's weights, by studying the circuits formed by the connections between individual neurons:
Tweet media one
20
447
1K
0
22
138
@ChrSzegedy
Christian Szegedy
10 months
Today's AI (credit: Balazs Szegedy using DALL-E-3)
Tweet media one
5
14
65
@ChrSzegedy
Christian Szegedy
3 years
Crucifixion of Darth Vader, impressionist style
Tweet media one
5
11
134
@ChrSzegedy
Christian Szegedy
10 months
While I agree with the main point, I don't think that AI will ever be as smart as a cat or dolphin. No cat or dolphin knows the internet by heart or can pass math exams using pure instinct. I still think that making superhuman AI is a collaborative/competitive gradual process.
@ylecun
Yann LeCun
10 months
The emergence of superhuman AI will not be an event. Progress is going to be progressive. It will start with systems that can learn how the world works, like baby animals. Then we'll have machines that are objective driven and that satisfy guardrails. Then, we'll have machines
379
489
3K
4
6
45
@ChrSzegedy
Christian Szegedy
4 years
IMO transformers have improved by a large margin.
@_sam_sinha_
Samarth Sinha
4 years
I miss the older days
Tweet media one
14
210
2K
5
13
133