XY Han @XYHan_ profile

XY Han

@XYHan_

Followers

1,479

Following

1,131

Media

76

Statuses

1,440

Assistant Professor @ChicagoBooth | Papers: “Neural Collapse in Deep Nets” & “Survey Descent: Nonsmooth GD” | BSE @Princeton , MS @Stanford , PhD @Cornell

https://t.co/dbwjx7mUZI

Chicago, IL

Joined November 2011

Don't wanna be here? Send us removal request.

Explore tweets Explore followers Explore following

Explore trending content on Musk Viewer

LINGORM HOWE 12TH ANN • 1104293 Tweets

DeSantis • 457260 Tweets

60 Minutes • 448559 Tweets

#6284üuygula • 260137 Tweets

#istanbulconventionkeepsalive • 249409 Tweets

The View • 182556 Tweets

CONGRATS FOURTH • 174249 Tweets

#istanbulsözleşmesiyaşatır • 145323 Tweets

Iniesta • 142191 Tweets

Jets • 101199 Tweets

जम्मू कश्मीर • 89907 Tweets

Saleh • 77442 Tweets

#いれいす活動4周年 • 73930 Tweets

Burna • 70381 Tweets

Alexandra • 51619 Tweets

Rodgers • 51081 Tweets

Bob Woodward • 47501 Tweets

BBS RAIN • 38695 Tweets

#PortfolioDay • 36772 Tweets

BBS FYANG • 31359 Tweets

ITZY GOLD TEASER 1 • 26087 Tweets

에드워드 • 21695 Tweets

PERSES IS BACK • 17780 Tweets

Logan Act • 17398 Tweets

Junet • 13839 Tweets

Bielsa • 13547 Tweets

Hackett • 13248 Tweets

復活当選 • 12819 Tweets

Igor • 12181 Tweets

#DiyarbakırErcanGold

非公式wiki

꼬들 1012

雅マモル

Cameroon

LMSY TV SHOW PROGRAM

ChristineMae Lunar Eclipse

مارون الراس

Ulbrich

安室ちゃん

El Tiante

コラボアフタヌーンティー

NICOLA EN HOY

第992回

ニセさん

ドスケベ会員制バー

Woody Johnson

Luis Tiant

$AVENT

#Shakira

#اارفع_ترند_θち966ち8931

Last Seen Profiles

@yuguchi_gbm

@otomobilcomtr

@gimissari

@Snows

@sariyahrenae

@Morenbuou_long

@palmeirasonline

@atrueambivert

@KW_40L

@penikmat_bocill

@WeedStreet420

@ReLit

@MyrmcSolutions

@turk_ifsa2019

@Earnestine57885

@DMacWake316

@NebulaFi_

@dnt_click_here

@BusesLaUnion

@turk_ifsa2019

XY Han

@XYHan_

3 months

For those who love bagels

MathMatize Memes

@MathMatize

3 months

For those who love linear algebra

43

368

5K

15

239

6K

XY Han

@XYHan_

4 months

@miniapeur Clearly, he's stunned by how she got the TeX to render so nicely in a dm.

11

21

2K

XY Han

@XYHan_

6 months

During my PhD, my advisor would tell me “never use a symbol in text without reminding what it is.” [Example] 𝘉𝘢𝘥: “So, 𝜑 is bounded.” 𝘎𝘰𝘰𝘥: “So, the value function 𝜑 is bounded.”

John Horton

@johnjhorton

6 months

I told a grad-student co-author "try to imagine me as the guy from 'Memento' who can't remember anything & needs clues to pick-up the thread on projects" and they said "I believe you, because you already used that memento analogy"

22

867

20K

20

121

2K

XY Han

@XYHan_

1 month

@alz_zyd_ Curious about how many did numpy (e.g. np.array(x) > 3) vs those that did list comprehension ([y if y>3 for y in x])...

12

4

542

XY Han

@XYHan_

3 months

@alz_zyd_ The best part of the lore is this all happened because Candes and Tao started talking while waiting to pick up their kids.

2

30

509

XY Han

@XYHan_

2 years

Job search completed: Excited to join @ChicagoBooth as an Assistant Professor of Operations Management starting July 2024! 🥳 Thank you to all my friends and mentors who helped me along the way 🙏🙏🙏.

29

5

319

XY Han

@XYHan_

1 month

@alz_zyd_ Yeah, that's my sense too. I feel like "Python for Data Science/ML"-type classes never teach list comprehension and jump straight to numpy, so students who learn python that way never got exposed...

4

1

168

XY Han

@XYHan_

2 years

Honored to receive an ICLR 2022 Outstanding Paper Award for “Neural Collapse under MSE Loss” w/ Vardan Papyan and Dave Donoho! Come by #ICLR2022 on 4/26 1AM PST (Oral) & 4/27 6:30PM PST (Poster) to chat w/ us about Neural Collapse and its open questions!

4

15

110

XY Han

@XYHan_

5 months

64 GPUs for one research lab is pretty nice. Across all Stanford, @StanfordCompute has 700+ shared GPUs. Rumor is @StanfordData folks are talking of buying a new cluster of 1000+ GPUs. The point is valid, but the 64 GPU example feels misleading.

Technical specifications - Sherlock

User documentation for Sherlock, Stanford's HPC cluster.

www.sherlock.stanford.edu

Tsarathustra

@tsarnick

5 months

Fei-Fei Li says Stanford's Natural Language computing lab has only 64 GPUs and academia is "falling off a cliff" relative to industry

99

205

1K

11

13

75

XY Han

@XYHan_

7 months

…raising the blood pressures of Queuing theorists everywhere.

AK

@_akhaliq

7 months

Apple announces MM1 Methods, Analysis & Insights from Multimodal LLM Pre-training In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through

16

177

950

3

1

57

XY Han

@XYHan_

7 months

@petergyang I will sign up for whichever credit card that comes with free subscriptions to all five. @AmexBusiness @Chase @Citi

1

2

55

XY Han

@XYHan_

8 months

@sirbayes I thought a lot about this question: Did my PhD in a dept with optimizers who love linesearch. Within the classic optimization community (the SIOPT/ICCOPT/ISMP crowd) I think linesearch is pretty popular. Within ML, the problem is memory: you need to keep yet another copy of your

3

0

52

XY Han

@XYHan_

2 years

FOUR new Neural Collapse works accepted to #NeurIPS2022 investigating (1) Neural Collapse on different losses, (2) as modeled as Riemannian gradient flow, (3) as a motivating design network classifier design, and (4) under class imbalance. Congrats to all the authors!! 🥳

2

8

46

XY Han

@XYHan_

9 months

@miniapeur I’m concerned it isn’t compatible with my Hagoromos…

0

39

XY Han

@XYHan_

1 year

Fun story Erhan Çinlar told us in Princeton’s ORFE 309 course: Back in the day, US academics were deciding what to call math-for-decision-making. A good name already existed: 𝘤𝘺𝘣𝘦𝘳𝘯𝘦𝘵𝘪𝘤𝘴. 𝗕𝘂𝘁 it was the Cold War and “cybernetics” was a term already used by the

Yi Ma

@YiMaTweets

1 year

After “neural networks” and “artificial intelligence”, likely the next name that should and will redeem itself is “cybernetics”.

5

13

118

2

37

XY Han

@XYHan_

3 years

@weijie444 Does it feel good to continue using "we" despite it being a single-author paper as well? 😅

1

0

28

XY Han

@XYHan_

2 years

Recruiting faculty or postdocs in OR or ML? Check out the @Cornell_ORIE PhD candidates on the academic market this year! 🥳

0

6

28

XY Han

@XYHan_

2 years

TEN new Neural Collapse related submissions found in #ICLR2023 . 🤩🥳👏 The original NC paper took almost 3 years of exploring and experimentation to write. Extremely grateful to see so many now share our interest. 🙏🙏🙏

Andrei Bursuc

@abursuc

2 years

#ICLR2023 submissions are now visible or it's that time of the year when you realize that most of your #CVPR2023 ideas are already scooped 🙃

2

25

137

1

4

27

XY Han

@XYHan_

2 years

Amazing survey on the subtleties, historical contexts, and open questions of Neural Collapse. Very readable & comprehensive. One of the best so far! ⭐️⭐️⭐️⭐️⭐️ To authors @kvignesh1420 , E. Rasromani, & V. Awatramani: Thanks x💯 for your interest in NC and this fantastic review!

午後のarXiv

@arxivml

2 years

"Neural Collapse: A Review on Modelling Principles and Generalization", Vignesh Kothapalli, Ebrahim Rasromani, Vasu…

0

4

11

26

XY Han

@XYHan_

2 years

Two exciting new papers examining Neural Collapse in #ICML2022 (both spotlights!). Congratulations to the authors! (T. Tirer and @joanbruna ) and ( J. Zhou, X. Li, T. Ding, C. You, @Qing_Qu_1006 , and @ZhihuiZhu )

0

3

27

XY Han

@XYHan_

3 years

Proud to share this new work with my supervisor, Adrian Lewis, in which we develop a multipoint generalization of gradient descent for nonsmooth optimization. (1/4)

arXiv math.OC Optimization and Control

@mathOCb

3 years

X.Y. Han, Adrian S. Lewis: Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

1

2

6

2

6

25

XY Han

@XYHan_

3 months

Feynman didn't live with people making 20k new submissions to ArXiv a month.

Physics In History

@PhysInHistory

3 months

Know how to solve every problem that has been solved. - Richard Feynman

60

215

2K

0

1

24

XY Han

@XYHan_

2 years

Why is Neural Collapse interesting? This and other discussions in this new interview. Thanks @aihuborg !

AIhub

@aihuborg

2 years

We interviewed @XYHan_ , Vardan Papyan, and David Donoho about their ICLR outstanding paper on the neural collapse phenomenon. Read what they had to say here:

0

1

10

1

7

24

XY Han

@XYHan_

4 months

Fun stories from @Princeton : During undergrad, Tarjan subbed for one of our Intro Algorithms (COS226) lectures. He started with this beautiful remark: “Hi, I’m Bob Tarjan… Not Bob Sedgewick. Bob Sedgewick wrote your textbook. I wrote the algorithm 𝘪𝘯 your textbook.”

Clément Canonne

@ccanonne_

4 months

Bob Tarjan, starting his talk on sorting with partial information at the @SimonsInstitute Sublinear Algorithms program

6

7

99

0

22

XY Han

@XYHan_

3 months

@YiMaTweets Knowledge = ∫ intelligence dt + c. 🤓

2

1

20

XY Han

@XYHan_

11 months

Turning off phone email notifications has significantly improved my quality of life.

2

0

20

XY Han

@XYHan_

1 year

On “Future Directions”: A Suggestion for the Academic Job Market “Future Directions” is often the hardest part of the research statement. Took me multiple rewrites. Eventually, I found the following trick useful. Imagine you got the faculty position. Visualize yourself living

1

0

19

XY Han

@XYHan_

2 years

How does neural collapse connect to prior works on implicit max-margin separation like Lyu & Li 2019, Soudry et al 2018, and Nacson et al 2019? W.Ji, @2prime_PKU , Y.Zhang, @zhun_deng & @weijie444 solidifies the connection in their new #ICLR2022 paper. 9:30PM EDT!

1

5

18

XY Han

@XYHan_

7 months

@miniapeur Algebraic General Intelligence

0

1

17

XY Han

@XYHan_

5 months

The benefit of being a @Cornell alum: Everyday either has great weather or reminds me nostalgically of my time as a PhD student.

1

0

15

XY Han

@XYHan_

2 years

Had a great time with great people @iccopt2022 ! Big thanks to the organizers!

2

1

14

XY Han

@XYHan_

4 months

👏👏👏 Much needed and overdue. A huge personal pain point for me as an opt researcher is that popular constrained opt solvers (cvxpy, gurobi, mosek, etc) require specialized syntax for the constraints and end up moving back to CPU and so can't advantage of GPU matmult... (1/2)

Fabian Schaipp

@FSchaipp

5 months

Together with @phschiele1 , we wrote a package to solve constrained optimization problems, where all functions are arbitrary @PyTorch modules. This is mainly intended for optimization with pre-trained NNs as objective/constraints.

3

13

52

1

13

XY Han

@XYHan_

5 months

Pre-2010, it went the other way. I still see Stats folks who roll their eyes at CS seminar speakers who lack mathematical rigor... and CS folks who dismiss Stats speakers as useless-for-SoTA. We all find different problems interesting: nobody's better than anyone else.

Mathieu Alain

@miniapeur

5 months

11

50

506

1

13

XY Han

@XYHan_

1 year

When I’m asked where one might start to learn about Neural Collapse. This survey is 𝘢𝘭𝘸𝘢𝘺𝘴 among my top recommendations. Ecstatic to see it cross the finish line. Congrats @kvignesh1420 !! (The reviews & discussion are amazing too! Hits on some key points 👏👏👏.)

Accepted papers at TMLR

@TmlrPub

1 year

Neural Collapse: A Review on Modelling Principles and Generalization Vignesh Kothapalli. Action editor: Jeffrey Pennington. #classifier #generalization #deep

0

11

30

3

1

13

XY Han

@XYHan_

8 months

@bradneuberg The only way to get them is to keep paying Google per hour. You can’t just buy one with funding—whether it’s VC, academic, or otherwise. Plus, earlier on, it only worked with Tensorflow. It was only 3-4 years after that PyTorch compatibility came along. The performance was

0

12

XY Han

@XYHan_

1 month

@RonSmithBaby @alz_zyd_ LOL That's actually a typo on my part. You're right.

1

0

13

XY Han

@XYHan_

5 months

Make sure conferences reject you for writing exactly the papers you want written. That, you can live with.

David Perell

@david_perell

5 months

Dave Letterman’s advice to Jerry Seinfeld: “Make sure you fail doing exactly what you want to do. That, you can live with.“

55

824

5K

0

12

XY Han

@XYHan_

8 months

@Adam235711 It’s useful for the “surrogate loss” argument in theory. Specifically: (1) somebody develops a convex loss that doesn’t do too bad; (2) most of the time, it doesn’t actually catch on outside of the research group that developed it; (3) But, using it doesn’t change the behavior of

1

0

12

XY Han

@XYHan_

8 months

Important info for new PI's buying compute hardware! It's more than just GPUs. If you don't get the interconnect (go for infiniband) and storage type (go for NVME or SAS SSDs) right, you're gonna get bottlenecked by dataloading no matter how good your GPUs are.

Stas Bekman

@StasBekman

8 months

The Machine Learning Engineering Networking chapter has been updated with multiple provider intra- and inter-node connectivity information/specs and easy to use bandwidth comparison tables: If I'm still missing some commonly used

5

10

137

2

1

11

XY Han

@XYHan_

1 year

Neural collapse observes last-layer class variation collapses 𝘵𝘰𝘸𝘢𝘳𝘥𝘴 0 with training. 𝗕𝘂𝘁: As it does, one can 𝘴𝘵𝘪𝘭𝘭 find informative, fine-grained structures in the residual small variations at 𝘧𝘪𝘹𝘦𝘥 epochs (even ones that look “collapsed”!). Check out this

Wei Hu

@weihu_

1 year

Today at #ICML2023 , @YongyiYang7 is presenting "Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations." We discovered important fine-grained structure exists in NN representations despite the apparent "Neural Collapse." See you at 2-3:30pm!

1

3

40

0

2

12

XY Han

@XYHan_

1 year

Since this got quoted (thanks @damekdavis !), good time to update that Survey Descent is now published in the SIAM Journal on Optimization!

1

10

XY Han

@XYHan_

3 years

Intuitive, 2-min overview of Neural Collapse on @Medium . Thanks, @Sharad45 !

What is Neural Collapse?

Terminal Phase of Training(TPT) — Training beyond 0 misclassification rate. Aim is to reduce the loss as much as possible even if…

link.medium.com

2

5

10

XY Han

@XYHan_

7 months

@sp_monte_carlo "linear convergence" was confusing af until @prof_grimmer told me during the 2nd year of my PhD "linear means linear in log-scale". I actually added a footnote to my job market research statement just to not confuse non-specialists:

0

11

XY Han

@XYHan_

1 month

@docmilanfar Spoiler: You can find @docmilanfar ’s face by zooming enhancing into the red squares using the Pixel 9.

0

11

XY Han

@XYHan_

7 months

I want a LLM-upgrade plan that switches my subscription whenever a new benchmark comes out. (GPT 4.5 -> Gemini Pro 1.5 -> Claude 3 -> ???) Maybe with unlimited text and data? @Verizon @ATT

Anthropic

@AnthropicAI

7 months

Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.

570

2K

10K

1

0

10

XY Han

@XYHan_

5 months

What if you pass out-of-distribution data into NNs showing neural collapse? How's that useful? Turns out, the out-of-distribution data become orthogonal to in-distribution data & you can then use that to detect those OOD points.

1

5

10

XY Han

@XYHan_

3 months

The next PhD life hack:

Sheel Mohnot

@pitdesi

3 months

TIL Costco sells a pallet of freeze-dried food. Preppers can buy 5,400 servings of food with a 25 year shelf life for $2,500 Lots of pasta and rice dishes, some soups, oatmeals & milk. Just add water! 910,080 calories, 364 calories/$.

106

173

5K

1

3

10

XY Han

@XYHan_

4 months

@ben_golub They do. Candidates can and do ask schools to match offers. The power is usually in a Dean’s hands. Prob is Deans have to be fair across the school (ex. Art & Sci or Eng) they oversee. Paying a new CS prof more than a tenured math prof will piss ppl off, for example.

1

0

9

XY Han

@XYHan_

2 years

@DrJimFan @DeepMind Suppose I write some exponential time procedure in RASP. Say, exhaustive search for solution of traveling salesman problem. Would the compiled transformer then be able to always give me the solution in the constant time of a forward pass? What am I missing?

1

0

10

XY Han

@XYHan_

4 months

@miniapeur Same~

1

0

8

XY Han

@XYHan_

3 months

@miniapeur

0

9

XY Han

@XYHan_

7 months

While a previous work claims happiness peaks at 10,000 H100 GPUs; a new PNAS study shows that happiness continues to grow with resources up to 500,000 H100 GPUs for the top 30% of the GPU rich.

One study said happiness peaked at $75,000 in income. Now, economists say it's higher — by a lot.

It turns out that money can buy happiness for most people — although some may remain miserable no matter what.

www.cbsnews.com

Bojan Tunguz

@tunguz

7 months

I just want everyone to be happy and have enough money to afford 10,000 H100s.

27

17

338

1

0

9

XY Han

@XYHan_

3 months

These were my H100s in high school~ ☺️

0

9

XY Han

@XYHan_

4 months

Don't forget "the Delta Method" (AKA Taylor's Theorem)

Mathieu Alain

@miniapeur

4 months

11

95

1K

0

9

XY Han

@XYHan_

1 year

At my fastest during the job market, I could pack for a four-day trip from scratch in <30 mins. Now, I’ve deteriorated to… more than twice that. Didn’t know you could lose muscle memory on these things… 😥

2

0

8

XY Han

@XYHan_

4 months

@docmilanfar It’s also mindblowing that a still-active researcher was already a grad student and part of someone’s origin story a year before I was even born. 😵‍💫

1

0

9

XY Han

@XYHan_

2 years

My biggest question: Was $42 chosen because it’s the optimal price or because of the Hitchhiker’s Guide to the Galaxy reference? 🪐

Harish Garg

@harishkgarg

2 years

OpenAI's chatGPT Pro plan is out - $42/mo

82

124

1K

1

7

XY Han

@XYHan_

8 months

@sirbayes This is also my guess for why (L-)BFGS and bundle methods — which you hear a but about in the opt community, but not as much in ML — aren’t more popular.

0

8

XY Han

@XYHan_

1 year

@damekdavis Check out this talk by Dave Donoho at IHES! He elaborates on this point and the existence of two cultures (empirical results vs theorem proving) there.

David Donoho - The Bridge from Mathematical to Digital, and Back

Stephane Mallat has, across 30 years, evolved continuously as the fields of computational harmonic analysis, signal processing, and image processing have evo...

www.youtube.com

0

8

XY Han

@XYHan_

3 months

This is my default grading policy as well: • If you do A+ work with AI, you get an A+. • If the AI plagiarized or made things up, you get an 0 as if you did it yourself.

alz

@alz_zyd_

3 months

I just don't get why people are trying to detect whether AI is being used for writing instead of just grading whether the writing is good or bad. If student A uses AI and writes better than student B, student A should get a better grade than student B

569

89

2K

0

1

8

XY Han

@XYHan_

5 months

Also, in the tech specs, note the 1493 “privately owned” nodes. Those are nodes associated with specific PI groups (many containing GPUs). Sherlock contributors can use the idle nodes of other PI groups as well making effective number of shared GPUs much higher than 700.

0

8

XY Han

@XYHan_

5 months

I nominate the NeurIPS latex template for the Test of Time Award.

Zico Kolter

@zicokolter

3 years

Yearly reminder: To get natbib \citep and \citet commands working properly with NeurIPS citation style (numbers rather than author lists), use the following commands. \usepackage[nonatbib]{neurips_2021} \usepackage[numbers]{natbib} ... \bibliographystyle{abbrvnat}

4

18

302

0

7

XY Han

@XYHan_

3 months

@miniapeur There’s this guy who was a physicist who became a US senator for New Jersey. I remember lots of faculty at @Princeton liked him because it was nice getting represented by a real scientist. .

Rush Holt Jr. - Wikipedia

en.m.wikipedia.org

2

0

7

XY Han

@XYHan_

10 months

@YiMaTweets What do you think of WizardLM from PKU and Yi from as representatives of Chinese open source AI? They seem to be doing well in the leaderboards…

0

6

XY Han

@XYHan_

1 year

Honestly, I prefer reading clear GPT-assisted emails rather than deciphering intentions in unclear, "organic" ones. ChatGPT is effectively the modern version of spellcheck/Grammarly. The same for class assignments --- as long as students take responsibility for GPT-induced errors

Ravid Shwartz Ziv

@ziv_ravid

1 year

It's unjust to criticize students, especially non-native English speakers for using ChatGPT to communicate. They might be investing extra time in crafting these emails, navigating linguistic and cultural nuances. Just to clarify,this tweet was crafted with the help of ChatGPT!

4

1

40

1

0

7

XY Han

@XYHan_

2 years

This 💯

Zico Kolter

@zicokolter

2 years

I realize this is seemingly an unpopular opinion, but I can't get onboard with these Twitter criticisms of some of the recent #ICML2022 best paper awardees. I've been thinking about this all day. A thread... 🧵 1/N

20

85

910

0

7

XY Han

@XYHan_

4 months

"No management overhead or product cycles" & "insulated from short-term commercial pressures" is. literally. academia. But, instead of asking NSF for $200-500k, @ssi raised many times that from VCs purely on reputation. This is what happens when you beat the game🤯.

SSI Inc.

@ssi

4 months

Superintelligence is within reach. Building safe superintelligence (SSI) is the most important technical problem of our time. We've started the world’s first straight-shot SSI lab, with one goal and one product: a safe superintelligence. It’s called Safe Superintelligence

1K

2K

14K

1

0

7

XY Han

@XYHan_

9 months

@Adam235711 Anyone who can’t autocompile it in their heads isn’t the intended audience. 💪

0

7

XY Han

@XYHan_

5 months

Fixed my AC with GPT-4o by passing an image to it. 🥳👏

1

7

XY Han

@XYHan_

4 months

Sometimes, I wonder if I really became a better worker during my PhD or if it’s just because ChatGPT came out during that time.

The Technology Brother

@thetechbrother

4 months

How to write a for loop again?

11

7

204

0

7

XY Han

@XYHan_

1 year

@damekdavis My personal thoughts (influenced by co-teaching a course on this) is that the definition of "way forward" is slightly vague. If we define it as (1) creating new tools that push forward society. Then, even if there is something special about transformers and neural nets,

1

0

6

XY Han

@XYHan_

5 months

From Dave Donoho's Data Science at the Singularity:

Tsarathustra

@tsarnick

5 months

Eric Schmidt says AI is under-hyped because the scaling laws are continuing without any loss of power

27

101

543

1

2

7

XY Han

@XYHan_

4 months

@ben_golub My understanding (from having done a salary neg b/n a bschool and an eschool) is there’s flexibility in the pay, but the discrepancy can’t be too big and needs to be justifiable by how much money the dept’s masters program and alum donations pull in.

XY Han

@XYHan_

4 months

@zacharylipton Is this really a university-level decision? Isn’t dept funding and salaries is tied to the profitability of the corresponding Masters-level program? As in, Tepper’s MBA tuition ($39k) is 1.34x the SCS MS tuition ($29k) at CMU.

1

0

4

1

0

7

XY Han

@XYHan_

8 months

@Adam235711 From what I’ve seen, SOTA methods tend to come out of lots of trial-and-error by researchers who are good at having hunches about data and choosing which ones to act on. In implementation, they draw from their math-education to make design decisions. Since opt courses tends to

1

0

6

XY Han

@XYHan_

1 year

My texts look nice now~ 😁

0

6

XY Han

@XYHan_

9 months

@bhutanisanyam1 Such a guide would be immensely helpful. I am an academic AI researcher who recently went through the (quite challenging) process of building a GPU cluster. Questions I wished I understood beforehand (and still am still fuzzy on) are the following: 1) What should researchers

0

6

XY Han

@XYHan_

2 years

East Ithaca running trail through #dalle impressionism~

1

0

6

XY Han

@XYHan_

7 months

@QuanquanGu The M/M/1 queue is to Queuing theory what the Gaussian distribution is to statistics.

M/M/1 queue - Wikipedia

en.m.wikipedia.org

1

0

6

XY Han

@XYHan_

1 month

@docmilanfar I like that he looks like Math Gandalf and is, in fact, Math Gandalf.

0

5

XY Han

@XYHan_

7 months

In Operations Research, through the lens of DLD's Data Science at the Singularity, it's trickier to achieve both [FR1: Common Data] and [FR3: Common Benchmarks] since modeling context/structure in OR often entails modifying the data collection itself.

Vishal Gupta

@vishalguptaphd

7 months

@ProfKuangXu The challenge I’ve seen is that many benchmark datasets strip away a lot of problem context — like the miplib library. Without the richness of setting, it’s hard to use them broadly.

2

0

3

2

0

6

XY Han

@XYHan_

2 years

Interesting new paper proposing a NC-inspired loss that mitigates undesirable biases when training deep nets on imbalanced data. Builds upon the prior work of C.Fang, @hangfeng_he , @DrQiLong , & @weijie444 showing minority collapse under imbalanced training. (1/2)

Deep RL

@deep_rl

2 years

Neural Collapse Inspired Attraction-Repulsion-Balanced Loss for Imbalanced Learning - Liang Xie

0

2

1

2

6

XY Han

@XYHan_

2 years

Interested in a POST-DOC combining AI/ML and optimization? Prof. Baris Ata ( @ChicagoBooth ) is HIRING! Contact: Link:

0

1

6

XY Han

@XYHan_

6 months

The type of creativity you can’t replace (yet?) with AI~ 👏

Tiffany Ding

@tifding

6 months

The greatest accomplishment of my statistics career has been winning this year’s @UCBStatistics T-shirt design competition with a @SFBART -inspired shirt designed w/ @aashen12 ! {stats nerds} ∩ {public transit nerds} ≠ ∅ 📉🚅

4

9

72

0

5

XY Han

@XYHan_

3 months

@alz_zyd_ Uncomfortable part is that a non-negligible part of K-12 teachers’ skillset is memorizing and teaching students to go through the motions. If AI is allowed in K-12 classrooms, it calls into question the entire training of K-12 teachers trained pre-2022. It’s clear the curriculum

0

5

XY Han

@XYHan_

2 years

Library of Babel by Luis Borges. Generated using first few lines using #dalle !

Clément Canonne

@ccanonne_

2 years

Your periodic reminder that, no matter how novel your ideas and groundbreaking your paper, you already have been scooped by Jorge Luis Borges.

7

9

98

1

0

4

XY Han

@XYHan_

7 months

@miniapeur 𝚝𝚘𝚛𝚌𝚑.𝚕𝚒𝚗𝚊𝚕𝚐.𝚒𝚗𝚟(𝙰)

0

5

XY Han

@XYHan_

1 year

@DimitrisPapail They say every painting is a self-portrait...

1

0

5

XY Han

@XYHan_

10 months

@schmidtdominik_ @MinqiJiang What software did you use to generate those UMAP plots? They are beautiful…

1

0

5

XY Han

@XYHan_

4 months

Anyone know how to search the #ISMP2024 program? From the site, seems the only way is to click open each individual session or speaker name to see what talks there are for that particular session/person?? @math_opt

0

2

5

XY Han

@XYHan_

1 year

"Watercolor portrayal of a sunset scene where supercomputer mountains are silhouetted against a fiery sky. Streams of luminous data pour from their summits, filling a gleaming lake below that reflects the array of scientific discoveries." #DallE3 🤩

0

5

XY Han

@XYHan_

5 months

Check out this new #ICLR2024 paper by @Mouin_ben_Ammar , @nacim_belkhir , @SebastianGP13 , A. Manzanera, & @GianniFranchi10

NECO: NEural Collapse Based Out-of-distribution detection

Detecting out-of-distribution (OOD) data is a critical challenge in machine learning due to model overconfidence, often without awareness of their epistemological limits. We hypothesize that...

openreview.net

1

2

5

XY Han

@XYHan_

8 months

@ShiqianMa @sirbayes I thought about it from this angle too: Forward passing on a deep net is expensive. But I couldn't quite convince myself for the following reasons. (1) You have to do function evaluations for SGD too. In fact, linesearch evaluations are better evaluation/compute-wise because you

0

5

XY Han

@XYHan_

4 months

@ben_golub 😮 I didn’t know about that. I can only speak to the discrepancy between operation research (engineering) and operations management (business) that draws from the same pool of people. From what you say, a different mechanism does seem at work there… thanks for the insight!

1

0

5

XY Han

@XYHan_

3 years

Sparsity is achieved *without* the familiar l1 penalization. And it simply uses gradient descent on kernel loss without extra “tricks”. Worthwhile read with original ideas and new techniques.

cs.LG Papers

@arxiv_cs_LG

3 years

On the Self-Penalization Phenomenon in Feature Selection. Michael I. Jordan, Keli Liu, and Feng Ruan

1

0

1

0

5

XY Han

@XYHan_

4 months

@damekdavis @MountainOfMoon @HDSIUCSD @UCSanDiego Congrats to you both 🥳! I’m now waiting for the singularity event if you two ever end up in the same place. 🤠

0

5

XY Han

@XYHan_

3 months

@docmilanfar So that’s why some top players in a field act like asses. 🫏

1

0

5

XY Han

@XYHan_

7 months

@Ian_yzhu How long before you become the person who prints PDF, handmarks them, and then emails low-res scanned copy back? 😏

0

4

XY Han

@XYHan_

27 days

@xwang_lk There’s @CPALconf that will be in Stanford next year~

0

4

XY Han

@XYHan_

7 months

“I have a truly marvelous demonstration of 1M token context which this margin is too narrow to contain.”

Carlos E. Perez

@IntuitMachine

8 months

Demis Hassabis admits that Google has some secret sauce in how Gemini is able to process 1-10m token context windows. The extreme context length in Gemini 1.5 Pro "can't" be achieved "without some new Innovations". This is an astonishing development that seems to hint at

15

73

368

0

3

XY Han

@XYHan_

1 year

[Then:] Read the final, published paper. It’s the most polished! [Now:] Find the preprint. It’s formatted as the authors wanted without copyediting artifacts.

1

0

4