XY Han

@XYHan_

Followers
1,479
Following
1,131
Media
76
Statuses
1,440

Assistant Professor @ChicagoBooth | Papers: “Neural Collapse in Deep Nets” & “Survey Descent: Nonsmooth GD” | BSE @Princeton , MS @Stanford , PhD @Cornell

Chicago, IL
Joined November 2011
@XYHan_
XY Han
3 months
For those who love bagels
Tweet media one
@MathMatize
MathMatize Memes
3 months
For those who love linear algebra
Tweet media one
43
368
5K
15
239
6K
@XYHan_
XY Han
4 months
@miniapeur Clearly, he's stunned by how she got the TeX to render so nicely in a dm.
11
21
2K
@XYHan_
XY Han
6 months
During my PhD, my advisor would tell me “never use a symbol in text without reminding the reader what it is.” [Example] 𝘉𝘢𝘥: “So, 𝜑 is bounded.” 𝘎𝘰𝘰𝘥: “So, the value function 𝜑 is bounded.”
@johnjhorton
John Horton
6 months
I told a grad-student co-author "try to imagine me as the guy from 'Memento' who can't remember anything & needs clues to pick-up the thread on projects" and they said "I believe you, because you already used that memento analogy"
22
867
20K
20
121
2K
@XYHan_
XY Han
1 month
@alz_zyd_ Curious about how many did numpy (e.g. np.array(x) > 3) vs those that did list comprehension ([y if y>3 for y in x])...
12
4
542
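The exchange above contrasts two filtering idioms. A minimal sketch of both (not from the thread itself; note that the valid comprehension syntax puts the `if` clause after the loop clause, a slip the author acknowledges downthread):

```python
import numpy as np

# Two ways to pull out elements greater than 3, as discussed in the thread.
x = [1, 5, 2, 7, 3]

# NumPy: build a boolean mask, then index with it.
arr = np.array(x)
mask = arr > 3              # array([False,  True, False,  True, False])
numpy_result = arr[mask]    # array([5, 7])

# List comprehension: the filter clause goes after the loop clause.
comp_result = [y for y in x if y > 3]  # [5, 7]

print(list(numpy_result) == comp_result)  # True
```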
@XYHan_
XY Han
3 months
@alz_zyd_ The best part of the lore is this all happened because Candes and Tao started talking while waiting to pick up their kids.
Tweet media one
2
30
509
@XYHan_
XY Han
2 years
Job search completed: Excited to join @ChicagoBooth as an Assistant Professor of Operations Management starting July 2024! 🥳 Thank you to all my friends and mentors who helped me along the way 🙏🙏🙏.
Tweet media one
29
5
319
@XYHan_
XY Han
1 month
@alz_zyd_ Yeah, that's my sense too. I feel like "Python for Data Science/ML"-type classes never teach list comprehension and jump straight to numpy, so students who learn python that way never got exposed...
4
1
168
@XYHan_
XY Han
2 years
Honored to receive an ICLR 2022 Outstanding Paper Award for “Neural Collapse under MSE Loss” w/ Vardan Papyan and Dave Donoho! Come by #ICLR2022 on 4/26 1AM PST (Oral) & 4/27 6:30PM PST (Poster) to chat w/ us about Neural Collapse and its open questions!
Tweet media one
4
15
110
@XYHan_
XY Han
5 months
64 GPUs for one research lab is pretty nice. Across all of Stanford, @StanfordCompute has 700+ shared GPUs. Rumor is @StanfordData folks are talking of buying a new cluster of 1000+ GPUs. The point is valid, but the 64-GPU example feels misleading.
@tsarnick
Tsarathustra
5 months
Fei-Fei Li says Stanford's Natural Language computing lab has only 64 GPUs and academia is "falling off a cliff" relative to industry
99
205
1K
11
13
75
@XYHan_
XY Han
7 months
…raising the blood pressure of queueing theorists everywhere.
@_akhaliq
AK
7 months
Apple announces MM1 Methods, Analysis & Insights from Multimodal LLM Pre-training In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through
Tweet media one
16
177
950
3
1
57
@XYHan_
XY Han
7 months
@petergyang I will sign up for whichever credit card comes with free subscriptions to all five. @AmexBusiness @Chase @Citi
1
2
55
@XYHan_
XY Han
8 months
@sirbayes I thought a lot about this question: Did my PhD in a dept with optimizers who love linesearch. Within the classic optimization community (the SIOPT/ICCOPT/ISMP crowd) I think linesearch is pretty popular. Within ML, the problem is memory: you need to keep yet another copy of your
3
0
52
@XYHan_
XY Han
2 years
FOUR new Neural Collapse works accepted to #NeurIPS2022 investigating Neural Collapse (1) on different losses, (2) as modeled by Riemannian gradient flow, (3) as motivation for network classifier design, and (4) under class imbalance. Congrats to all the authors!! 🥳
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
8
46
@XYHan_
XY Han
9 months
@miniapeur I’m concerned it isn’t compatible with my Hagoromos…
0
0
39
@XYHan_
XY Han
1 year
Fun story Erhan Çinlar told us in Princeton’s ORFE 309 course: Back in the day, US academics were deciding what to call math-for-decision-making. A good name already existed: 𝘤𝘺𝘣𝘦𝘳𝘯𝘦𝘵𝘪𝘤𝘴. 𝗕𝘂𝘁 it was the Cold War and “cybernetics” was a term already used by the
@YiMaTweets
Yi Ma
1 year
After “neural networks” and “artificial intelligence”, likely the next name that should and will redeem itself is “cybernetics”.
5
13
118
2
2
37
@XYHan_
XY Han
3 years
@weijie444 Does it feel good to continue using "we" despite it being a single-author paper as well? 😅
1
0
28
@XYHan_
XY Han
2 years
Recruiting faculty or postdocs in OR or ML? Check out the @Cornell_ORIE PhD candidates on the academic market this year! 🥳
Tweet media one
0
6
28
@XYHan_
XY Han
2 years
TEN new Neural Collapse related submissions found in #ICLR2023 . 🤩🥳👏 The original NC paper took almost 3 years of exploration and experimentation to write. Extremely grateful to see so many now share our interest. 🙏🙏🙏
Tweet media one
Tweet media two
@abursuc
Andrei Bursuc
2 years
#ICLR2023 submissions are now visible or it's that time of the year when you realize that most of your #CVPR2023 ideas are already scooped 🙃
2
25
137
1
4
27
@XYHan_
XY Han
2 years
Amazing survey on the subtleties, historical contexts, and open questions of Neural Collapse. Very readable & comprehensive. One of the best so far! ⭐️⭐️⭐️⭐️⭐️ To authors @kvignesh1420 , E. Rasromani, & V. Awatramani: Thanks x💯 for your interest in NC and this fantastic review!
Tweet media one
Tweet media two
Tweet media three
@arxivml
午後のarXiv
2 years
"Neural Collapse: A Review on Modelling Principles and Generalization", Vignesh Kothapalli, Ebrahim Rasromani, Vasu…
0
0
4
4
11
26
@XYHan_
XY Han
2 years
Two exciting new papers examining Neural Collapse in #ICML2022 (both spotlights!). Congratulations to the authors! (T. Tirer and @joanbruna ) and ( J. Zhou, X. Li, T. Ding, C. You, @Qing_Qu_1006 , and @ZhihuiZhu )
Tweet media one
Tweet media two
0
3
27
@XYHan_
XY Han
3 years
Proud to share this new work with my supervisor, Adrian Lewis, in which we develop a multipoint generalization of gradient descent for nonsmooth optimization. (1/4)
@mathOCb
arXiv math.OC Optimization and Control
3 years
X.Y. Han, Adrian S. Lewis: Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization
1
2
6
2
6
25
@XYHan_
XY Han
3 months
Feynman didn't live with people making 20k new submissions to ArXiv a month.
Tweet media one
@PhysInHistory
Physics In History
3 months
Know how to solve every problem that has been solved. - Richard Feynman
Tweet media one
60
215
2K
0
1
24
@XYHan_
XY Han
2 years
Why is Neural Collapse interesting? This and other discussions in this new interview. Thanks @aihuborg !
@aihuborg
AIhub
2 years
We interviewed @XYHan_ , Vardan Papyan, and David Donoho about their ICLR outstanding paper on the neural collapse phenomenon. Read what they had to say here:
Tweet media one
0
1
10
1
7
24
@XYHan_
XY Han
4 months
Fun stories from @Princeton : During undergrad, Tarjan subbed for one of our Intro Algorithms (COS226) lectures. He started with this beautiful remark: “Hi, I’m Bob Tarjan… Not Bob Sedgewick. Bob Sedgewick wrote your textbook. I wrote the algorithm 𝘪𝘯 your textbook.”
@ccanonne_
Clément Canonne
4 months
Bob Tarjan, starting his talk on sorting with partial information at the @SimonsInstitute Sublinear Algorithms program
Tweet media one
6
7
99
0
0
22
@XYHan_
XY Han
3 months
@YiMaTweets Knowledge = ∫ intelligence dt + c. 🤓
2
1
20
@XYHan_
XY Han
11 months
Turning off phone email notifications has significantly improved my quality of life.
2
0
20
@XYHan_
XY Han
1 year
On “Future Directions”: A Suggestion for the Academic Job Market “Future Directions” is often the hardest part of the research statement. Took me multiple rewrites. Eventually, I found the following trick useful. Imagine you got the faculty position. Visualize yourself living
1
0
19
@XYHan_
XY Han
2 years
How does neural collapse connect to prior works on implicit max-margin separation like Lyu & Li 2019, Soudry et al 2018, and Nacson et al 2019? W.Ji, @2prime_PKU , Y.Zhang, @zhun_deng & @weijie444 solidify the connection in their new #ICLR2022 paper. 9:30PM EDT!
Tweet media one
1
5
18
@XYHan_
XY Han
7 months
@miniapeur Algebraic General Intelligence
0
1
17
@XYHan_
XY Han
5 months
The benefit of being a @Cornell alum: Every day either has great weather or reminds me nostalgically of my time as a PhD student.
1
0
15
@XYHan_
XY Han
2 years
Had a great time with great people @iccopt2022 ! Big thanks to the organizers!
2
1
14
@XYHan_
XY Han
4 months
👏👏👏 Much needed and overdue. A huge personal pain point for me as an opt researcher is that popular constrained opt solvers (cvxpy, gurobi, mosek, etc) require specialized syntax for the constraints and end up moving back to CPU, so they can't take advantage of GPU matmul... (1/2)
@FSchaipp
Fabian Schaipp
5 months
Together with @phschiele1 , we wrote a package to solve constrained optimization problems, where all functions are arbitrary @PyTorch modules. This is mainly intended for optimization with pre-trained NNs as objective/constraints.
3
13
52
1
1
13
@XYHan_
XY Han
5 months
Pre-2010, it went the other way. I still see Stats folks who roll their eyes at CS seminar speakers who lack mathematical rigor... and CS folks who dismiss Stats speakers as useless-for-SoTA. We all find different problems interesting: nobody's better than anyone else.
@miniapeur
Mathieu Alain
5 months
Tweet media one
11
50
506
1
1
13
@XYHan_
XY Han
1 year
When I’m asked where one might start to learn about Neural Collapse, this survey is 𝘢𝘭𝘸𝘢𝘺𝘴 among my top recommendations. Ecstatic to see it cross the finish line. Congrats @kvignesh1420 !! (The reviews & discussion are amazing too! Hits on some key points 👏👏👏.)
@TmlrPub
Accepted papers at TMLR
1 year
Neural Collapse: A Review on Modelling Principles and Generalization Vignesh Kothapalli. Action editor: Jeffrey Pennington. #classifier #generalization #deep
0
11
30
3
1
13
@XYHan_
XY Han
8 months
@bradneuberg The only way to get them is to keep paying Google per hour. You can’t just buy one with funding—whether it’s VC, academic, or otherwise. Plus, earlier on, it only worked with Tensorflow. It was only 3-4 years after that PyTorch compatibility came along. The performance was
0
0
12
@XYHan_
XY Han
1 month
@RonSmithBaby @alz_zyd_ LOL That's actually a typo on my part. You're right.
1
0
13
@XYHan_
XY Han
5 months
Make sure conferences reject you for writing exactly the papers you want written. That, you can live with.
@david_perell
David Perell
5 months
Dave Letterman’s advice to Jerry Seinfeld: “Make sure you fail doing exactly what you want to do. That, you can live with.“
55
824
5K
0
0
12
@XYHan_
XY Han
8 months
@Adam235711 It’s useful for the “surrogate loss” argument in theory. Specifically: (1) somebody develops a convex loss that doesn’t do too bad; (2) most of the time, it doesn’t actually catch on outside of the research group that developed it; (3) But, using it doesn’t change the behavior of
1
0
12
@XYHan_
XY Han
8 months
Important info for new PIs buying compute hardware! It's more than just GPUs. If you don't get the interconnect (go for infiniband) and storage type (go for NVMe or SAS SSDs) right, you're gonna get bottlenecked by dataloading no matter how good your GPUs are.
@StasBekman
Stas Bekman
8 months
The Machine Learning Engineering Networking chapter has been updated with multiple provider intra- and inter-node connectivity information/specs and easy to use bandwidth comparison tables: If I'm still missing some commonly used
Tweet media one
5
10
137
2
1
11
@XYHan_
XY Han
1 year
Neural collapse observes last-layer class variation collapses 𝘵𝘰𝘸𝘢𝘳𝘥𝘴 0 with training. 𝗕𝘂𝘁: As it does, one can 𝘴𝘵𝘪𝘭𝘭 find informative, fine-grained structures in the residual small variations at 𝘧𝘪𝘹𝘦𝘥 epochs (even ones that look “collapsed”!). Check out this
@weihu_
Wei Hu
1 year
Today at #ICML2023 , @YongyiYang7 is presenting "Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations." We discovered important fine-grained structure exists in NN representations despite the apparent "Neural Collapse." See you at 2-3:30pm!
Tweet media one
1
3
40
0
2
12
@XYHan_
XY Han
1 year
Since this got quoted (thanks @damekdavis !), good time to update that Survey Descent is now published in the SIAM Journal on Optimization!
Tweet media one
Tweet media two
Tweet media three
1
1
10
@XYHan_
XY Han
7 months
@sp_monte_carlo "linear convergence" was confusing af until @prof_grimmer told me during the 2nd year of my PhD "linear means linear in log-scale". I actually added a footnote to my job market research statement just to not confuse non-specialists:
Tweet media one
0
0
11
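A tiny numeric sketch (illustrative numbers, not from the tweet) of why "linear means linear in log-scale": a linearly convergent method has error e_k = C·ρ^k with 0 < ρ < 1, so log(e_k) falls along a straight line with slope log ρ.

```python
import numpy as np

# Linear convergence: the error shrinks by a constant factor rho each step.
C, rho = 1.0, 0.5
k = np.arange(10)
errors = C * rho ** k

# On a log scale the curve is a straight line: consecutive differences
# of log(errors) are all equal to log(rho).
slopes = np.diff(np.log(errors))
print(np.allclose(slopes, np.log(rho)))  # True
```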
@XYHan_
XY Han
1 month
@docmilanfar Spoiler: You can find @docmilanfar ’s face by zooming and enhancing into the red squares using the Pixel 9.
0
0
11
@XYHan_
XY Han
7 months
I want an LLM-upgrade plan that switches my subscription whenever a new benchmark comes out. (GPT 4.5 -> Gemini Pro 1.5 -> Claude 3 -> ???) Maybe with unlimited text and data? @Verizon @ATT
@AnthropicAI
Anthropic
7 months
Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.
Tweet media one
570
2K
10K
1
0
10
@XYHan_
XY Han
5 months
What if you pass out-of-distribution data into NNs showing neural collapse? How's that useful? Turns out, the out-of-distribution data become orthogonal to in-distribution data & you can then use that to detect those OOD points.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
5
10
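A hypothetical toy sketch of the detection idea described above (the feature vectors, class means, and threshold are all made up for illustration; this is not the paper's method verbatim): if OOD features are near-orthogonal to the in-distribution class means, a small maximum cosine similarity flags them.

```python
import numpy as np

def max_cosine_to_means(feature, class_means):
    # Largest cosine similarity between a feature vector and any class mean.
    f = feature / np.linalg.norm(feature)
    M = class_means / np.linalg.norm(class_means, axis=1, keepdims=True)
    return float(np.max(M @ f))

# Toy setup: two unit class means; one feature near a mean, one orthogonal.
class_means = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0]])
in_dist = np.array([0.9, 0.1, 0.0])
ood = np.array([0.0, 0.0, 1.0])

threshold = 0.5  # illustrative cutoff
print(max_cosine_to_means(in_dist, class_means) > threshold)  # True
print(max_cosine_to_means(ood, class_means) > threshold)      # False
```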
@XYHan_
XY Han
3 months
The next PhD life hack:
@pitdesi
Sheel Mohnot
3 months
TIL Costco sells a pallet of freeze-dried food. Preppers can buy 5,400 servings of food with a 25 year shelf life for $2,500 Lots of pasta and rice dishes, some soups, oatmeals & milk. Just add water! 910,080 calories, 364 calories/$.
Tweet media one
106
173
5K
1
3
10
@XYHan_
XY Han
4 months
@ben_golub They do. Candidates can and do ask schools to match offers. The power is usually in a Dean’s hands. Prob is Deans have to be fair across the school (ex. Art & Sci or Eng) they oversee. Paying a new CS prof more than a tenured math prof will piss ppl off, for example.
1
0
9
@XYHan_
XY Han
2 years
@DrJimFan @DeepMind Suppose I write some exponential time procedure in RASP. Say, exhaustive search for solution of traveling salesman problem. Would the compiled transformer then be able to always give me the solution in the constant time of a forward pass? What am I missing?
1
0
10
@XYHan_
XY Han
4 months
1
0
8
@XYHan_
XY Han
3 months
Tweet media one
0
0
9
@XYHan_
XY Han
7 months
While a previous work claims happiness peaks at 10,000 H100 GPUs; a new PNAS study shows that happiness continues to grow with resources up to 500,000 H100 GPUs for the top 30% of the GPU rich.
@tunguz
Bojan Tunguz
7 months
I just want everyone to be happy and have enough money to afford 10,000 H100s.
27
17
338
1
0
9
@XYHan_
XY Han
3 months
These were my H100s in high school~ ☺️
Tweet media one
0
0
9
@XYHan_
XY Han
4 months
Don't forget "the Delta Method" (AKA Taylor's Theorem)
@miniapeur
Mathieu Alain
4 months
Tweet media one
11
95
1K
0
0
9
@XYHan_
XY Han
1 year
At my fastest during the job market, I could pack for a four-day trip from scratch in <30 mins. Now, I’ve deteriorated to… more than twice that. Didn’t know you could lose muscle memory on these things… 😥
2
0
8
@XYHan_
XY Han
4 months
@docmilanfar It’s also mindblowing that a still-active researcher was already a grad student and part of someone’s origin story a year before I was even born. 😵‍💫
1
0
9
@XYHan_
XY Han
2 years
My biggest question: Was $42 chosen because it’s the optimal price or because of the Hitchhiker’s Guide to the Galaxy reference? 🪐
@harishkgarg
Harish Garg
2 years
OpenAI's chatGPT Pro plan is out - $42/mo
Tweet media one
82
124
1K
1
1
7
@XYHan_
XY Han
8 months
@sirbayes This is also my guess for why (L-)BFGS and bundle methods — which you hear a bit about in the opt community, but not as much in ML — aren’t more popular.
0
0
8
@XYHan_
XY Han
3 months
This is my default grading policy as well: • If you do A+ work with AI, you get an A+. • If the AI plagiarized or made things up, you get a 0 as if you did it yourself.
@alz_zyd_
alz
3 months
I just don't get why people are trying to detect whether AI is being used for writing instead of just grading whether the writing is good or bad. If student A uses AI and writes better than student B, student A should get a better grade than student B
569
89
2K
0
1
8
@XYHan_
XY Han
5 months
Also, in the tech specs, note the 1493 “privately owned” nodes. Those are nodes associated with specific PI groups (many containing GPUs). Sherlock contributors can use the idle nodes of other PI groups as well, making the effective number of shared GPUs much higher than 700.
0
0
8
@XYHan_
XY Han
5 months
I nominate the NeurIPS latex template for the Test of Time Award.
@zicokolter
Zico Kolter
3 years
Yearly reminder: To get natbib \citep and \citet commands working properly with NeurIPS citation style (numbers rather than author lists), use the following commands. \usepackage[nonatbib]{neurips_2021} \usepackage[numbers]{natbib} ... \bibliographystyle{abbrvnat}
4
18
302
0
0
7
@XYHan_
XY Han
3 months
@miniapeur There’s this guy who was a physicist who became a US senator for New Jersey. I remember lots of faculty at @Princeton liked him because it was nice getting represented by a real scientist.
2
0
7
@XYHan_
XY Han
10 months
@YiMaTweets What do you think of WizardLM from PKU and Yi as representatives of Chinese open source AI? They seem to be doing well in the leaderboards…
0
0
6
@XYHan_
XY Han
1 year
Honestly, I prefer reading clear GPT-assisted emails rather than deciphering intentions in unclear, "organic" ones. ChatGPT is effectively the modern version of spellcheck/Grammarly. The same for class assignments --- as long as students take responsibility for GPT-induced errors
@ziv_ravid
Ravid Shwartz Ziv
1 year
It's unjust to criticize students, especially non-native English speakers for using ChatGPT to communicate. They might be investing extra time in crafting these emails, navigating linguistic and cultural nuances. Just to clarify,this tweet was crafted with the help of ChatGPT!
4
1
40
1
0
7
@XYHan_
XY Han
2 years
This 💯
@zicokolter
Zico Kolter
2 years
I realize this is seemingly an unpopular opinion, but I can't get onboard with these Twitter criticisms of some of the recent #ICML2022 best paper awardees. I've been thinking about this all day. A thread... 🧵 1/N
20
85
910
0
0
7
@XYHan_
XY Han
4 months
"No management overhead or product cycles" & "insulated from short-term commercial pressures" is. literally. academia. But, instead of asking NSF for $200-500k, @ssi raised many times that from VCs purely on reputation. This is what happens when you beat the game🤯.
@ssi
SSI Inc.
4 months
Superintelligence is within reach. Building safe superintelligence (SSI) is the most important technical problem of our​​ time. We've started the world’s first straight-shot SSI lab, with one goal and one product: a safe superintelligence. It’s called Safe Superintelligence
1K
2K
14K
1
0
7
@XYHan_
XY Han
9 months
@Adam235711 Anyone who can’t autocompile it in their heads isn’t the intended audience. 💪
0
0
7
@XYHan_
XY Han
5 months
Fixed my AC with GPT-4o by passing an image to it. 🥳👏
Tweet media one
1
1
7
@XYHan_
XY Han
4 months
Sometimes, I wonder if I really became a better worker during my PhD or if it’s just because ChatGPT came out during that time.
@thetechbrother
The Technology Brother
4 months
How to write a for loop again?
Tweet media one
11
7
204
0
0
7
@XYHan_
XY Han
1 year
@damekdavis My personal thoughts (influenced by co-teaching a course on this) is that the definition of "way forward" is slightly vague. If we define it as (1) creating new tools that push forward society. Then, even if there is something special about transformers and neural nets,
Tweet media one
1
0
6
@XYHan_
XY Han
5 months
From Dave Donoho's Data Science at the Singularity:
Tweet media one
@tsarnick
Tsarathustra
5 months
Eric Schmidt says AI is under-hyped because the scaling laws are continuing without any loss of power
27
101
543
1
2
7
@XYHan_
XY Han
4 months
@ben_golub My understanding (from having done a salary neg b/n a bschool and an eschool) is there’s flexibility in the pay, but the discrepancy can’t be too big and needs to be justifiable by how much money the dept’s masters program and alum donations pull in.
@XYHan_
XY Han
4 months
@zacharylipton Is this really a university-level decision? Isn’t dept funding and salaries is tied to the profitability of the corresponding Masters-level program? As in, Tepper’s MBA tuition ($39k) is 1.34x the SCS MS tuition ($29k) at CMU.
1
0
4
1
0
7
@XYHan_
XY Han
8 months
@Adam235711 From what I’ve seen, SOTA methods tend to come out of lots of trial-and-error by researchers who are good at having hunches about data and choosing which ones to act on. In implementation, they draw from their math-education to make design decisions. Since opt courses tends to
1
0
6
@XYHan_
XY Han
1 year
My texts look nice now~ 😁
Tweet media one
0
0
6
@XYHan_
XY Han
9 months
@bhutanisanyam1 Such a guide would be immensely helpful. I am an academic AI researcher who recently went through the (quite challenging) process of building a GPU cluster. Questions I wished I understood beforehand (and am still fuzzy on) are the following: 1) What should researchers
0
0
6
@XYHan_
XY Han
2 years
East Ithaca running trail through #dalle impressionism~
Tweet media one
1
0
6
@XYHan_
XY Han
7 months
@QuanquanGu The M/M/1 queue is to Queuing theory what the Gaussian distribution is to statistics.
1
0
6
@XYHan_
XY Han
1 month
@docmilanfar I like that he looks like Math Gandalf and is, in fact, Math Gandalf.
0
0
5
@XYHan_
XY Han
7 months
In Operations Research, through the lens of DLD's Data Science at the Singularity, it's trickier to achieve both [FR1: Common Data] and [FR3: Common Benchmarks] since modeling context/structure in OR often entails modifying the data collection itself.
Tweet media one
Tweet media two
Tweet media three
@vishalguptaphd
Vishal Gupta
7 months
@ProfKuangXu The challenge I’ve seen is that many benchmark datasets strip away a lot of problem context — like the miplib library. Without the richness of setting, it’s hard to use them broadly.
2
0
3
2
0
6
@XYHan_
XY Han
2 years
Interesting new paper proposing a NC-inspired loss that mitigates undesirable biases when training deep nets on imbalanced data. Builds upon the prior work of C.Fang, @hangfeng_he , @DrQiLong , & @weijie444 showing minority collapse under imbalanced training. (1/2)
@deep_rl
Deep RL
2 years
Neural Collapse Inspired Attraction-Repulsion-Balanced Loss for Imbalanced Learning - Liang Xie
0
0
2
1
2
6
@XYHan_
XY Han
2 years
Interested in a POST-DOC combining AI/ML and optimization? Prof. Baris Ata ( @ChicagoBooth ) is HIRING! Contact: Link:
Tweet media one
Tweet media two
0
1
6
@XYHan_
XY Han
6 months
The type of creativity you can’t replace (yet?) with AI~ 👏
@tifding
Tiffany Ding
6 months
The greatest accomplishment of my statistics career has been winning this year’s @UCBStatistics T-shirt design competition with a @SFBART -inspired shirt designed w/ @aashen12 ! {stats nerds} ∩ {public transit nerds} ≠ ∅ 📉🚅
Tweet media one
4
9
72
0
0
5
@XYHan_
XY Han
3 months
@alz_zyd_ Uncomfortable part is that a non-negligible part of K-12 teachers’ skillset is memorizing and teaching students to go through the motions. If AI is allowed in K-12 classrooms, it calls into question the entire training of K-12 teachers trained pre-2022. It’s clear the curriculum
0
0
5
@XYHan_
XY Han
2 years
Library of Babel by Jorge Luis Borges. Generated from its first few lines using #dalle !
Tweet media one
@ccanonne_
Clément Canonne
2 years
Your periodic reminder that, no matter how novel your ideas and groundbreaking your paper, you already have been scooped by Jorge Luis Borges.
7
9
98
1
0
4
@XYHan_
XY Han
7 months
@miniapeur torch.linalg.inv(A)
0
0
5
@XYHan_
XY Han
1 year
@DimitrisPapail They say every painting is a self-portrait...
1
0
5
@XYHan_
XY Han
10 months
@schmidtdominik_ @MinqiJiang What software did you use to generate those UMAP plots? They are beautiful…
1
0
5
@XYHan_
XY Han
4 months
Anyone know how to search the #ISMP2024 program? From the site, seems the only way is to click open each individual session or speaker name to see what talks there are for that particular session/person?? @math_opt
0
2
5
@XYHan_
XY Han
1 year
"Watercolor portrayal of a sunset scene where supercomputer mountains are silhouetted against a fiery sky. Streams of luminous data pour from their summits, filling a gleaming lake below that reflects the array of scientific discoveries." #DallE3 🤩
Tweet media one
0
0
5
@XYHan_
XY Han
8 months
@ShiqianMa @sirbayes I thought about it from this angle too: Forward passing on a deep net is expensive. But I couldn't quite convince myself for the following reasons. (1) You have to do function evaluations for SGD too. In fact, linesearch evaluations are better evaluation/compute-wise because you
0
0
5
@XYHan_
XY Han
4 months
@ben_golub 😮 I didn’t know about that. I can only speak to the discrepancy between operations research (engineering) and operations management (business), which draw from the same pool of people. From what you say, a different mechanism does seem at work there… thanks for the insight!
1
0
5
@XYHan_
XY Han
3 years
Sparsity is achieved *without* the familiar l1 penalization. And it simply uses gradient descent on kernel loss without extra “tricks”. Worthwhile read with original ideas and new techniques.
@arxiv_cs_LG
cs.LG Papers
3 years
On the Self-Penalization Phenomenon in Feature Selection. Michael I. Jordan, Keli Liu, and Feng Ruan
1
0
1
1
0
5
@XYHan_
XY Han
4 months
@damekdavis @MountainOfMoon @HDSIUCSD @UCSanDiego Congrats to you both 🥳! I’m now waiting for the singularity event if you two ever end up in the same place. 🤠
0
0
5
@XYHan_
XY Han
3 months
@docmilanfar So that’s why some top players in a field act like asses. 🫏
1
0
5
@XYHan_
XY Han
7 months
@Ian_yzhu How long before you become the person who prints PDF, handmarks them, and then emails low-res scanned copy back? 😏
0
0
4
@XYHan_
XY Han
27 days
@xwang_lk There’s @CPALconf that will be at Stanford next year~
0
0
4
@XYHan_
XY Han
7 months
“I have a truly marvelous demonstration of 1M token context which this margin is too narrow to contain.”
@IntuitMachine
Carlos E. Perez
8 months
Demis Hassabis admits that Google has some secret sauce in how Gemini is able to process 1-10m token context windows. The extreme context length in Gemini 1.5 Pro "can't" be achieved "without some new Innovations". This is an astonishing development that seems to hint at
15
73
368
0
0
3
@XYHan_
XY Han
1 year
[Then:] Read the final, published paper. It’s the most polished! [Now:] Find the preprint. It’s formatted as the authors wanted without copyediting artifacts.
1
0
4