Fang Han Profile
Fang Han

@johnleibniz

Followers: 912 · Following: 191 · Media: 17 · Statuses: 239
@johnleibniz
Fang Han
3 years
Not a junior researcher fighting for space in the Annals of Statistics anymore… but hey, every acceptance still deserves a new “bound”.
Tweet media one
Tweet media two
4
2
180
@johnleibniz
Fang Han
3 months
Wow, impressive!!! My morning attempt using martingale theory only...
Tweet media one
@__SimonCoste__
Simon Coste ꙮ
3 months
I only knew one proof of the DKW inequality and it's not easy at all! Nice achievement
0
10
60
0
15
165
@johnleibniz
Fang Han
1 month
I put it in my book draft:
Tweet media one
@sp_monte_carlo
Sam Power
1 month
Baller (from )
Tweet media one
16
34
344
3
11
135
@johnleibniz
Fang Han
4 months
My favorite quote of Talagrand (Probability in Banach Spaces, with Ledoux). Talagrand has the magic to turn complex things simple.
Tweet media one
@johnleibniz
Fang Han
4 months
The real challenge in VC-type theorems, which is usually hidden in learning theory, is the measurability requirement on the studied objects. This is a long-forsaken bug that, once you notice it, is hard to get away from.
0
2
21
1
9
68
@johnleibniz
Fang Han
8 months
This is a fun collaboration, in which I learnt a lot from my two coauthors Peng ( @pengding00 ) and Zhexiao ( @zzzxlin ). A special shout-out to Zhexiao, who just started his 2nd year of PhD study (!).
@ecmaEditors
Econometrica
8 months
We revisit the Abadie-Imbens study on nearest neighbor matching and show that, with a diverging number of nearest neighbors, matching estimators can be doubly robust and semiparametrically efficient for estimating the average treatment effect
Tweet media one
0
52
199
5
6
67
@johnleibniz
Fang Han
4 years
Hmmm... 3 AoS papers in one issue; what should I say? congrats to myself 🤪? Random matrix theory: Rank correlation: Variance estimation:
1
0
59
@johnleibniz
Fang Han
1 year
Rina, Peng ( @pengding00 ), Nicole, and I are organizing an IMSI workshop aimed at forging connections between causal inference, distribution-free methods, and probability theory, within the overarching theme of "permutation".
2
12
58
@johnleibniz
Fang Han
1 year
The dust finally settles: my students Yandi Shen () & Hongjian Shi () will join the stats departments at 𝗖𝗠𝗨 & 𝗪𝗮𝘁𝗲𝗿𝗹𝗼𝗼 as TT asst profs. They're true scholars and I feel fortunate to have worked with them. #AcademicPride
2
4
55
@johnleibniz
Fang Han
6 months
If you keep convolving the same density function with itself while deliberately keeping the mean and variance stable, it eventually converges to the maximum-entropy distribution under that mean/variance constraint. The CLT is confirming the second law of thermodynamics.
@gabrielpeyre
Gabriel Peyré
6 months
The central limit theorem equivalently reads as the convergence of iterated convolutions.
Tweet media one
21
264
2K
2
1
55
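A minimal numerical sketch of the claim above (illustrative only, not from the thread): convolve a uniform density with itself repeatedly, rescaling each time so the variance stays fixed, and the result approaches the Gaussian with the same mean and variance.

```python
import numpy as np

# Start from the uniform density on [-1, 1] (mean 0, variance 1/3),
# tabulated on a fine symmetric grid.
dx = 0.01
x = np.arange(-10, 10 + dx, dx)
p = np.where(np.abs(x) <= 1, 0.5, 0.0)

for _ in range(6):
    # Density of the sum of two independent copies (numerical convolution);
    # mode="same" keeps the central part, exact here since everything is
    # centered at 0.
    p = np.convolve(p, p, mode="same") * dx
    # Rescale X -> X / sqrt(2) so the variance stays at 1/3 throughout.
    p = np.sqrt(2) * np.interp(np.sqrt(2) * x, x, p)

# Compare with the N(0, 1/3) density: the sup-distance shrinks each round.
gauss = np.exp(-x**2 / (2 / 3)) / np.sqrt(2 * np.pi / 3)
print(f"sup |p - gauss| = {np.max(np.abs(p - gauss)):.4f}")
```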
@johnleibniz
Fang Han
2 months
This Le Cam-style paper is going to appear in an upcoming issue of 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗸𝗮. The brilliant first author, Yihui, will join stat @Wharton in the coming year; yes, he is still an undergrad at math @PKU !
@mathSTb
arXiv math.ST Statistics Theory
8 months
Yihui He, Fang Han: On propensity score matching with a diverging number of matches
0
3
14
3
3
52
@johnleibniz
Fang Han
24 days
This paper is going to appear in a future issue of the Annals of Applied Probability; it gives the limiting distribution of Chatterjee’s correlation when the data are supported over a manifold.
0
4
42
@johnleibniz
Fang Han
6 months
The paper () will now appear in a future issue of 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗸𝗮. The message is one line: Chatterjee's rank correlation is root-n consistent, asymptotically normal, but 𝗶𝗿𝗿𝗲𝗴𝘂𝗹𝗮𝗿, and hence bootstrap inconsistent.
@johnleibniz
Fang Han
1 year
Ok, the literature review is done: Beran (1997) showed that in LAN models, bootstrap consistency is equivalent to the estimator being regular; this should be a textbook result IMHO.
4
0
28
1
3
38
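For readers who want to try the statistic discussed above, here is a minimal implementation of Chatterjee's rank correlation (the no-ties formula from Chatterjee's 2021 JASA paper); this sketch is illustrative, not the author's code.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's xi_n, assuming no ties: sort the sample by x, take the
    ranks r_1..r_n of y in that order, and return
    1 - 3 * sum_i |r_{i+1} - r_i| / (n^2 - 1)."""
    r = np.argsort(np.argsort(y[np.argsort(x)])) + 1  # Y-ranks, X-sorted
    n = len(x)
    return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n**2 - 1)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5000)
print(chatterjee_xi(x, rng.standard_normal(5000)))  # ~0 under independence
print(chatterjee_xi(x, x**2))  # ~1: Y is a function of X, yet Pearson corr ~0
```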
@johnleibniz
Fang Han
3 years
: Our holiday gift to all fans of matching methods: in strong contrast to the common belief, Abadie and Imbens's bias-corrected nearest neighbor (NN) matching actually already gives a doubly robust and semiparametrically efficient estimator of the ATE.
Tweet media one
2
5
39
@johnleibniz
Fang Han
5 months
This deserves to be said repeatedly: conformal prediction gives marginal instead of conditional coverage, and it rarely works for time series prediction.
@beenwrekt
Ben Recht
5 months
What does conformal prediction actually guarantee? I predict it’s not what you want.
7
14
114
3
4
35
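A minimal split-conformal sketch illustrating the point (a toy example, not from either thread): the ~90% guarantee is marginal, averaged over X; coverage at a fixed x can be quite different.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 2000, 0.1
x = rng.uniform(-2, 2, n)
y = 2 * x + (0.2 + np.abs(x)) * rng.standard_normal(n)  # heteroskedastic noise

train, cal, test = np.split(rng.permutation(n), [n // 3, 2 * n // 3])
b1, b0 = np.polyfit(x[train], y[train], 1)        # fit on the training fold
scores = np.abs(y[cal] - (b0 + b1 * x[cal]))      # calibration residuals
level = min(1.0, np.ceil((len(cal) + 1) * (1 - alpha)) / len(cal))
q = np.quantile(scores, level)                    # conformal quantile

covered = np.abs(y[test] - (b0 + b1 * x[test])) <= q
print(f"marginal coverage: {covered.mean():.3f}")                        # ~0.90
print(f"coverage where |x| > 1.5: {covered[np.abs(x[test]) > 1.5].mean():.3f}")
# the second number falls well below 0.90: constant-width intervals
# over-cover near x = 0 and under-cover where the noise is large
```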
@johnleibniz
Fang Han
2 years
"On the power of Chatterjee's rank correlation" is currently among 𝘁𝗵𝗲 𝗺𝗼𝘀𝘁 𝗿𝗲𝗮𝗱 𝗮𝗿𝘁𝗶𝗰𝗹𝗲𝘀 𝗶𝗻 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗸𝗮; please take a look if interested in measuring functional dependence, Le Cam's method, or rank/permutation statistics.
1
2
34
@johnleibniz
Fang Han
2 years
Accepted to 𝗕𝗲𝗿𝗻𝗼𝘂𝗹𝗹𝗶. This is the first of a series of papers on Azadkia and Chatterjee's brilliant idea. Its real meat is a new CLT that solves their 2nd conjecture. A complete storyline can be found in this slide deck:
Tweet media one
@mathSTb
arXiv math.ST Statistics Theory
3 years
Hongjian Shi, Mathias Drton, Fang Han: On Azadkia-Chatterjee's conditional dependence coefficient
1
1
2
2
6
30
@johnleibniz
Fang Han
1 year
Ok, the literature review is done: Beran (1997) showed that in LAN models, bootstrap consistency is equivalent to the estimator being regular; this should be a textbook result IMHO.
@johnleibniz
Fang Han
1 year
Peter Bickel just pointed out to me that Hodges's estimator is another example; the bootstrap then fails at mu=0 (Beran 1982). It then occurred to me that Andrea Rotnitzky also mentioned that bootstrap consistency is something about *regular* estimators. Would be nice to see a theory!
0
0
9
4
0
28
@johnleibniz
Fang Han
3 years
@bbstats @adad8m @octonion @_bakshay distance correlation is more efficient in testing independence, but is not distribution-free, is sensitive to outliers, and cannot capture perfect dependence (i.e., it is not 1 iff Y is a measurable function of X). Check my Bernoulli news article for more:
1
4
26
@johnleibniz
Fang Han
15 days
We now improve the rate to n^{-1/2}, the obviously optimal one.
@johnleibniz
Fang Han
1 year
To appear in 𝗜𝗘𝗘𝗘 𝗧𝗿𝗮𝗻𝘀𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻 𝗧𝗵𝗲𝗼𝗿𝘆. Do check Teh and Polyanskiy's follow-up work, which improved our rate from n^{-1/10} to n^{-1/4} via a smart use of Hadamard’s three-circle theorem.
1
0
12
2
2
25
@johnleibniz
Fang Han
6 months
I should have added a reference to this tweet: the super impressive Artstein-Ball-Barthe-Naor 2004 JAMS paper. Simply elegant and powerful.
Tweet media one
@johnleibniz
Fang Han
6 months
If you keep convolving the same density function with itself while deliberately keeping the mean and variance stable, it eventually converges to the maximum-entropy distribution under that mean/variance constraint. The CLT is confirming the second law of thermodynamics.
2
1
55
2
2
23
@johnleibniz
Fang Han
4 months
The real challenge in VC-type theorems, which is usually hidden in learning theory, is the measurability requirement on the studied objects. This is a long-forsaken bug that, once you notice it, is hard to get away from.
@mraginsky
Maxim Raginsky
4 months
@vpatryshev Some probabilists recognized the importance of that work early on, for example Richard Dudley played a big role in spreading awareness of the VC work in the West.
Tweet media one
Tweet media two
0
1
12
0
2
21
@johnleibniz
Fang Han
3 years
guess who is a big fan of Leibniz🤪
@UWStat
UW Statistics
3 years
We are proud to announce that two of our core faculty members recently got promoted to new appointments in UW Statistics. Congratulations to Fang Han and Alex Luedtke for their accomplishments! Read about them at: @johnleibniz @aluedtke
Tweet media one
0
3
31
3
0
20
@johnleibniz
Fang Han
3 years
New paper in Biometrika. Nothing particularly technically challenging but, hey, only 16 pages long yet with Sourav Chatterjee, Holger Dette, Jaroslav Hájek, Wassily Hoeffding, Jack Kiefer, Lucien Le Cam, Murray Rosenblatt, and many other great statisticians in😍
3
0
17
@johnleibniz
Fang Han
2 years
Probably worth mentioning: Chatterjee's correlation coefficient is both asymptotically normal and bootstrap INCONSISTENT. I do not recall seeing a second such example except for some artifacts (Bickel&Freedman, 1981?).
@johnleibniz
Fang Han
2 years
Sourav's comment on Lin and Han (2022, ):
Tweet media one
0
0
4
1
2
16
@johnleibniz
Fang Han
1 year
Here is the proof, executed by the magnificent @zzzxlin : , where discussion with @hagmnn and @molivarego on Twitter was acknowledged (and much appreciated!). Still seeking more examples that are root-n consistent, ASN, but bootstrap inconsistent🤔
@johnleibniz
Fang Han
2 years
Probably worth mentioning: Chatterjee's correlation coefficient is both asymptotically normal and bootstrap INCONSISTENT. I do not recall seeing a second such example except for some artifacts (Bickel&Freedman, 1981?).
1
2
16
3
3
16
@johnleibniz
Fang Han
7 months
For people who like matching, rank-based statistics, and nonparametric regression with generated covariates🤓 A collaborative work with the awesome Matias and the amazing Zhexiao @zzzxlin 😍
@mathSTb
arXiv math.ST Statistics Theory
7 months
Matias D. Cattaneo, Fang Han, Zhexiao Lin: On Rosenbaum's Rank-based Matching Estimator
0
2
8
0
1
16
@johnleibniz
Fang Han
2 years
Here comes the great master: 𝘔𝘦𝘢𝘴𝘶𝘳𝘦𝘴 𝘰𝘧 𝘪𝘯𝘥𝘦𝘱𝘦𝘯𝘥𝘦𝘯𝘤𝘦 𝘢𝘯𝘥 𝘧𝘶𝘯𝘤𝘵𝘪𝘰𝘯𝘢𝘭 𝘥𝘦𝘱𝘦𝘯𝘥𝘦𝘯𝘤𝘦 by Peter Bickel.
1
3
15
@johnleibniz
Fang Han
28 days
A simpler example is Kendall's tau, for which the U-statistic version is more efficient than the sample-mean version (centered using the population mean). However, the same observation does not apply to Pearson's correlation, for which, if you know the population mean, you should use it.
@pengding00
Peng Ding
30 days
IPW with the estimated propensity score is another example. The first-stage estimation reduces the asymptotic variance, which surprises many people. A recent paper is Also, Newey&McFadden chapter 6 is about "two-step estimation"
0
8
74
0
1
14
@johnleibniz
Fang Han
2 years
Handled by the fabulous AE @mraginsky and freshly out in the 𝗜𝗘𝗘𝗘 𝗧𝗿𝗮𝗻𝘀𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻 𝗧𝗵𝗲𝗼𝗿𝘆, this paper gives the information limit of general order spline regressions. Very likely the most technical paper I will ever be able to write.
@mathSTb
arXiv math.ST Statistics Theory
4 years
Yandi Shen, Qiyang Han, Fang Han : On a phase transition in general order spline regression
0
0
2
1
0
13
@johnleibniz
Fang Han
2 years
My take on statistical reasoning (now often rebranded as “causal XXX”): NO magic, just assumptions. Statistical reasoning is, extremely unsatisfactorily to a perfectionist, entirely a play of assumptions.
3
1
12
@johnleibniz
Fang Han
1 year
To appear in 𝗜𝗘𝗘𝗘 𝗧𝗿𝗮𝗻𝘀𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝗼𝗻 𝗜𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻 𝗧𝗵𝗲𝗼𝗿𝘆. Do check Teh and Polyanskiy's follow-up work, which improved our rate from n^{-1/10} to n^{-1/4} via a smart use of Hadamard’s three-circle theorem.
@johnleibniz
Fang Han
3 years
This is a fun project: we found that under the Gaussian-smoothed optimal transport distance, the estimation accuracy of the NPMLEs can be accelerated to a polynomial rate, in sharp contrast to the \log n-type rates under the unsmoothed Wasserstein distance.
0
0
6
1
0
12
@johnleibniz
Fang Han
2 years
@daniela_witten @Lizstuartdc @raziehnabi @nataliexdean @BetsyOgburn @analisereal @yayyyates I am still digesting two facts: (a) I am on the same flight as a COPSS winner and (b) she is sitting in economy rather than business class🧐
2
0
11
@johnleibniz
Fang Han
3 years
In case you did not know yet, in this issue there is one short article written by me (!), covering some of our recent efforts to extend rank correlations to higher dimensions. Please enjoy it ☺️
@BernoulliSoc
Bernoulli Society
3 years
Forthcoming issue of Bernoulli News is now available!
0
1
8
1
0
11
@johnleibniz
Fang Han
7 months
Just out!: An interview with Marc Hallin, accurately described in the abstract as "One of the most brilliant mathematical statisticians of our time". He is a role model admired by many of us.
0
0
10
@johnleibniz
Fang Han
1 year
Peter Bickel just pointed out to me that Hodges's estimator is another example; bootstrap fails then at mu=0 (Beran 1982). It then occurred to me that Andrea Rotnitzky also mentioned that bootstrap consistency is sth about *regular* estimators. Would be nice to see a theory!
@johnleibniz
Fang Han
1 year
Here is the proof, executed by the magnificent @zzzxlin : , where discussion with @hagmnn and @molivarego on Twitter was acknowledged (and much appreciated!). Still seeking more examples that are root-n consistent, ASN, but bootstrap inconsistent🤔
3
3
16
0
0
9
@johnleibniz
Fang Han
2 years
Here comes another great master: 𝗦𝗼𝘂𝗿𝗮𝘃 𝗖𝗵𝗮𝘁𝘁𝗲𝗿𝗷𝗲𝗲 surveys recent developments in measures of association, including his own rank correlation coefficient!
0
1
9
@johnleibniz
Fang Han
11 months
Although the bootstrap can fail, the m-out-of-n bootstrap never will. Check Dette and @k2daroll 's recent work on the validity of the m-out-of-n bootstrap for inferring Chatterjee's rank correlation. It is an elegant work.
@johnleibniz
Fang Han
1 year
Ok, the literature review is done: Beran (1997) showed that in LAN models, bootstrap consistency is equivalent to the estimator being regular; this should be a textbook result IMHO.
4
0
28
1
1
8
@johnleibniz
Fang Han
1 year
Both "On the power of Chatterjee’s rank correlation" and "On boosting the power of Chatterjee’s rank correlation" are now among 𝘁𝗵𝗲 𝗺𝗼𝘀𝘁 𝗿𝗲𝗮𝗱 𝗮𝗿𝘁𝗶𝗰𝗹𝗲𝘀 𝗶𝗻 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗸𝗮. Take a look!
@johnleibniz
Fang Han
2 years
"On the power of Chatterjee's rank correlation" is currently among 𝘁𝗵𝗲 𝗺𝗼𝘀𝘁 𝗿𝗲𝗮𝗱 𝗮𝗿𝘁𝗶𝗰𝗹𝗲𝘀 𝗶𝗻 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗸𝗮; please take a look if interested in measuring functional dependence, Le Cam's method, or rank/permutation statistics.
1
2
34
0
0
8
@johnleibniz
Fang Han
2 years
Its sister paper, "On boosting the power of Chatterjee's rank correlation", was also just 𝗮𝗰𝗰𝗲𝗽𝘁𝗲𝗱 𝘁𝗼 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗸𝗮. Key message: scaling up the # of nearest neighbors can boost Chatterjee's power to near parametric.
0
0
8
@johnleibniz
Fang Han
4 years
Yup, that is true; we are hiring
@daniela_witten
Daniela Witten
4 years
The rumors are true.... UW Statistics is really hiring 2 tenure-track assistant professors!! 👩‍🎓👨‍🎓📚 Apply before 10/20 for full consideration! 📅✍️💻 PLEASE RETWEET🔊🔊🔊
3
217
286
0
1
8
@johnleibniz
Fang Han
2 years
Now published in 𝗕𝗶𝗼𝗺𝗲𝘁𝗿𝗶𝗰𝘀, . Short message: 𝐭𝐚𝐤𝐢𝐧𝐠 𝐩𝐚𝐢𝐫𝐰𝐢𝐬𝐞 𝐝𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐜𝐞𝐬 creates symmetry (regardless of how peculiar X is, X-X' is always symmetric around zero, right?) and helps make statistical methods more robust.
@arxiv_org
arxiv
3 years
Robust Functional Principal Component Analysis via Functional Pairwise Spatial Signs.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
1
4
0
1
7
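A tiny simulation of that one-line message (illustrative code, not the paper's): X can be arbitrarily skewed, yet X - X' for an independent copy X' is symmetric around zero.

```python
import numpy as np

def skewness(z):
    # standardized third moment
    return np.mean((z - z.mean()) ** 3) / np.std(z) ** 3

rng = np.random.default_rng(0)
x = rng.exponential(size=100_000)        # heavily right-skewed
x_prime = rng.exponential(size=100_000)  # independent copy
d = x - x_prime                          # pairwise difference

print(f"skewness of X: {skewness(x):.2f}")       # ~2.0
print(f"skewness of X - X': {skewness(d):.2f}")  # ~0.0 (symmetric around 0)
```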
@johnleibniz
Fang Han
3 years
For the record, ML does not replace nonparametric regression either; it takes tons of misinformation to say so.
@lewbel
Arthur Lewbel
3 years
This is a commonly held belief - that ML replaces econometrics. It’s wrong. ML only replaces nonparametric regression (including its use for forecasting). ML alone doesn’t handle most econometric issues, including identification, causality, endogeneity, selection, and attrition.
6
61
579
1
1
7
@johnleibniz
Fang Han
1 month
One thing to note is that this bound no longer holds if we replace the population mean by the sample mean in the centering: the latter yields a t-statistic, which is of course heavy-tailed.
0
0
7
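A standard instance of this point, added here for concreteness: for i.i.d. Gaussian data, centering by the population mean gives Gaussian tails, while centering by the sample mean and studentizing gives a t-statistic with only polynomial tails.

```latex
X_1,\dots,X_n \overset{\text{iid}}{\sim} N(\mu,\sigma^2):\qquad
\frac{\sqrt{n}\,(\bar X_n-\mu)}{\sigma}\sim N(0,1)\ \text{(Gaussian tails)},
\qquad
\frac{\sqrt{n}\,(\bar X_n-\mu)}{s_n}\sim t_{n-1},
\quad \mathbb{P}\bigl(|t_{n-1}|>x\bigr)\asymp x^{-(n-1)}.
```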
@johnleibniz
Fang Han
3 years
Magnificent 🤩
@adad8m
adad8m🦞
3 years
Graphical comparison between the standard (linear) #correlation and Chatterjee's "rank correlation" recently introduced in #statistics #probability @johnleibniz @_bakshay
15
233
928
0
0
7
@johnleibniz
Fang Han
3 years
so proud of Chao, best collaborator ever 😊
@InstMathStat
IMS
3 years
Dr. Chao Gao, University of Chicago, receives the 2021 IMS Tweedie Award “for groundbreaking contributions to robust statistics, including establishing connections with generative adversarial networks, network analysis, and high-dimensional statistical inference.”
Tweet media one
1
19
113
0
0
7
@johnleibniz
Fang Han
3 years
@daniela_witten Compared to this: presenting at a mid-sized workshop when the 3-year-old walked in, danced around, and insisted on your help to pee. True story 🤦‍♂️
0
0
6
@johnleibniz
Fang Han
4 years
New in the 𝗔𝗻𝗻𝗮𝗹𝘀 𝗼𝗳 𝗦𝘁𝗮𝘁𝗶𝘀𝘁𝗶𝗰𝘀: shape constraint helps to estimate a piecewise constant (PC) signal. By-product: the first linear-time algorithm to compute ALL isotonic PC fits.
Tweet media one
2
0
6
@johnleibniz
Fang Han
3 years
This is a fun project: we found that under the Gaussian-smoothed optimal transport distance, the estimation accuracy of the NPMLEs can be accelerated to a polynomial rate, in sharp contrast to the \log n-type rates under the unsmoothed Wasserstein distance.
@mathSTb
arXiv math.ST Statistics Theory
3 years
Fang Han, Zhen Miao, Yandi Shen: Nonparametric mixture MLEs under Gaussian-smoothed optimal transport distance
1
1
2
0
0
6
@johnleibniz
Fang Han
3 years
I am 100% endorsing this tweet... looking at the weather forecast for the next 10 days, I think I am depressed. (kidding. Seattle is super awesome. Please apply; @UWStat has a teaching prof position!)
@daniela_witten
Daniela Witten
3 years
Conversation with my husband…. #Seattle 😆 😢 😭
Tweet media one
4
0
21
0
1
6
@johnleibniz
Fang Han
2 years
@sp_monte_carlo Ask the right question, which is usually just about some small-dim functional but absolutely not the whole distribution.
1
1
6
@johnleibniz
Fang Han
2 years
Sourav's comment on Lin and Han (2022, ):
Tweet media one
@johnleibniz
Fang Han
2 years
Here comes another great master: 𝗦𝗼𝘂𝗿𝗮𝘃 𝗖𝗵𝗮𝘁𝘁𝗲𝗿𝗷𝗲𝗲 surveys recent developments in measures of association, including his own rank correlation coefficient!
0
1
9
0
0
4
@johnleibniz
Fang Han
1 year
@kareem_carr Sample variance is a U-statistic of the kernel K(x,y)=(x-y)^2/2.
0
0
5
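A quick numerical check of this identity (illustrative snippet): the pairwise average with kernel K(x,y) = (x-y)^2/2 equals the unbiased sample variance exactly.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
x = rng.standard_normal(50)

# U-statistic: average the kernel K(x, y) = (x - y)^2 / 2 over all pairs i < j.
u = np.mean([(a - b) ** 2 / 2 for a, b in combinations(x, 2)])

print(np.allclose(u, x.var(ddof=1)))  # True: equals the unbiased sample variance
```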
@johnleibniz
Fang Han
2 years
@maxhfarrell @paulgp @jt_kerwin @causalinf @marcfbellemare @instrumenthull @pedrohcgs You are exactly right, Max! However, there was no rigorous proof of this claim until this recent paper (?). Any comment would be greatly appreciated!
0
3
5
@johnleibniz
Fang Han
1 year
What a list!!! @zzzxlin first Meta and now Jane Street, wow🤩
@yminsky
Yaron (Ron) Minsky
1 year
Exciting news! Jane Street has announced the winners of its first Graduate Research Fellowship: It was a great process, and we were all deeply impressed with the quality of the applicants.
3
3
94
1
1
5
@johnleibniz
Fang Han
3 years
@ProfRachelGaN simple; do them on weekdays😄
1
0
5
@johnleibniz
Fang Han
7 months
@AngYu_soci perhaps also worth noting: replacing the propensity score estimate by the *true* propensity score also leads to an inefficient ATE estimator; IMO this is one of the best results ever obtained in mathematical statistics.
0
0
4
@johnleibniz
Fang Han
2 years
@ben_golub Conditional probability (and conditional expectation) is a monster and better left to the second half (when students are expecting materials to be challenging).
0
0
4
@johnleibniz
Fang Han
4 years
Fresh in 𝐉𝐀𝐒𝐀, with Mathias Drton @TUM_Mathematics and my student Hongjian Shi @UW . Statistical inference built on optimal-transport-induced multivariate ranks. Distribution-freeness and test consistency are simultaneously achieved.
Tweet media one
1
0
4
@johnleibniz
Fang Han
3 years
@justapc @adad8m @_bakshay It is a consistent measure of dependence, namely, it is 0 iff X is independent of Y. However, Chatterjee’s correlation is not uniformly consistent, namely, problems may occur along a special curve.
0
0
4
@johnleibniz
Fang Han
4 years
@daniela_witten I thought it was model-based inference under the umbrella of (categorical) time series prediction, i.e., we observe X_t (t<=T) and accordingly build a CI for X_{T+1}. For a simple Markov chain model with three states (sun, rain, cloudy), 30% is the transition probability.
1
0
4
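A toy version of that reading (made-up numbers, purely for illustration): fit a 3-state chain to the past, and "30% chance of rain" is the transition probability out of today's state.

```python
import numpy as np

states = ["sun", "rain", "cloudy"]
# Hypothetical transition matrix: row = today's state, column = tomorrow's.
P = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.4, 0.3, 0.3]])

today = states.index("sun")
p_rain = P[today, states.index("rain")]
print(f"P(rain tomorrow | sun today) = {p_rain:.0%}")  # 30%
```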
@johnleibniz
Fang Han
3 years
@PreetumNakkiran @avshrikumar Check also this paper (of mine) for some of the most recent progress on Chatterjee’s rank correlation: . A literature has quickly built up over the past two years.
1
0
4
@johnleibniz
Fang Han
2 years
@wu_biostat (1) Permutation uses the structure of the null hypothesis and is thus more "efficient" (Le Cam, Hájek, Bickel, Hallin). (2) Permutation usually can give "uniformly valid" inference bounds (easy?). (3) Permutation is more robust (Romano).
1
0
4
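A minimal two-sample permutation test for reference (a generic sketch, not tied to the thread): under the null of exchangeability, the permutation p-value is finite-sample valid for any choice of test statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 30)
y = rng.normal(0.8, 1.0, 30)

z = np.concatenate([x, y])
obs = x.mean() - y.mean()

# Recompute the statistic under random relabelings of the pooled sample.
perm = np.empty(9999)
for b in range(9999):
    zb = rng.permutation(z)
    perm[b] = zb[:30].mean() - zb[30:].mean()

# The +1's make the p-value exactly valid under exchangeability.
pval = (1 + np.sum(np.abs(perm) >= np.abs(obs))) / (9999 + 1)
print(f"permutation p-value: {pval:.4f}")
```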
@johnleibniz
Fang Han
3 years
If you enjoyed reading our previous Biometrika paper (), you may also like to read , where we overcome the power loss of Chatterjee's original proposal by scaling up the # of nearest neighbours! Technically quite challenging this time.
@johnleibniz
Fang Han
3 years
New paper in Biometrika. Nothing particularly technically challenging but, hey, only 16 pages long yet with Sourav Chatterjee, Holger Dette, Jaroslav Hájek, Wassily Hoeffding, Jack Kiefer, Lucien Le Cam, Murray Rosenblatt, and many other great statisticians in😍
3
0
17
1
0
4
@johnleibniz
Fang Han
4 years
@daniela_witten Hmm... so can we interpret “percent chance of rain” as the conditional probability of “tomorrow is rainy” given everything obtainable until now?
0
0
4
@johnleibniz
Fang Han
2 years
@ben_golub In the beginning there were Pascal and Fermat, and de Moivre. They debated and debated. Then came Kolmogorov, who taught measure theory to all. Alas, we now know the conditional probability is just a Markov kernel definable over a Polish space, but does anyone still care?
0
1
4
@johnleibniz
Fang Han
1 year
@itsbradross @jiafengkevinc @jt_kerwin The exact argument was made in Abadie and Imbens' (2016, ECMA) paper, where they showed that the Donsker-type condition in the original matching outcome model can be substantially weakened by using the propensity score; at the cost of efficiency, though.
1
0
4
@johnleibniz
Fang Han
7 months
@AngYu_soci The high-level reason is simple: a simple parametric model discards too much information, while a nonparametric regression of the *right* order keeps a good balance between estimation accuracy and information retention.
1
0
4
@johnleibniz
Fang Han
9 months
It is rare to encounter such a degree of candor on the internet.
1
0
3
@johnleibniz
Fang Han
2 years
@paulgp @maxhfarrell @jt_kerwin @causalinf @marcfbellemare @instrumenthull @pedrohcgs Check the thread here: In short, bias-corrected matching is doing double machine learning.
@johnleibniz
Fang Han
3 years
: Our holiday gift to all fans of matching methods: in strong contrast to the common belief, Abadie and Imbens's bias-corrected nearest neighbor (NN) matching actually already gives a doubly robust and semiparametrically efficient estimator of the ATE.
Tweet media one
2
5
39
0
2
4
@johnleibniz
Fang Han
2 years
@hagmnn My conjecture is that any statistic that is not asymptotically linear (=\sum g(X_i)+small order terms) will have trouble. And it just occurs to me that Abadie&Imbens 2008 is another example (ASN but bootstrap inconsistent).
2
0
3
@johnleibniz
Fang Han
3 years
@PreetumNakkiran @avshrikumar It does not matter; as long as the model is “regular”, Chatterjee’s correlation is inefficient. Chatterjee’s heuristic does not work mathematically; cf. the Biometrika paper (of mine) mentioned above.
0
0
3
@johnleibniz
Fang Han
2 years
@jiafengkevinc OMG, are we going to teach measure theory on Twitter now???😍😍😍
0
0
3
@johnleibniz
Fang Han
3 years
@ThosVarley @adad8m @_bakshay Chatterjee's correlation comes from the law of total variance; a Chatterjee correlation of 0.5 means that, on average, half of Var(1(Y>t)) can be explained by conditioning on X.
Tweet media one
1
0
3
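For reference, the population functional behind this reading (a transcription of the functional in Chatterjee's 2021 paper):

```latex
\xi(X,Y)
= \frac{\int \operatorname{Var}\bigl(\mathbb{E}[\mathbf{1}\{Y\ge t\}\mid X]\bigr)\,d\mu(t)}
       {\int \operatorname{Var}\bigl(\mathbf{1}\{Y\ge t\}\bigr)\,d\mu(t)},
\qquad \mu = \text{law of } Y,
```

so by the law of total variance, xi = 0.5 means that, averaged over t drawn from the law of Y, half of Var(1{Y >= t}) is explained by conditioning on X.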
@johnleibniz
Fang Han
3 years
@vadimZip @jhubiostat Too shy to reply anything meaningful🤣
0
0
3
@johnleibniz
Fang Han
1 year
@LarsvanderLaan3 Yup, I was just talking about this with Jon Wellner several days ago and was set to prove it myself. But of course, Beran got it 25 years ahead of me : )
1
0
3
@johnleibniz
Fang Han
2 years
@UnibusPluram @kareem_carr (1) Yes, working models are useful, and some (e.g., normal regression that leads to OLS) are super robust; (2) parametric models can make sense from time to time; (3) nonparametric methods have their own problems, e.g., tuning parameters is a rabbit hole.
1
0
3
@johnleibniz
Fang Han
2 years
In this department, we use it for everything that involves a faculty vote; it works like a charm.
0
0
3
@johnleibniz
Fang Han
2 years
@pedrohcgs @maxhfarrell Check the thread I made here: In short, bias-corrected matching is doing double machine learning.
@johnleibniz
Fang Han
3 years
: Our holiday gift to all fans of matching methods: in strong contrast to the common belief, Abadie and Imbens's bias-corrected nearest neighbor (NN) matching actually already gives a doubly robust and semiparametrically efficient estimator of the ATE.
Tweet media one
2
5
39
0
0
3
@johnleibniz
Fang Han
4 months
Tweet media one
0
0
2
@johnleibniz
Fang Han
6 months
@_bakshay Can be traced back to Shannon and Lieb; check the 2nd paragraph of .
0
0
3
@johnleibniz
Fang Han
4 years
@analisereal @daniela_witten @tdietterich @lucylgao Define a_{k,1}, ..., a_{k,k} to be the k "optimal centre points" such that M_k := E{\min_j \| X - a_{k,j} \|^2} is minimized. My interpretation of the null hypothesis is H_0: M_k = M_{k-1}.
0
0
3
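A quick sketch of the quantities in that reading (illustrative, assuming scikit-learn is available): estimate M_k by the k-means inertia divided by n; under H_0: M_k = M_{k-1}, the k-th centre buys nothing.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two well-separated Gaussian clusters in the plane.
X = np.vstack([rng.normal(-3, 1, (500, 2)), rng.normal(3, 1, (500, 2))])

for k in (1, 2, 3):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    m_k = km.inertia_ / len(X)  # sample version of M_k = E[min_j ||X - a_{k,j}||^2]
    print(k, round(m_k, 3))  # large drop from k=1 to 2; tiny from 2 to 3 (H_0-like)
```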
@johnleibniz
Fang Han
1 year
@zzzxlin , still a first-year PhD student, but already a finalist for Meta PhD fellowship🤩
@neksec
Nektarios Leontiadis
1 year
Meta Research PhD Fellowship award winners for 2023 have been announced! Congratulations to all these bright minds!
0
0
9
1
0
3
@johnleibniz
Fang Han
4 years
@daniela_witten Oh yes, there is actually a widespread campus legend saying that Padelford was designed to be riot-proof; check here
1
0
3
@johnleibniz
Fang Han
2 years
1
0
3
@johnleibniz
Fang Han
7 months
@Apoorva__Lal @pengding00 @zzzxlin The paper suggests a constant/rate choice for M based on the minimax theory and simulations. No idea about how to relate it to OT, though😃
0
0
2
@johnleibniz
Fang Han
2 years
@daniela_witten a landmark paper
0
0
2
@johnleibniz
Fang Han
8 months
0
0
2
@johnleibniz
Fang Han
2 years
This is the tweet of the month.
@NikhilBasavappa
Nikhil Basavappa
2 years
i don’t know how to explain this but this is IV regression
15
176
2K
0
0
2
@johnleibniz
Fang Han
2 months
@ben_golub He is still an undergraduate, Ben!
0
0
1
@johnleibniz
Fang Han
3 years
(b) The term K_M(i)/M above, introduced in Abadie and Imbens (2006), constitutes a consistent and efficient density ratio estimator. Even better, it is one-step, of near-linear computational complexity, and statistically optimal, the first estimator to achieve all three simultaneously.
Tweet media one
1
0
2
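A toy check of this claim under one reading of it (illustrative code with one covariate, not the paper's): for a control unit i, K_M(i) counts how often i is among the M nearest controls of a treated unit, and (n0/n1) * K_M(i) / M tracks the density ratio f1(X_i)/f0(X_i).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n0, n1, M = 4000, 4000, 50
x0 = rng.normal(0.0, 1.0, n0)  # controls, density f0 = N(0, 1)
x1 = rng.normal(0.5, 1.0, n1)  # treated,  density f1 = N(0.5, 1)

# K_M(i): how many treated units use control i among their M nearest controls.
K = np.zeros(n0)
for xj in x1:
    K[np.argsort(np.abs(x0 - xj))[:M]] += 1

i = np.argmin(np.abs(x0))  # pick the control closest to x = 0
est = (n0 / n1) * K[i] / M
truth = norm.pdf(x0[i], loc=0.5) / norm.pdf(x0[i], loc=0.0)
print(f"K_M(i)/M-based estimate: {est:.2f}, true ratio: {truth:.2f}")
```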
@johnleibniz
Fang Han
1 year
It is an amazing gathering! Thanks for putting out such a great program, Fabrizio 🤩
@fbdurante
Fabrizio Durante
1 year
The 40th Linz Seminar on " #Copulas - Theory and Applications" ended today. Thanks to all participants and to invited speakers for brilliant talks and discussions. Many problems have been solved. Many others are still open!
1
0
5
0
0
2
@johnleibniz
Fang Han
2 years
@Apoorva__Lal check Section 3 in
0
0
2
@johnleibniz
Fang Han
1 month
@ChengGao12 I hope to post the first 8 chapters online in a month🤪
1
0
2