New preprint: Do Finetti w/ @zcccucla, @Carthica, @fhuszar, @bschoelkopf and me.
Do Finetti provides a do-calculus foundation for exchangeable data following the independent causal mechanisms (ICM) principle + a causal Pólya urn model to show how…
Want to do a PhD in ML? Consider applying to the fully funded Cambridge–Tübingen PhD Fellowship.
From personal experience, supervised by @fhuszar and @bschoelkopf, this is the best PhD out there!
Deadline: noon, Dec 5th.
Generalization is more than distribution shifts.
Work w/ @WildbergerJonas, @bschoelkopf.
We proposed out-of-variable (OOV) generalization to study knowledge transfer with partial observability. We showed learning from the residual uncovers unobserved…
Will be talking about our recent work Causal de Finetti () in London on November 4th, 12pm (midday) UK time at @uclcsml.
Feel free to reach out if you are around for a chat about Causal ML!
Details can be found here:
I will be attending #NeurIPS2023 and presenting causal de Finetti @ 14 Dec 10:45 - 12:45. DM if you want to chat about causality, AI for science and RAG!
We prove that properties of the input data affect models' downstream properties, e.g. compositionality and disentanglement.
Preprint by @syguoML w/ Viktor Tóth, @bschoelkopf & me.
New causal de Finetti theorems provide a probabilistic foundation for the independent causal mechanisms (ICM) principle + a principled way to infer invariant causal structure from exchangeable data
🧵👇
Causal de Finetti has made it to #NeurIPS2023! Shout out to my amazing supervisors and collaborators. Very excited for the field of causal exchangeability.
Meet you all in New Orleans!
Beyond excited to receive this award - thank you to the people who believed in me and recognised my work! Thank you to the organisers for creating this award and for your efforts in promoting gender equality in the #research community! Particularly, my advisors @bschoelkopf and @fhuszar.
Congratulations to Siyuan Guo @syguoML from the Empirical Inference department for winning our Institute's Outstanding Female Doctoral Student Prize! 🏆 It honors one exceptional Ph.D. student each year for her scientific achievements & contributions to the #research community 👏
Thank you @Dagophile for dropping everything and giving us a summary of the @syguoML et al. paper. Following your example, I'm going to drop everything and read your summary.
Thank you @AleksanderMolak for the nice intro! Indeed, we view data heterogeneity, especially the regime of exchangeable but not i.i.d. data, as an opportunity for causality.
Our work shows exchangeable data has extra conditional independence structures that i.i.d. data lacks…
Started to think more about the topic of causal digital twins by organizing this small workshop at the ELLIS unconference with Andrei, @lawrennd and @bschoelkopf. What is causal? And why digital twins? Looking forward to future work coming from this community :)
It was the first #unconference for #ELLISforEurope: Scientists from our network met in Spain to discuss the latest cutting-edge #ML research during a fully participant-driven event, which resulted in many ideas for future collaborations! #AI
In Vienna #ICLR2024 to present OOV Generalization for discriminative models.
🗓️ Thu 9 May 4:30 p.m.
📍 Hall B #190
Looking forward to chatting about Causality & AI for Science!! Drop me a message :D
I am grateful for the opportunity to present our work on Identifiable Exchangeable Mechanisms at the @Mila_Quebec tea talks this Friday at 10:30 EDT.
Joint work with @syguoML, @fhuszar, @wielandbr, @bschoelkopf.
Preprint: (v2 is up tomorrow)
I'm at #NeurIPS2022! Looking forward to in-person chats.
Will be presenting new work (Pragmatic Fairness) at the AFCP workshop on Saturday Dec 3rd.
Till then, which #CausalML and #TrustworthyML works should I check out? Looking for recs!
This Wed. 10:30 ET (22nd Nov.) at the CDG we are honored to have @zcccucla discussing their @NeurIPSConf paper: Causal Inference w/ Non-IID Data using Linear Graphical Models ()
All info & links @ 🌿
cc: @yudapearl @Carthica
DAY 0 at NeurIPS: hotel cancels my reservation at the last minute after I flew across half the globe. Now searching for somewhere to stay... true NeurIPS experience +1
Side story: This paper was first submitted to PNAS and got rejected. The total span between preprint and publication was a year. Thought it would be interesting to share the long (hidden) story behind an accepted paper in the midst of the NeurIPS happiness :p #phdlife
@zcccucla @Carthica @fhuszar @bschoelkopf
[3/n] We proved a generalized truncated factorization showing that causal effects are identifiable in ICM exchangeable processes. The traditional truncated factorization is a special case here, just as i.i.d. is a special case of exchangeable.
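For reference, a sketch of the classical (i.i.d.) truncated factorization that the generalized result reduces to; notation follows Pearl's standard formulation and is assumed here rather than taken from the paper:

```latex
% Classical truncated factorization: for a Markovian model over
% V = {V_1, ..., V_d} and an intervention do(X = x) with X \subseteq V,
P\left(v \mid \mathrm{do}(X = x)\right) \;=\;
  \prod_{\{i \,:\, V_i \notin X\}} P\!\left(v_i \mid \mathrm{pa}_i\right)
% evaluated at values v consistent with X = x; the factors of the
% intervened variables are removed ("truncated") from the
% observational factorization.
```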
Very excited to see Goutham Rajendran, @rpatrik96, Pradeep Ravikumar, @wielandbr connecting causal de Finetti to dynamical systems.
We believe humans discover causal relationships through exploration in time, and we need to understand how.
There is a bonus insight!
Hidden Markov Models (LTI systems are HMMs) are used to represent causal relationships. We show that learning HMMs follows the blueprint of the Causal de Finetti (CdF) theorem by @syguoML, Viktor Tóth, @fhuszar, @bschoelkopf.
[5/n] We introduce the causal Pólya urn model:
Suppose you are an observer in front of a black box. You observe Xn, Yn at time step n. If Xn = 1, you put a red ball in the left-hand side of the black box, and else a green ball. Similarly, you compute a value Zn := (1-Xn)*Yn + …
[6/n] This game-like urn model says (Xn, Yn) satisfies the causal de Finetti conditions. Further, in the hidden (unobserved) world, our observations are indeed driven by two independent mechanisms (the left and right urns). And we, as observers, can deduce such hidden mechanisms through…
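As an illustration, here is a minimal simulation sketch of a two-urn Pólya scheme. This is a hypothetical stand-in: the exact Zn update rule is truncated in the tweet above, so the code only shows the generic mechanism of two independently reinforced urns producing exchangeable but non-i.i.d. draws.

```python
import random

def causal_polya_draws(n_steps, seed=0):
    """Toy two-urn Polya scheme (illustrative; not the paper's exact rule).

    The left urn generates the 'cause' X_n, the right urn generates Y_n.
    Each urn is sampled and reinforced independently, so the two
    mechanisms (urn compositions) stay independent of each other, while
    the draws from each urn are exchangeable but not i.i.d. because the
    reinforcement makes past draws influence future draw probabilities.
    """
    rng = random.Random(seed)
    left = [1, 1]   # [red, green] ball counts -> drives X_n
    right = [1, 1]  # [red, green] ball counts -> drives Y_n
    draws = []
    for _ in range(n_steps):
        x = int(rng.random() < left[0] / sum(left))
        left[0 if x == 1 else 1] += 1   # reinforce the drawn colour
        y = int(rng.random() < right[0] / sum(right))
        right[0 if y == 1 else 1] += 1  # reinforced independently of x
        draws.append((x, y))
    return draws

# Under a fixed seed the sequence is reproducible; the marginal
# frequencies drift toward a random limit, the classic Polya behaviour.
sample = causal_polya_draws(20, seed=7)
```

The design point is only that independent reinforcement of the two urns mirrors the ICM idea of two independent mechanisms generating the observed pairs.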
[7/n] Observing more Xm = 1, Ym = 0 (m < n) means it is more likely to observe Xn = 1, Yn = 0. The conditional interventional distribution manifests as: when an intervention do(Xn = 0) is performed, one can deduce that Yn = 1 is more likely.
[8/n] Generalizing causality to an exchangeable non-i.i.d. setting does not mean less ability to perform graphical identification and effect estimation. In fact, with an unknown graph, ICM generative processes allow one to identify the graph and causal effects simultaneously (cf.…
[9/9] We are excited to explore the complexities of non-i.i.d. data. And there is much more to do: from intervention to counterfactual, from Markovian to semi-Markovian. We see a whole world of possibilities for the general causal community in exchangeable but not i.i.d. data. I…
@zcccucla @Carthica @fhuszar @bschoelkopf
[2/n] Do-calculus is based on structural causal models. However, SCMs fail to characterize ICM exchangeable settings. We need a new definition of what an intervention means.
[4/n] Causal effects in ICM exchangeable processes have non-trivial conditional interventional distributions. This property does not exist in i.i.d. data. See Block B in Figure 1; we demonstrate this in the causal Pólya urn model below.
@WildbergerJonas @bschoelkopf
Take a simple example: the Markov factorization is composed of causal Markov kernels, yet one does not need to observe and measure all variables of interest jointly to recover the joint distribution. Rather, observing only the variables in each Markov kernel per environment is sufficient to…
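In symbols, the Markov factorization referenced above (standard DAG notation, assumed here rather than copied from the thread): the joint over variables with parent sets pa_i splits into causal Markov kernels, so measuring each kernel, possibly in a different environment, suffices to rebuild the joint:

```latex
p(x_1, \dots, x_d) \;=\; \prod_{i=1}^{d} p\!\left(x_i \mid \mathrm{pa}_i\right)
% Each factor p(x_i | pa_i) is a causal Markov kernel; estimating every
% kernel separately (e.g., one kernel per environment) recovers the full
% joint without ever observing all d variables at once.
```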
Day 1 @ NeurIPS 2023: @lindensli's talk is fantastic, do check out the slides once available! It explains with such clarity the tools used daily for LLM inference speed-ups and their underlying principles, e.g., tensor parallelism, vLLM.
Highlights:
1. LLM inference on GPU…
I'll be giving a talk tomorrow at NeurIPS about the fundamentals of LLM inference. The talk will start by developing a first-principles, systems approach to reasoning about the inference workload and conclude with a survey of the current state of the art. Some concepts covered:
@laurence_ai @fhuszar @bschoelkopf
Great insight! This is exactly how we looked at it. Our theorem formally shows that if the data are exchangeable and satisfy additional conditional independences, then we can always represent our data as in the figure (theta and psi are independent).
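For the bivariate cause-effect case X → Y, one way to write the representation alluded to, with θ and ψ independent. This is a de Finetti-style sketch, assuming the paper's exact measure-theoretic statement differs in details:

```latex
p(x_{1:N}, y_{1:N}) \;=\;
  \int\!\!\int \prod_{n=1}^{N} p(x_n \mid \theta)\, p(y_n \mid x_n, \psi)\;
  p(\theta)\, p(\psi)\; d\theta\, d\psi
% theta governs the cause mechanism, psi the effect mechanism; their
% independence is the ICM principle expressed at the level of the
% de Finetti parameters.
```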
We study a seemingly impossible toy example in this work, where Y is the effect of three independent causes under an ANM. Our goal is to learn more about the predictive function in the target domain using the source domain and marginal information about the target covariates only.
The key insight is to learn from the residual distribution. We show the moments of the residual distribution are composed of the moments of the target covariates and the partial derivatives with respect to the target covariate (hidden to us in the source domain).
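A minimal worked special case (a linear ANM of my own, for illustration rather than the paper's general result): with Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + ε and only (X_1, X_2, Y) observed in the source domain, the residual's moments expose the hidden covariate and coefficient:

```latex
R \;=\; Y - \mathbb{E}[Y \mid X_1, X_2]
  \;=\; a_3\,\big(X_3 - \mathbb{E}[X_3]\big) + \varepsilon,
\qquad
\mathrm{Var}(R) \;=\; a_3^{2}\, \mathrm{Var}(X_3) + \sigma_\varepsilon^{2}
% Var(R) combines a moment of the unobserved covariate X_3 with the
% partial derivative a_3 = \partial Y / \partial X_3, matching the claim
% that residual moments encode both.
```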
@analisereal Thanks for your question! This work studies causal effect estimation under different data-generating processes. In general, the causal effect is identifiable in Markovian models for i.i.d. data, but here Fig. 3 aims to show that the causal effect has extra properties in exchangeable but not i.i.d. data.