Bayesian Causal Inference in the Presence of Structural Uncertainty
- Status: In progress
- Student: Christian Toth
- Mentors
- Research Areas
Few topics in science and philosophy have been as controversial as the nature of causality. Interestingly, the discussion becomes relatively benign, from a philosophical perspective, as soon as one agrees on a well-defined mathematical model of causality, such as Pearl’s structural causal model (SCM). Assuming that the data comes from some model within a considered class of SCMs, causal questions reduce, in principle, to epistemic questions, i.e., questions about what and how much is known about the model.
In principle, knowing “just” the causal structure of the true model, e.g., provided in the form of a directed acyclic graph (DAG), already permits the identification and estimation of causal quantities. When such structural knowledge is lacking, one typically infers a single causal structure from data, which is then used to estimate the desired causal quantities (e.g., an average treatment effect). However, committing to a single model neglects the epistemic model uncertainty that stems from the limited amount and/or quality of the available data. This is problematic, as a mismatch between the inferred and the true causal structure may severely affect the accuracy of the subsequent causal estimates. Reflecting the epistemic uncertainty about the underlying causal structure in downstream causal estimates is thus of central importance.
These considerations invite a Bayesian treatment, i.e., specifying a prior distribution over entire SCMs (including causal structure, mechanisms and exogenous variables) and a likelihood model to infer the posterior over SCMs given the collected data. Bayesian causal inference (BCI) then naturally incorporates epistemic uncertainty about the true causal model into downstream causal estimates via Bayesian marginalisation (posterior averaging) over all causal models: the causal estimate of each model is weighted by that model’s posterior probability. Although BCI is conceptually appealing and principled, in practice it becomes computationally intractable even for small problem instances, because the number of possible causal structures to marginalise over is prohibitively large.
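
The posterior averaging described above can be written out as a single expression. The following is only a sketch under assumed notation that does not appear in the text: $\mathcal{D}$ denotes the collected data, $G$ a candidate causal DAG, $\theta$ the parameters of the causal mechanisms (and exogenous noise) given $G$, and $p(Y \mid \mathrm{do}(X = x), \cdot)$ the interventional distribution of interest.

```latex
p\bigl(Y \mid \mathrm{do}(X = x), \mathcal{D}\bigr)
  = \sum_{G} p(G \mid \mathcal{D})
    \int p\bigl(Y \mid \mathrm{do}(X = x), G, \theta\bigr)\,
         p(\theta \mid G, \mathcal{D})\, \mathrm{d}\theta
```

The outer sum runs over all candidate causal structures, which is the source of the intractability, and each inner integral averages over the mechanisms compatible with one structure.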
In this thesis, we explore the feasibility of Bayesian causal inference in non-linear models given sparse data. We aim to devise frameworks and tractable algorithms applicable to small- and medium-sized systems of up to a few hundred variables.
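
To give a sense of the size of the structure space that exact marginalisation would have to cover, the short sketch below counts the labelled DAGs on n variables using Robinson's recurrence. It is an illustration only, not part of the thesis method, and the function name `count_dags` is hypothetical.

```python
from math import comb


def count_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes, via Robinson's recurrence:

    a(0) = 1
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k * (m - k)) * a(m - k)
    """
    counts = [1]  # a(0) = 1: the empty graph
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]


if __name__ == "__main__":
    # Print the number of candidate structures for a few system sizes.
    for n in (2, 3, 4, 5, 10):
        print(n, count_dags(n))
```

Already for five variables there are 29281 candidate DAGs, and the count grows super-exponentially with the number of variables, so enumerating every structure is hopeless long before a system reaches a few hundred variables.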
