Simultaneous Missing Value Imputation and Causal Discovery with Groups

Pablo Morales-Alvarez, Angus Lamb, Simon Woodhead, Simon Peyton Jones, Miltiadis Allamanis, Cheng Zhang. NeurIPS 2022

TLDR: Causal discovery and missing value imputation via GNNs.

Missing values constitute an important challenge in real-world machine learning for both prediction and causal discovery tasks. However, existing imputation methods are agnostic to causality, while only few methods in traditional causal discovery can handle missing data in an efficient way. In this work we propose VICause, a novel approach to simultaneously tackle missing value imputation and causal discovery efficiently with deep learning. Particularly, we propose a generative model with a structured latent space and a graph neural network-based architecture, scaling to large number of variables. Moreover, our method can discover relationships between groups of variables which is useful in many real-world applications. VICause shows improved performance compared to popular and recent approaches in both missing value imputation and causal discovery.