Graduate Exam Abstract
Savini Samarasinghe
Ph.D. Final
February 13, 2020, 3:30 pm - 5:30 pm
ECE Conference Room (C101B)
Causal Inference Using Observational Data - Case Studies in Climate Science
Abstract: We are in an era where atmospheric science is data-rich in both observations (e.g., satellite/sensor data) and model output. Our goal with causal discovery is to apply suitable data science approaches to climate data to identify potential cause-effect relationships between climate variables without using any experimental controls (interventions). The traditional, experimental approach to attributing cause-effect relationships in atmospheric science is based on targeted modeling studies. These modeling studies can be challenging to set up, their reliability depends on the dynamical model’s ability to simulate the processes of interest, and they often involve studying cause and effect in isolation from the system's feedbacks. Resorting to data-driven causal inference methods in such situations can give climate scientists useful additional insights into climate dynamics.
There are many different causal inference frameworks. Graphical-Granger methods, constraint-based structure-learning methods, Gaussian graphical methods, and information-theoretic approaches are only a few examples. With a plethora of alternative procedures and frameworks for inferring causal relationships, an investigator interested in understanding causality in a spatiotemporal setting is likely to come across many questions and challenges. 'What framework is most suitable for the research question? How do you set up the inference problem in a physically meaningful way? How do you select the model variables? How should you preprocess variables to extract the strongest causal signals in the presence of noise?' are a few such questions. An additional limitation of most inference frameworks is that they assume that no hidden common causes are acting upon the variables included in the model. This assumption is often violated in climate applications, causing the standard methods to produce spurious or incorrect results. Furthermore, it can be challenging to identify suitable inference methods that scale to high-dimensional problems, especially when the data are non-Gaussian and the relationships between the variables are non-linear.
This research analyzes a few causal inference questions in climate science, case by case. It documents the scientific thought process of setting up each problem, the challenges faced, and how those challenges are dealt with, and in the process also generates new scientific findings of interest to the climate science community. The main objective of this research is to make causal inference methods more accessible to a researcher or climate scientist who is new to spatiotemporal causality. This can also help promote more modern causal inference methods in the climate science community, moving beyond traditional approaches such as correlation analysis and Granger analysis based on bivariate linear lagged regression. The scope of this research is limited to methods based on probabilistic graphical models, specifically constraint-based structure-learning methods and graphical Granger methods. As case studies, we look into (1) the causal relationships between the Arctic temperature and mid-latitude circulations, (2) the relationships between the Madden-Julian Oscillation (MJO) and the North Atlantic Oscillation (NAO), and (3) the causal interactions between atmospheric disturbances of different spatial scales (e.g., planetary vs. synoptic). These case studies, covering a wide range of questions and challenges, are meant to act as a useful starting point for a researcher interested in tackling more general causal inference problems in climate science.
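For readers new to the terminology, the bivariate linear lagged regression behind traditional Granger analysis can be sketched as follows. This is a minimal illustration on synthetic data, not the method or code used in the case studies. The function names (`lagged_matrix`, `granger_f_stat`), the lag order, and the simulated series are all assumptions made for this example.

```python
import numpy as np


def lagged_matrix(series, max_lag):
    """Columns are the series delayed by 1..max_lag steps, aligned with series[max_lag:]."""
    n = len(series)
    return np.column_stack(
        [series[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    )


def granger_f_stat(x, y, max_lag=2):
    """F statistic for the hypothesis that past x helps predict y beyond past y alone."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    intercept = np.ones((len(target), 1))

    # Restricted model: y regressed on its own past only.
    restricted = np.hstack([intercept, lagged_matrix(y, max_lag)])
    # Full model: y regressed on its own past plus the past of x.
    full = np.hstack([restricted, lagged_matrix(x, max_lag)])

    def residual_sum_of_squares(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_restricted = residual_sum_of_squares(restricted)
    rss_full = residual_sum_of_squares(full)
    dof = len(target) - full.shape[1]
    return ((rss_restricted - rss_full) / max_lag) / (rss_full / dof)


# Synthetic example (assumed for illustration): x drives y with a one-step delay.
rng = np.random.default_rng(0)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

f_xy = granger_f_stat(x, y)  # tests x -> y
f_yx = granger_f_stat(y, x)  # tests y -> x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

A large F statistic in one direction only suggests that past values of one series improve prediction of the other. Because the test uses just two variables, a hidden common cause acting on both can produce a similar signal, which is one of the limitations noted above.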
Adviser: Dr. Imme Ebert-Uphoff
Non-ECE Member: Dr. Michael Kirby, MATH
Member 3: Dr. Edwin Chong, ECE
Additional Members: Dr. Chuck Anderson, CS
Student Publications:
Peer-Reviewed Journals/Articles
(1) Samarasinghe, S.M., Deng, Y. and Ebert-Uphoff, I. (2019). A Causality-Based View of the Interaction between Synoptic- and Planetary-Scale Atmospheric Disturbances. Journal of the Atmospheric Sciences.
(2) Barnes, E.A., Samarasinghe, S.M., Ebert-Uphoff, I. and Furtado, J.C. (2019). Tropospheric and Stratospheric Causal Pathways between the MJO and NAO. Journal of Geophysical Research: Atmospheres, 124, 9356-9371.
(3) Samarasinghe, S.M., McGraw, M.C., Barnes, E.A. and Ebert-Uphoff, I. (2018). A Study of Links between the Arctic and the Midlatitude Jet Stream Using Granger and Pearl Causality. Environmetrics.
(4) Ebert-Uphoff, I., Samarasinghe, S.M. and Barnes, E.A. (2019). Thoughtfully Using Artificial Intelligence in Earth Science. EOS Earth and Space Science News, 100.
(5) Baker, A.H., Hammerling, D.M., Mickelson, S.A., Xu, H., Stolpe, M.B., Naveau, P., Sanderson, B., Ebert-Uphoff, I., Samarasinghe, S., De Simone, F., Carbone, F., Gencarelli, C.N., Dennis, J.M., Kay, J.E. and Lindstrom, P. (2016). Evaluating Lossy Data Compression on Climate Simulation Data within a Large Ensemble. Geoscientific Model Development, 9, 4381-4403.
Conference and Workshop Proceedings
(1) Samarasinghe, S., Barnes, E.A. and Ebert-Uphoff, I. (2018). Causal Discovery in the Presence of Latent Variables for Climate Science. Proceedings of the 8th International Workshop on Climate Informatics, NCAR Technical Note NCAR/TN-550+PROC.
(2) Ramsey, J., Zhang, K., Glymour, M., Sanchez-Romero, R., Huang, B., Ebert-Uphoff, I., Samarasinghe, S., Barnes, E. and Glymour, C. (2018). TETRAD - A Toolbox for Causal Discovery. Proceedings of the 8th International Workshop on Climate Informatics, NCAR Technical Note NCAR/TN-550+PROC.
(3) Samarasinghe, S.M., McGraw, M.C., Barnes, E.A. and Ebert-Uphoff, I. (2017). A Study of Causal Links between the Arctic and the Midlatitude Jet-Streams. Proceedings of the 7th International Workshop on Climate Informatics, NCAR Technical Note NCAR/TN-536+PROC.
(4) Samarasinghe, S., Deng, Y. and Ebert-Uphoff, I. (2017). Structure Learning in Spectral Space with Applications in Climate Science. Workshop on Mining Big Data in Climate and Environment, SIAM International Conference on Data Mining, Houston, TX.
Program of Study: