
ACKNOWLEDGEMENTS
The research was sponsored by the Army Research Office and was accomplished under Grant Number W911NF-22-1-0035. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.