
REFERENCES
Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., et al. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258.
Chalkidis, I. et al. (2021). LexGLUE: A benchmark dataset for legal language understanding in English. EMNLP.
Doshi-Velez, F. and Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
Galgani, F. et al. (2021). Legal text analytics: Opportunities, challenges and future directions. Artificial Intelligence and Law, 29(2):219–250.
Hirvonen-Ere, S. (2023). Contract lifecycle management as a catalyst for digitalization in the European Union. In Digital Development of the European Union, pages 85–99. Springer.
Huang, W., Xia, F., Xiao, T., Chan, H., Liang, J., Florence, P., Zeng, A., Tompson, J., Mordatch, I., Chebotar, Y., et al. (2022). Inner monologue: Embodied reasoning through planning with language models. arXiv preprint arXiv:2207.05608.
Ji, B., Liu, H., Zhu, J., Yang, Y., Tang, J., et al. (2023). A survey of post-hoc explanation methods for deep neural networks. IEEE Transactions on Neural Networks and Learning Systems.
Liang, Y. et al. (2023). Symbolic knowledge distillation: From general language models to commonsense models. arXiv preprint arXiv:2304.09828.
Long, Y., Peng, B., Lin, X., Liu, X., and Gao, J. (2024). Evaluating tree-of-thought prompting for multi-hop question answering. arXiv preprint arXiv:2402.01816.
Malik, S. et al. (2023). XAI in legal AI: Survey and challenges. In Proceedings of ICAIL.
Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38.
Nye, M., Andreassen, A. J., Gur-Ari, G., Michalewski, H., Austin, J., Bieber, D., Dohan, D., Lewkowycz, A., Bosma, M., Luan, D., et al. (2021). Show your work: Scratchpads for intermediate computation with language models.
OpenAI (2023a). ChatGPT fine-tune description. https://help.openai.com/en/articles/6783457-what-is-chatgpt. Accessed: 2024-03-01.
OpenAI (2023b). ChatGPT prompt engineering. https://platform.openai.com/docs/guides/prompt-engineering. Accessed: 2024-04-01.
Pascanu, R., Mikolov, T., and Bengio, Y. (2013). On the difficulty of training recurrent neural networks. In International Conference on Machine Learning, pages 1310–1318. PMLR.
Rajani, N. F., McCann, B., Xiong, C., and Socher, R. (2019). Explain yourself! Leveraging language models for commonsense reasoning. arXiv preprint arXiv:1906.02361.
Seabra, A., Cavalcante, C., Nepomuceno, J., Lago, L., Ruberg, N., and Lifschitz, S. (2024). Contrato360 2.0: A document and database-driven question-answer system using large language models and agents. In Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KDIR).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Vig, J. and Belinkov, Y. (2019). Analyzing the structure of attention in a transformer language model. arXiv preprint arXiv:1906.04284.
Wang, M., Wang, M., Xu, X., Yang, L., Cai, D., and Yin, M. (2023). Unleashing ChatGPT's power: A case study on optimizing information retrieval in flipped classrooms via prompt engineering. IEEE Transactions on Learning Technologies.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q. V., and Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems.
White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J., and Schmidt, D. C. (2023). A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv preprint arXiv:2302.11382.
Wiegreffe, S., Marasović, A., Gehrmann, S., and Smith, N. A. (2022). Reframing human "explanations": A contrastive look at model rationales. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, pages 4680–4696.
Xiao, C., Hu, X., Liu, Z., Tu, C., and Sun, M. (2021). Lawformer: A pre-trained language model for Chinese legal long documents. AI Open, 2:79–84.
Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T., Cao, Y., and Narasimhan, K. (2023a). Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36:11809–11822.
Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y. (2023b). ReAct: Synergizing reasoning and acting in language models. In International Conference on Learning Representations (ICLR).
Zhou, D., Schuurmans, D., Bai, Y., Wang, X., Zhang, T., Bousquet, O., and Chi, E. H. (2023). Least-to-most prompting enables complex reasoning in large language models. In International Conference on Learning Representations.
Zhu, Z. et al. (2023). Cost: Chain of structured thought for zero-shot reasoning. arXiv preprint arXiv:2305.12461.