
Gao, L., Ma, X., Lin, J., and Callan, J. (2023). Precise zero-shot dense retrieval without relevance labels. In Rogers, A., Boyd-Graber, J., and Okazaki, N., editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1762–1777, Toronto, Canada. Association for Computational Linguistics.
Izacard, G., Caron, M., Hosseini, L., Riedel, S., Bojanowski, P., Joulin, A., and Grave, E. (2022). Unsupervised dense information retrieval with contrastive learning. Transactions on Machine Learning Research.
Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., Chen, D., and Yih, W.-t. (2020). Dense passage retrieval for open-domain question answering. In Webber, B., Cohn, T., He, Y., and Liu, Y., editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6769–6781, Online. Association for Computational Linguistics.
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-t., Rocktäschel, T., Riedel, S., and Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems, volume 33, pages 9459–9474. Curran Associates, Inc.
Li, Z., Zhang, X., Zhang, Y., Long, D., Xie, P., and Zhang, M. (2023). Towards general text embeddings with multi-stage contrastive learning.
Mackie, I., Chatterjee, S., and Dalton, J. (2023). Generative relevance feedback with large language models. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '23, page 2026–2031, New York, NY, USA. Association for Computing Machinery.
Maia, M., Handschuh, S., Freitas, A., Davis, B., McDermott, R., Zarrouk, M., and Balahur, A. (2018). WWW'18 open challenge: Financial opinion mining and question answering. In Companion Proceedings of the The Web Conference 2018, WWW '18, page 1941–1942, Republic and Canton of Geneva, CHE. International World Wide Web Conferences Steering Committee.
Mistral AI (2025). Mistral Small 3.1. https://mistral.ai/news/mistral-small-3-1. Accessed: June 25, 2025.
Muennighoff, N., Tazi, N., Magne, L., and Reimers, N. (2023). MTEB: Massive text embedding benchmark. In Vlachos, A. and Augenstein, I., editors, Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2014–2037, Dubrovnik, Croatia. Association for Computational Linguistics.
Qdrant (2024). Qdrant documentation: Overview. https://qdrant.tech/documentation/overview/. Accessed: June 25, 2025.
Rackauckas, Z. (2024). RAG-Fusion: A new take on retrieval augmented generation. International Journal on Natural Language Computing, 13(1):37–47.
Rashid, M. S., Meem, J. A., Dong, Y., and Hristidis, V. (2024). Progressive query expansion for retrieval over cost-constrained data sources.
Robertson, S. E. and Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '94, page 232–241, Berlin, Heidelberg. Springer-Verlag.
Rocchio, J. J. (1971). Relevance feedback in information retrieval. In Salton, G., editor, The SMART Retrieval System: Experiments in Automatic Document Processing, pages 313–323. Prentice-Hall, Englewood Cliffs, NJ.
Thakur, N., Reimers, N., Rücklé, A., Srivastava, A., and Gurevych, I. (2021). BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2).
Wachsmuth, H., Syed, S., and Stein, B. (2018). Retrieval of the best counterargument without prior topic knowledge. In Gurevych, I. and Miyao, Y., editors, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 241–251, Melbourne, Australia. Association for Computational Linguistics.
Wang, L., Yang, N., and Wei, F. (2023). Query2doc: Query expansion with large language models. In Bouamor, H., Pino, J., and Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 9414–9423, Singapore. Association for Computational Linguistics.
Weller, O., Lo, K., Wadden, D., Lawrie, D., Van Durme, B., Cohan, A., and Soldaini, L. (2024). When do generative query and document expansions fail? A comprehensive study across methods, retrievers, and datasets. In Graham, Y. and Purver, M., editors, Findings of the Association for Computational Linguistics: EACL 2024, pages 1987–2003, St. Julian's, Malta. Association for Computational Linguistics.
Xiao, S., Liu, Z., Zhang, P., Muennighoff, N., Lian, D., and Nie, J.-Y. (2024). C-Pack: Packed resources for general Chinese embeddings. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '24, page 641–649, New York, NY, USA. Association for Computing Machinery.
Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang, X., Zhao, E., Zhang, Y., Chen, Y., Wang, L., Luu, A. T., Bi, W., Shi, F., and Shi, S. (2023). Siren's song in the AI ocean: A survey on hallucination in large language models.
Zhu, Y., Yuan, H., Wang, S., Liu, J., Liu, W., Deng, C., Chen, H., Liu, Z., Dou, Z., and Wen, J.-R. (2024). Large language models for information retrieval: A survey.
KDIR 2025 - 17th International Conference on Knowledge Discovery and Information Retrieval