Embeddings Might Be all You Need: Domain-Specific Sentence Encoders for Latin American E-Commerce Questions
Rodrigo Caus, Victor Sotelo, Victor Hochgreb, Julio Reis
2025
Abstract
In Latin American e-commerce, customer inquiries often exhibit unique linguistic patterns that require specialized handling for accurate responses. Traditional sentence encoders may struggle with these regional nuances, leading to less effective answers. This study investigates the application of fine-tuned transformer models to generate domain-specific sentence embeddings, focusing on Portuguese and Spanish retrieval tasks. Our findings demonstrate that these specialized embeddings significantly outperform general-purpose pre-trained models and traditional techniques such as BM25, thereby eliminating the need for additional re-ranking steps in retrieval processes. We also investigate the impact of multi-objective training within Matryoshka Representation Learning and show that it maintains retrieval performance across various embedding dimensions. Our approach offers a scalable and efficient solution for multilingual retrieval in e-commerce, reducing computational costs while ensuring high accuracy.
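As a rough illustration of what "retrieval across various embedding dimensions" involves, the sketch below ranks candidate texts by cosine similarity after keeping only the first `dim` components of each embedding, which is how Matryoshka-style embeddings are typically shortened. It is not the authors' implementation: the function names `truncate_and_normalize` and `rank_by_cosine` are hypothetical, and the toy vectors are invented for illustration rather than produced by any model from the paper.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # leave an all-zero prefix untouched
    return [x / norm for x in head]


def rank_by_cosine(
    query: Sequence[float], docs: Sequence[Sequence[float]], dim: int
) -> List[int]:
    """Return document indices ordered by cosine similarity at `dim` dimensions."""
    q = truncate_and_normalize(query, dim)
    scores = []
    for doc in docs:
        v = truncate_and_normalize(doc, dim)
        # Both vectors are unit length, so the dot product is the cosine.
        scores.append(sum(a * b for a, b in zip(q, v)))
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy 4-dimensional vectors, invented for illustration only.
query = [0.9, 0.1, 0.3, 0.0]
docs = [
    [0.8, 0.2, 0.1, 0.1],
    [0.1, 0.9, 0.0, 0.4],
    [0.7, 0.0, 0.6, 0.2],
]

# Compare the ranking at full size and at a truncated size.
for dim in (4, 2):
    print(dim, rank_by_cosine(query, docs, dim))
```

Comparing the rankings printed for each `dim` is one simple way to check whether a shorter prefix preserves the retrieval order; with arbitrary vectors such as these toy ones there is no guarantee that it does, which is the gap that training for truncation is meant to close.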
Paper Citation
in Harvard Style
Caus R., Sotelo V., Hochgreb V. and Reis J. (2025). Embeddings Might Be all You Need: Domain-Specific Sentence Encoders for Latin American E-Commerce Questions. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN , SciTePress, pages 135-146. DOI: 10.5220/0013821200004000
in Bibtex Style
@conference{kdir25,
author={Rodrigo Caus and Victor Sotelo and Victor Hochgreb and Julio Reis},
title={Embeddings Might Be all You Need: Domain-Specific Sentence Encoders for Latin American E-Commerce Questions},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2025},
pages={135-146},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013821200004000},
isbn={},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Embeddings Might Be all You Need: Domain-Specific Sentence Encoders for Latin American E-Commerce Questions
SN -
AU - Caus R.
AU - Sotelo V.
AU - Hochgreb V.
AU - Reis J.
PY - 2025
SP - 135
EP - 146
DO - 10.5220/0013821200004000
PB - SciTePress