
Chen, B., Zhang, Z., Langrené, N., and Zhu, S. (2023). Unleashing the potential of prompt engineering in large language models: a comprehensive review. arXiv preprint arXiv:2310.14735.
Chowdhary, K. (2020). Natural language processing. Fundamentals of Artificial Intelligence, pages 603–649.
Comanici, G., Bieber, E., Schaekermann, M., Pasupat, I., Sachdeva, N., Dhillon, I., Blistein, M., Ram, O., Zhang, D., Rosen, E., et al. (2025). Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. arXiv preprint arXiv:2507.06261.
Corrêa, N. K., Sen, A., Falk, S., and Fatimah, S. (2024). Tucano: Advancing Neural Text Generation for Portuguese.
Cosme, D., Galvão, A., and Abreu, F. B. E. (2024). A systematic literature review on llm-based information retrieval: The issue of contents classification. In Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KDIR), pages 1–12.
Dagdelen, J., Dunn, A., Lee, S., Walker, N., Rosen, A. S., Ceder, G., Persson, K. A., and Jain, A. (2024). Structured information extraction from scientific text with large language models. Nature Communications, 15(1):1418.
G, G. M., Abhi, S., and Agarwal, R. (2023). A hybrid resume parser and matcher using regex and ner. In 2023 International Conference on Advances in Computation, Communication and Information Technology (ICAICCIT), pages 24–29.
Gan, C. and Mori, T. (2023). A few-shot approach to resume information extraction via prompts. In International Conference on Applications of Natural Language to Information Systems, pages 445–455. Springer.
Gomes, L., Branco, A., Silva, J., Rodrigues, J., and Santos, R. (2024). Open sentence embeddings for Portuguese with the serafim pt* encoders family. In Santos, M. F., Machado, J., Novais, P., Cortez, P., and Moreira, P. M., editors, Progress in Artificial Intelligence, pages 267–279, Cham. Springer Nature Switzerland.
Grishman, R. (2015). Information extraction. IEEE Intelligent Systems, 30(5):8–15.
Gu, J., Jiang, X., Shi, Z., Tan, H., Zhai, X., Xu, C., Li,
W., Shen, Y., Ma, S., Liu, H., Wang, Y., and Guo, J.
(2024). A survey on llm-as-a-judge. ArXiv.
Herandi, A., Li, Y., Liu, Z., Hu, X., and Cai, X. (2024).
Skill-llm: Repurposing general-purpose llms for skill
extraction. arXiv preprint arXiv:2410.12052.
Li, X., Shu, H., Zhai, Y., and Lin, Z. (2021). A method for resume information extraction using bert-bilstm-crf. In 2021 IEEE 21st International Conference on Communication Technology (ICCT), pages 1437–1442.
Li, Y., Krishnamurthy, R., Raghavan, S., Vaithyanathan, S.,
and Jagadish, H. (2008). Regular expression learning
for information extraction. In Proceedings of the 2008
conference on empirical methods in natural language
processing, pages 21–30.
Marrero, M., Urbano, J., Sánchez-Cuadrado, S., Morato, J., and Gómez-Berbís, J. M. (2013). Named entity recognition: fallacies, challenges and opportunities. Computer Standards & Interfaces, 35(5):482–489.
Melo, A., Cabral, B., and Claro, D. B. (2024). Scaling and adapting large language models for Portuguese open information extraction: A comparative study of fine-tuning and lora. In Brazilian Conference on Intelligent Systems, pages 427–441. Springer.
Nguyen, K. C., Zhang, M., Montariol, S., and Bosselut, A.
(2024). Rethinking skill extraction in the job market
domain using large language models. arXiv preprint
arXiv:2402.03832.
OpenAI (2024). GPT-4o system card.
Perot, V., Kang, K., Luisier, F., Su, G., Sun, X., Boppana, R. S., Wang, Z., Wang, Z., Mu, J., Zhang, H., Lee, C.-Y., and Hua, N. (2024). Lmdx: Language model-based document information extraction and localization. ArXiv.
Pires, R., Abonizio, H., Almeida, T., and Nogueira, R. (2023). Sabiá: Portuguese large language models. In Anais da XII Brazilian Conference on Intelligent Systems, pages 226–240, Porto Alegre, RS, Brasil. SBC.
Sahoo, P., Singh, A. K., Saha, S., Jain, V., Mondal, S., and
Chadha, A. (2024). A systematic survey of prompt
engineering in large language models: Techniques and
applications. arXiv preprint arXiv:2402.07927.
Sougandh, T. G., Reddy, N. S., Belwal, M., et al. (2023). Automated resume parsing: A natural language processing approach. In 2023 7th International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS), pages 1–6. IEEE.
Vieira, R., Olival, F., Cameron, H., Santos, J., Sequeira, O., and Santos, I. (2021). Enriching the 1758 Portuguese parish memories (Alentejo) with named entities. Journal of Open Humanities Data, 7:20.
Villena, F., Miranda, L., and Aracena, C. (2024). Llm-ner: (zero|few)-shot named entity recognition, exploiting the power of large language models. arXiv preprint arXiv:2406.04528.
Wei, H., He, S., Xia, T., Wong, A., Lin, J., and Han, M.
(2024). Systematic evaluation of llm-as-a-judge in
llm alignment tasks: Explainable metrics and diverse
prompt templates. ArXiv, abs/2408.13006.
Werner, M. and Laber, E. (2024). Extracting section structure from resumes in Brazilian Portuguese. Expert Systems with Applications, 242:122495.
Xu, D., Chen, W., Peng, W., Zhang, C., Xu, T., Zhao, X., Wu, X., Zheng, Y., Wang, Y., and Chen, E. (2023). Large language models for generative information extraction: A survey. arXiv preprint arXiv:2312.17617.
Yang, A., Li, A., Yang, B., Zhang, B., Hui, B., Zheng,
B., Yu, B., Gao, C., Huang, C., Lv, C., et al.
(2025). Qwen3 technical report. arXiv preprint
arXiv:2505.09388.