Advancements and Challenges of Large Language Model-Based Code Generation and Completion
Zheer Wang
2024
Abstract
This paper provides an in-depth review of recent advancements and applications of large language models (LLMs) in code generation and code completion. With the rapid advancement of deep learning and transformer architectures, LLMs have demonstrated unprecedented capability in producing source code from natural language, transforming software development practices. The review first explains the underlying principles of these models, with particular attention to how large models such as Generative Pre-trained Transformer (GPT)-3 and Codex use pre-training and fine-tuning to generate sophisticated code from plain-language descriptions. In contrast to conventional rule-based or heuristic approaches, these models produce high-quality output by autonomously learning programming syntax and semantics and by using attention mechanisms to capture contextual dependencies in code. The paper also examines how well LLMs perform across a range of applications, including code translation, code completion, and error detection, as well as how effectively they function in multi-language programming environments. Additionally, models such as PolyCoder and Program and Language Bidirectional and Auto-Regressive Transformers (PLBART) are highlighted because they outperform traditional methods, particularly on cross-language tasks. Although LLMs show great promise, the paper also discusses their current drawbacks, including high memory consumption, opaque training data, and difficulty generalizing to new codebases. In summary, while LLMs offer unparalleled prospects for advancing software engineering, further research is needed to overcome current obstacles and broaden their applicability.
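The attention mechanism the abstract refers to can be illustrated with a minimal sketch of scaled dot-product self-attention, the core operation by which transformer-based code models weigh every token of a snippet against every other token. All names and the toy embeddings below are illustrative assumptions for this sketch, not details taken from the paper:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise token affinities
    # Numerically stable row-wise softmax turns scores into weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy embeddings standing in for a 4-token code fragment,
# e.g. ["def", "add", "(", "x"]; dimension 8 is arbitrary.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))

# Self-attention: each token's output mixes information from all tokens,
# which is how contextual dependencies in code are captured.
output, weights = scaled_dot_product_attention(X, X, X)
```

Each row of `weights` sums to 1 and records how strongly one token attends to every other token; stacking many such heads and layers (with learned projections for Q, K, and V, omitted here) yields the contextual representations that models like Codex use when completing code.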
Paper Citation
in Harvard Style
Wang Z. (2024). Advancements and Challenges of Large Language Model-Based Code Generation and Completion. In Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management - Volume 1: MLSCM; ISBN 978-989-758-738-2, SciTePress, pages 208-213. DOI: 10.5220/0013271800004558
in Bibtex Style
@conference{mlscm24,
author={Zheer Wang},
title={Advancements and Challenges of Large Language Model-Based Code Generation and Completion},
booktitle={Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management - Volume 1: MLSCM},
year={2024},
pages={208-213},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013271800004558},
isbn={978-989-758-738-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management - Volume 1: MLSCM
TI - Advancements and Challenges of Large Language Model-Based Code Generation and Completion
SN - 978-989-758-738-2
AU - Wang Z.
PY - 2024
SP - 208
EP - 213
DO - 10.5220/0013271800004558
PB - SciTePress