Authors:
Jacob Nielsen, Lukas Galke and Peter Schneider-Kamp
Affiliation:
Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
Keyword(s):
Quantization, Quantization-Aware Training, Graph Neural Networks, Text Classification, Language Models.
Abstract:
Contemporary machine learning models, such as language models, are powerful, but come with immense resource requirements both at training and inference time. Quantization-aware pre-training with ternary weights (1.58 bits per weight) has shown promising results in decoder-only language models and facilitates memory-efficient inference. However, little is known about how quantization-aware training influences the training dynamics beyond such Transformer-based decoder-only language models. Here, we engage in a bottom-up exploration of quantization-aware training, starting with multi-layer perceptrons and graph neural networks. Then, we explore 1.58-bit training in other Transformer-based language models: encoder-only and encoder-decoder models. Our results show that, in all of these settings, 1.58-bit training is on par with standard 32/16-bit models, yet we also identify challenges specific to 1.58-bit encoder-decoder models. Our results on decoder-only language models hint at a possible regularization effect introduced by quantization-aware training.
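For readers unfamiliar with ternary quantization-aware training, the sketch below shows one common way it is implemented in PyTorch, following the BitNet b1.58-style recipe of absmean weight quantization with a straight-through estimator. The class and function names are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of 1.58-bit (ternary) quantization-aware training.
# Assumes a BitNet b1.58-style absmean quantizer; names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


def ternary_quantize(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Quantize weights to {-1, 0, +1} times a per-tensor absmean scale."""
    scale = w.abs().mean().clamp(min=eps)
    w_q = (w / scale).round().clamp(-1, 1) * scale
    # Straight-through estimator: the forward pass sees quantized weights,
    # while gradients flow to the latent full-precision weights.
    return w + (w_q - w).detach()


class BitLinear(nn.Linear):
    """Linear layer whose weights are ternarized on the fly during training."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, ternary_quantize(self.weight), self.bias)


if __name__ == "__main__":
    layer = BitLinear(16, 4)
    x = torch.randn(8, 16)
    out = layer(x)
    out.sum().backward()  # gradients reach the latent fp32 weights
    print(out.shape, layer.weight.grad is not None)
```

In such a setup the latent full-precision weights are kept only during training; at inference time the ternary weights and a single scale per tensor suffice, which is what enables the memory savings described in the abstract.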