A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks

Tomas Hrycej; Bernhard Bermeitinger; Massimo Pavone; Götz-Henrik Wiegand; Siegfried Handschuh

Topics: Deep Learning; Machine Learning; Neural Networks

In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1, pages 78-86, 2025, Marbella, Spain

Authors: Tomas Hrycej ¹; Bernhard Bermeitinger ²; Massimo Pavone ¹; Götz-Henrik Wiegand ¹ and Siegfried Handschuh ¹

Affiliations: ¹ Institute of Computer Science, University of St. Gallen (HSG), St. Gallen, Switzerland; ² Institute of Computer Science in Vorarlberg, University of St. Gallen (HSG), Dornbirn, Austria

Keyword(s): Conjugate Gradient, Convexity, Adam, Computer Vision, Vision Transformer.

Abstract: The key task of machine learning is to minimize the loss function that measures the model fit to the training data. The numerical methods to do this efficiently depend on the properties of the loss function. The most decisive among these properties is the convexity or non-convexity of the loss function. The fact that the loss function can have, and frequently has, non-convex regions has led to a widespread commitment to non-convex methods such as Adam. However, a local minimum implies that, in some environment around it, the function is convex. In this environment, second-order minimizing methods such as the Conjugate Gradient (CG) guarantee superlinear convergence. We propose a novel framework grounded in the hypothesis that loss functions in real-world tasks swap from initial non-convexity to convexity towards the optimum, a property we leverage to design an innovative two-phase optimization algorithm. The presented algorithm detects the swap point by observing the dependence of the gradient norm on the loss. Before the swap point, a non-convex algorithm (Adam) is used; after it, a convex one (CG). Computational experiments confirm the hypothesis that this simple convexity structure is frequent enough to be practically exploited to substantially improve convergence and accuracy.
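
The two-phase idea in the abstract can be illustrated with a minimal sketch in pure Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy loss (radially symmetric, convex only for |w| < 1), the swap rule (loss and gradient norm both falling for `patience` consecutive steps), the Polak-Ribière form of CG with a backtracking line search, and all function names and hyperparameters.

```python
import math

def loss(w):
    # Toy loss 1 - exp(-|w|^2 / 2): convex for |w| < 1, non-convex beyond.
    return -math.expm1(-0.5 * sum(x * x for x in w))

def grad(w):
    e = math.exp(-0.5 * sum(x * x for x in w))
    return [x * e for x in w]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def adam_until_swap(w, lr=0.05, b1=0.9, b2=0.999, eps=1e-8,
                    patience=5, max_steps=2000):
    """Phase 1: plain Adam, stopped by an assumed swap heuristic."""
    m = [0.0] * len(w)
    v = [0.0] * len(w)
    prev_loss, prev_gn, streak = loss(w), norm(grad(w)), 0
    for t in range(1, max_steps + 1):
        g = grad(w)
        m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
        v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - b1 ** t) for mi in m]
        v_hat = [vi / (1 - b2 ** t) for vi in v]
        w = [wi - lr * mh / (math.sqrt(vh) + eps)
             for wi, mh, vh in zip(w, m_hat, v_hat)]
        cur_loss, cur_gn = loss(w), norm(grad(w))
        # Assumed rule: near a convex minimum the gradient norm shrinks
        # together with the loss; on this toy loss it grows while |w| > 1.
        both_fall = cur_loss < prev_loss and cur_gn < prev_gn
        streak = streak + 1 if both_fall else 0
        prev_loss, prev_gn = cur_loss, cur_gn
        if streak >= patience:
            return w, t
    return w, None  # no swap detected within max_steps

def conjugate_gradient(w, tol=1e-10, max_iter=200):
    """Phase 2: nonlinear CG (Polak-Ribiere+) with Armijo backtracking."""
    g = grad(w)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if norm(g) < tol:
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0:  # not a descent direction: restart along -g
            d = [-gi for gi in g]
            slope = -sum(gi * gi for gi in g)
        step, f0 = 1.0, loss(w)
        while True:
            w_new = [wi + step * di for wi, di in zip(w, d)]
            if loss(w_new) <= f0 + 1e-4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        g_new = grad(w_new)
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g))
                   / sum(go * go for go in g))
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        w, g = w_new, g_new
    return w

w_swap, swap_step = adam_until_swap([2.0, 1.5])  # start in the non-convex region
w_final = conjugate_gradient(w_swap)
print(swap_step, loss(w_swap), loss(w_final))
```

On this radially symmetric toy loss the gradient norm depends only on |w|, so the assumed rule can only fire inside the convex region; on a real network loss it would be a heuristic at best, and the gradient norm would typically be estimated from minibatches.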

CC BY-NC-ND 4.0

Paper citation in several formats:

Hrycej, T., Bermeitinger, B., Pavone, M., Wiegand, G.-H. and Handschuh, S. (2025). A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR; ISSN 2184-3228, SciTePress, pages 78-86. DOI: 10.5220/0013696100004000

@conference{kdir25,
author={Tomas Hrycej and Bernhard Bermeitinger and Massimo Pavone and Götz{-}Henrik Wiegand and Siegfried Handschuh},
title={A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR},
year={2025},
pages={78--86},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013696100004000},
isbn={},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR
TI - A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks
SN -
IS - 2184-3228
AU - Hrycej, T.
AU - Bermeitinger, B.
AU - Pavone, M.
AU - Wiegand, G.
AU - Handschuh, S.
PY - 2025
SP - 78
EP - 86
DO - 10.5220/0013696100004000
PB - SciTePress