Dynamically Choosing the Number of Heads in Multi-Head Attention

Fernando Fradique Duarte; Nuno Lau; Artur Pereira; Luís Reis

Topics: Deep Learning; Neural Networks

In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, 358-367, 2024, Rome, Italy

Authors: Fernando Fradique Duarte ¹; Nuno Lau ²; Artur Pereira ² and Luís Reis ³

Affiliations: ¹ Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal; ² Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal; ³ Faculty of Engineering, Department of Informatics Engineering, University of Porto, Porto, Portugal

Keyword(s): Deep Reinforcement Learning, Multi-Head Attention, Advantage Actor-Critic.

Abstract: Deep Learning agents are known to be very sensitive to their parameterization values. Attention-based Deep Reinforcement Learning agents further complicate this issue due to the additional parameterization associated with the computation of their attention function. One example is the number of attention heads to use in multi-head attention-based agents. Usually, these hyperparameters are set manually, which may be neither optimal nor efficient. This work addresses the issue of choosing the appropriate number of attention heads dynamically, by endowing the agent with a policy πh trained with policy gradient. At each timestep of agent-environment interaction, πh is responsible for choosing the most suitable number of attention heads according to the contextual memory of the agent. This dynamic parameterization is compared to a static parameterization in terms of performance. The role of πh is further assessed through additional analysis of the distribution of the number of attention heads throughout the training procedure and the course of the game. The Atari 2600 videogame benchmark was used to perform and validate all the experiments.
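
The idea of a policy that picks the number of attention heads from the agent's context and is trained with policy gradient can be illustrated with a small sketch. This is not the authors' implementation: the candidate head counts in `HEAD_CHOICES`, the linear-softmax form of the policy, the class name `HeadPolicy`, the learning rate, and the example context vector are all assumptions made for illustration. The update shown is a plain REINFORCE-style step weighted by an advantage value, which is one common way to apply policy gradient.

```python
import math
import random

# Hypothetical candidate head counts; the abstract does not list the options.
HEAD_CHOICES = [1, 2, 4, 8]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class HeadPolicy:
    """Linear-softmax policy over HEAD_CHOICES (illustrative stand-in for pi_h)."""

    def __init__(self, context_dim, rng):
        self.rng = rng
        # One weight row per candidate head count.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(context_dim)]
            for _ in HEAD_CHOICES
        ]

    def probs(self, context):
        logits = [sum(w * c for w, c in zip(row, context)) for row in self.weights]
        return softmax(logits)

    def sample(self, context):
        """Sample an index and the corresponding number of heads."""
        p = self.probs(context)
        idx = self.rng.choices(range(len(HEAD_CHOICES)), weights=p)[0]
        return idx, HEAD_CHOICES[idx]

    def reinforce_update(self, context, idx, advantage, lr=0.1):
        """Gradient ascent on advantage * log pi(idx | context)."""
        p = self.probs(context)
        for k, row in enumerate(self.weights):
            # d log pi(idx) / d logit_k = 1[k == idx] - p_k
            grad_logit = (1.0 if k == idx else 0.0) - p[k]
            for j, c in enumerate(context):
                row[j] += lr * advantage * grad_logit * c


# Usage: one decision and one update for a made-up context vector.
rng = random.Random(0)
policy = HeadPolicy(context_dim=3, rng=rng)
context = [0.5, -1.0, 2.0]  # stands in for the agent's contextual memory
idx, n_heads = policy.sample(context)
before = policy.probs(context)[idx]
policy.reinforce_update(context, idx, advantage=1.0)
after = policy.probs(context)[idx]
```

With a positive advantage, the update raises the probability of the sampled head count for that context, so `after` is larger than `before`. In a full agent, `n_heads` would configure the multi-head attention computation for the current timestep, and the advantage would come from the critic.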

CC BY-NC-ND 4.0

Paper citation in several formats:

Fradique Duarte, F., Lau, N., Pereira, A. and Reis, L. (2024). Dynamically Choosing the Number of Heads in Multi-Head Attention. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-680-4; ISSN 2184-433X, SciTePress, pages 358-367. DOI: 10.5220/0012384500003636

@conference{icaart24,
author={Fernando {Fradique Duarte} and Nuno Lau and Artur Pereira and Luís Reis},
title={Dynamically Choosing the Number of Heads in Multi-Head Attention},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2024},
pages={358-367},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012384500003636},
isbn={978-989-758-680-4},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Dynamically Choosing the Number of Heads in Multi-Head Attention
SN - 978-989-758-680-4
IS - 2184-433X
AU - Fradique Duarte, F.
AU - Lau, N.
AU - Pereira, A.
AU - Reis, L.
PY - 2024
SP - 358
EP - 367
DO - 10.5220/0012384500003636
PB - SciTePress