Design of Hardware Accelerated Co-Processor for Neural Inference Computations
Kushagra Chauhan, R. Vidhya and S. Nagadevi
2025
Abstract
The rapid growth in AI model complexity has increased demand for hardware capable of high-efficiency inference with lower energy consumption. While GPUs have traditionally driven AI advancements, their progress has plateaued, with compute density improving by only 15% over four years. This work develops a highly optimized ASIC accelerator specifically for LLaMA 3.1, implementing critical inference operations through specialized computational units and a dedicated memory hierarchy. The design achieves 100% efficiency in matrix multiplication operations and 99.90% efficiency in attention computations, demonstrating significant improvements over traditional GPU implementations. Results show sustained throughput of 256 operations per cycle in matrix multiplication and 7.99 operations per cycle for attention mechanisms, with memory bandwidth utilization of 1024 GB/s. This research presents a sustainable solution for large-scale AI inference deployment, addressing both computational efficiency and energy consumption challenges.
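The efficiency figures in the abstract can be read as sustained throughput divided by peak throughput. The abstract does not define the metric or give the peak values, so the sketch below assumes that definition and assumes peaks of 256 and 8 operations per cycle. The function name `utilization` and both peak constants are hypothetical, chosen only to illustrate the arithmetic.

```python
def utilization(sustained_ops_per_cycle: float, peak_ops_per_cycle: float) -> float:
    """Return the fraction of peak throughput that is sustained (0.0 to 1.0)."""
    if peak_ops_per_cycle <= 0:
        raise ValueError("peak_ops_per_cycle must be positive")
    return sustained_ops_per_cycle / peak_ops_per_cycle


# ASSUMPTION: peak throughputs are not stated in the abstract; these values are
# inferred from the reported efficiencies and are for illustration only.
MATMUL_PEAK_OPS_PER_CYCLE = 256.0
ATTENTION_PEAK_OPS_PER_CYCLE = 8.0

# Sustained throughputs as reported in the abstract.
matmul_eff = utilization(256.0, MATMUL_PEAK_OPS_PER_CYCLE)
attention_eff = utilization(7.99, ATTENTION_PEAK_OPS_PER_CYCLE)

print(f"matrix multiplication efficiency: {matmul_eff:.2%}")
print(f"attention efficiency: {attention_eff:.2%}")
```

Under these assumptions the matrix multiplication unit is fully utilized, and the attention unit comes out at roughly 99.9%, close to the reported 99.90% given that 7.99 is itself a rounded figure.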
Paper Citation
in Harvard Style
Chauhan K., Vidhya R. and Nagadevi S. (2025). Design of Hardware Accelerated Co-Processor for Neural Inference Computations. In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies - ICRDICCT`25; ISBN 978-989-758-777-1, SciTePress, pages 5-10. DOI: 10.5220/0013921700004919
in Bibtex Style
@conference{icrdicct`2525,
author={Kushagra Chauhan and R. Vidhya and S. Nagadevi},
title={Design of Hardware Accelerated Co-Processor for Neural Inference Computations},
booktitle={Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies - ICRDICCT`25},
year={2025},
pages={5-10},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013921700004919},
isbn={978-989-758-777-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies - ICRDICCT`25
TI - Design of Hardware Accelerated Co-Processor for Neural Inference Computations
SN - 978-989-758-777-1
AU - Chauhan K.
AU - Vidhya R.
AU - Nagadevi S.
PY - 2025
SP - 5
EP - 10
DO - 10.5220/0013921700004919
PB - SciTePress