A Comprehensive Approach to Urban Sound Detection with YAMNet and Bi-Directional LSTM

Tonghui Wu

2024

Abstract

Urban sound event detection and classification are increasingly important for managing complex urban environments. As urbanization intensifies globally, traditional classification methods struggle with the overlapping sounds typical of cities. This research aimed to improve the accuracy of urban sound classification using deep learning, which matters for applications ranging from audio signal processing to noise monitoring and public safety. Using the UrbanSound8K dataset, the study combined YAMNet, a pre-trained neural network, with a custom Bidirectional LSTM network to build a classification model. The model was evaluated with cross-validation and achieved a high Matthews Correlation Coefficient (MCC), indicating strong generalization to unseen data. Areas for further improvement remain, particularly in distinguishing between acoustically similar sounds. By integrating transfer learning with a recurrent deep learning model, this research provides a foundation for future work on complex audio classification tasks and for potential real-world applications.
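As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of a multiclass MCC computed from true and predicted labels. It assumes the standard multiclass generalization of the Matthews Correlation Coefficient; the abstract does not state which formulation was used. The function name `multiclass_mcc` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews Correlation Coefficient (hypothetical helper).

    Assumption: uses the common multiclass generalization
        (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where c = correct predictions, s = number of samples,
    t_k = true count of class k, p_k = predicted count of class k.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)

    labels = set(t_counts) | set(p_counts)
    sum_pt = sum(p_counts[k] * t_counts[k] for k in labels)
    sum_pp = sum(v * v for v in p_counts.values())
    sum_tt = sum(v * v for v in t_counts.values())

    denom = sqrt((s * s - sum_pp) * (s * s - sum_tt))
    # Convention: return 0.0 when the score is undefined,
    # e.g. when every sample is assigned to a single class.
    return 0.0 if denom == 0 else (c * s - sum_pt) / denom


if __name__ == "__main__":
    # Toy labels for illustration only; not results from the paper.
    y_true = ["siren", "siren", "drilling", "jackhammer", "dog_bark"]
    y_pred = ["siren", "siren", "jackhammer", "jackhammer", "dog_bark"]
    print(multiclass_mcc(y_true, y_pred))
```

Unlike plain accuracy, this score uses the full distribution of true and predicted classes, so a model that ignores minority classes scores close to 0 even when most of its predictions are correct. In a cross-validated setup, it would typically be computed per fold and then averaged.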


Paper Citation


in Harvard Style

Wu T. (2024). A Comprehensive Approach to Urban Sound Detection with YAMNet and Bi-Directional LSTM. In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML; ISBN 978-989-758-754-2, SciTePress, pages 23-29. DOI: 10.5220/0013486700004619


in Bibtex Style

@conference{daml24,
author={Tonghui Wu},
title={A Comprehensive Approach to Urban Sound Detection with YAMNet and Bi-Directional LSTM},
booktitle={Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML},
year={2024},
pages={23-29},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013486700004619},
isbn={978-989-758-754-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML
TI - A Comprehensive Approach to Urban Sound Detection with YAMNet and Bi-Directional LSTM
SN - 978-989-758-754-2
AU - Wu T.
PY - 2024
SP - 23
EP - 29
DO - 10.5220/0013486700004619
PB - SciTePress