year, demonstrating the economic value of deploying
intelligent surveillance in small-scale businesses.
The integration of computer vision and deep
learning represents a robust, scalable, and
cost-effective solution for enhancing security in vulnerable
commercial settings, particularly where resources are
limited.
Future work will explore the integration of
Internet of Things (IoT) components, such as shelf
pressure sensors or RFID systems, to provide
multi-source behavioral analysis and contextual awareness.
Additionally, the adoption of edge computing
platforms (e.g., NVIDIA Jetson or Raspberry Pi)
is proposed to enable faster on-device processing and
improve system performance in environments with
limited connectivity.