in northern Quebec. First, videos are filtered and stabilized, and a relevance score is computed for each stabilized video to predict whether it is relevant for object detection. Second, the YOLOv5 model is adopted to incorporate enriched features and detect small, medium, and large objects. The detection results are then exploited to extract relevant information about the weather, resources, and habitats found in the environment in which caribou and black bears live. Finally, this environmental information is analyzed and statistical results are visualized for each stabilized video. In this work, we conducted an experimental study focused on evaluating each phase of the proposed approach. It is worth noting that the proposed stabilization method, based on motion compensation with different parameter combinations, improves the quality of the videos. In addition, the YOLOv5m model performed significantly better than the YOLOv5s model and can detect small, medium, and large objects. Moreover, the obtained results show that our method can extract weather, habitat, and resource classes and determine their percentage of appearance in each video. In future research, the network model structure will be improved to analyze animal behavior using a wildlife dataset.
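The "percentage of appearance" statistic described above can be sketched as follows: given per-frame detections, count the fraction of frames in which each environmental class appears at least once. This is a minimal illustration, not the paper's implementation; the class names (`snow`, `lichen`, `water`) and the detection record structure are assumptions for the example.

```python
from collections import Counter

def appearance_percentages(frame_detections, class_names):
    """Percentage of frames in which each class appears at least once.

    frame_detections: list (one entry per frame) of lists of detection
    dicts, each with a "class" key (e.g. as parsed from detector output).
    """
    counts = Counter()
    for detections in frame_detections:
        # A set avoids double-counting a class detected twice in one frame.
        for cls in set(d["class"] for d in detections):
            counts[cls] += 1
    total = len(frame_detections)
    return {cls: 100.0 * counts[cls] / total for cls in class_names}

# Hypothetical detections for a 4-frame clip.
frames = [
    [{"class": "snow"}, {"class": "lichen"}],
    [{"class": "snow"}],
    [],  # frame with no detections
    [{"class": "water"}, {"class": "snow"}],
]
pct = appearance_percentages(frames, ["snow", "lichen", "water"])
# snow appears in 3 of 4 frames -> 75.0
```

A per-class percentage of this kind can then be visualized (e.g. as a bar chart) for each stabilized video, as the statistics phase describes.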
REFERENCES
Adam, M., Tomášek, P., Lehejček, J., Trojan, J., and Jůnek, T. (2021). The role of citizen science and deep learning in camera trapping. Sustainability, 13(18):10287.
Bay, H., Tuytelaars, T., and Gool, L. V. (2006). Surf:
Speeded up robust features. In European conference
on computer vision, pages 404–417. Springer.
Clapham, M., Miller, E., Nguyen, M., and Darimont, C. T.
(2020). Automated facial recognition for wildlife that
lack unique markings: A deep learning approach for
brown bears. Ecology and evolution, 10(23):12883–
12892.
Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), volume 1, pages 886–893. IEEE.
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014).
Rich feature hierarchies for accurate object detec-
tion and semantic segmentation. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 580–587.
Guilluy, W., Oudre, L., and Beghdadi, A. (2021). Video
stabilization: Overview, challenges and perspec-
tives. Signal Processing: Image Communication,
90:116015.
Jintasuttisak, T., Leonce, A., Sher Shah, M., Khafaga,
T., Simkins, G., and Edirisinghe, E. (2022). Deep
learning based animal detection and tracking in drone
video footage. In Proceedings of the 8th International
Conference on Computing and Artificial Intelligence,
pages 425–431.
Jocher, G. et al. (2022). ultralytics/yolov5: v6.2 - YOLOv5 classification models, Apple M1, reproducibility, ClearML and Deci.ai integrations. Zenodo.
Kulkarni, S., Bormane, D., and Nalbalwar, S. (2016). Stabi-
lization of jittery videos using feature point matching
technique. In International Conference on Commu-
nication and Signal Processing 2016 (ICCASP 2016),
pages 708–717. Atlantis Press.
Lakshmi, R. K. and Savarimuthu, N. (2022). Pldd—a
deep learning-based plant leaf disease detection. IEEE
Consumer Electronics Magazine, 11(3):44–49.
Lowe, D. G. (2004). Distinctive image features from scale-
invariant keypoints. International journal of computer
vision, 60(2):91–110.
Norouzzadeh, M. S., Nguyen, A., Kosmala, M., Swanson,
A., Palmer, M. S., Packer, C., and Clune, J. (2018).
Automatically identifying, counting, and describing
wild animals in camera-trap images with deep learn-
ing. Proceedings of the National Academy of Sciences,
115(25):E5716–E5725.
Redmon, J., Divvala, S., Girshick, R., and Farhadi, A.
(2016). You only look once: Unified, real-time object
detection. In Proceedings of the IEEE conference on
computer vision and pattern recognition, pages 779–
788.
Redmon, J. and Farhadi, A. (2017). Yolo9000: better, faster,
stronger. In Proceedings of the IEEE conference on
computer vision and pattern recognition, pages 7263–
7271.
Redmon, J. and Farhadi, A. (2018). Yolov3: An incremental
improvement. arXiv preprint arXiv:1804.02767.
Shi, Z., Shi, F., Lai, W.-S., Liang, C.-K., and Liang, Y.
(2022). Deep online fused video stabilization. In Pro-
ceedings of the IEEE/CVF Winter Conference on Ap-
plications of Computer Vision, pages 1250–1258.
Sunil, C., Jaidhar, C., and Patil, N. (2021). Cardamom plant
disease detection approach using efficientnetv2. IEEE
Access, 10:789–804.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S.,
Anguelov, D., Erhan, D., Vanhoucke, V., and Rabi-
novich, A. (2015). Going deeper with convolutions.
In Proceedings of the IEEE conference on computer
vision and pattern recognition, pages 1–9.
Uijlings, J. R., Van De Sande, K. E., Gevers, T., and
Smeulders, A. W. (2013). Selective search for object
recognition. International journal of computer vision,
104(2):154–171.
Viola, P. and Jones, M. J. (2004). Robust real-time face
detection. International journal of computer vision,
57(2):137–154.
Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y. M. (2022).
Yolov7: Trainable bag-of-freebies sets new state-of-
the-art for real-time object detectors. arXiv preprint
arXiv:2207.02696.
Environmental Information Extraction Based on YOLOv5-Object Detection in Videos Collected by Camera-Collars Installed on Migratory Caribou and Black Bears in Northern Quebec