provided 50 images, which were carefully selected to
cover different shooting angles, lighting conditions,
and poses, thus ensuring the diversity of the data.
The main objectives of this study are to compare
the accuracy, precision, recall, and F1 scores of
MobileNet V1, MobileNet V2, and EfficientNet B0
in classifying endangered animals; to analyze how
differences in model architecture affect
classification performance on complex features; and
to explore the practical application potential of these
models in wildlife monitoring.
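The four metrics named above can be computed directly from a model's predictions. The following is a minimal sketch in plain Python; the class names and predictions are hypothetical, and macro averaging across the five classes is an assumption:

```python
def classification_metrics(y_true, y_pred, labels):
    """Compute accuracy and macro-averaged precision, recall, and F1."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical predictions over the five classes in the dataset
labels = ["jaguar", "spider_monkey", "giant_otter", "blue_macaw", "hyacinth_macaw"]
y_true = ["jaguar", "jaguar", "giant_otter", "blue_macaw", "hyacinth_macaw"]
y_pred = ["jaguar", "giant_otter", "giant_otter", "blue_macaw", "hyacinth_macaw"]
acc, p, r, f1 = classification_metrics(y_true, y_pred, labels)
```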
In this study, uniform experimental parameters
were used to ensure a fair comparison. For example,
the number of training epochs for each of the three
models was set to 100, and data augmentation
techniques and automatic class weighting were
used to improve the generalization ability of the
models. In addition, the validation split and
batch size were kept consistent across the three
models. The experimental results not
only reveal the performance differences between the
three models in the task of endangered animal
classification but also provide an important reference
for the application of deep learning in the field of
wildlife conservation.
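The uniform settings described above can be captured in a small configuration sketch. The epoch count comes from the text; the batch size, validation split, and the "balanced" class-weighting formula are assumptions, since the paper only states that these settings were kept consistent across models:

```python
from collections import Counter

# Uniform experiment settings shared by all three models.
EPOCHS = 100           # stated in the text
BATCH_SIZE = 32        # hypothetical; the paper only says it was kept consistent
VALIDATION_SPLIT = 0.2 # hypothetical; likewise only stated as consistent

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count), a common
    "balanced" heuristic for automatic class weighting (an assumption here)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# With 50 images per class the dataset is balanced, so every weight is 1.0
labels = ["jaguar"] * 50 + ["giant_otter"] * 50
weights = balanced_class_weights(labels)
```

With a balanced dataset such as this one, the automatic weights are all 1.0; the mechanism matters when some classes have fewer usable images than others.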
The structure of the paper is as follows: Section 2
introduces the dataset construction and
preprocessing methods; Section 3 describes in
detail the architectures and training parameters of
MobileNet V1, MobileNet V2, and EfficientNet B0;
Section 4 analyzes the experimental results and
compares the performance of the three models;
finally, Section 5 summarizes the research
conclusions and proposes directions for future
research.
2 DATA AND METHOD
2.1 Data
This research involves methodically collecting and
merging images from accessible online sources to
construct a high-quality dataset of endangered
animals. Initially, this paper used a variety of
respected online image galleries and biodiversity
databases (such as the IUCN Red List or other
relevant platforms) as the primary source of images.
These systems provide a large number of images of
endangered species, accompanied by relevant
information, ensuring that the data sources are
scientific and authoritative. In addition, to enhance
the diversity and scope of the image library, this paper
used images from the public works of professional
photographers, which mainly highlight biodiversity
and realistically depict the species in their native
habitat. To address copyright and ethical concerns, a
rigorous selection process was put in place so that
images were sourced only from works clearly marked
with an open or public license, such as a Creative
Commons license. Subsequently, all collected photos
undergo a thorough manual verification process to
determine if they fit into the classification criteria of
the Endangered Animals dataset.
The resulting dataset contains a total of 250 images
covering five animals (Jaguar, Black-faced Black
Spider Monkey, Giant Otter, Blue-headed Macaw,
and Hyacinth Macaw), with 50 photos of each
animal. Each image occupies about 120 KB of
storage. The dataset spans a variety of shooting
angles and postures as well as different lighting
environments, which makes it more diverse. During
data import, the images were augmented and
background information was removed through
cropping, allowing the model to focus more on the
characteristics of the target itself.
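A minimal sketch of the import-time preprocessing described above, assuming a centered crop to strip background and a random horizontal flip as the augmentation; the crop fraction and flip probability are hypothetical, as the paper does not specify them:

```python
import random

def center_crop_box(width, height, crop_frac=0.8):
    """Return (left, top, right, bottom) for a centered crop that trims
    background around the subject; crop_frac = 0.8 is a hypothetical setting."""
    cw, ch = int(width * crop_frac), int(height * crop_frac)
    left, top = (width - cw) // 2, (height - ch) // 2
    return left, top, left + cw, top + ch

def augment(pixels, rng):
    """Minimal augmentation: random horizontal flip of a 2-D pixel grid."""
    if rng.random() < 0.5:
        return [row[::-1] for row in pixels]
    return pixels

box = center_crop_box(640, 480)   # -> (64, 48, 576, 432)
flipped = augment([[1, 2, 3]], random.Random(0))
```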
The dataset has been carefully curated to
ensure its diversity, authenticity, and high quality,
thus providing a solid foundation for the development
and validation of the endangered-animal
classification model in this study. The construction
process combines multi-source integration with
scientific rigor, providing a valuable reference for
future research in this field.
2.2 Introduction of MobileNet V1
In 2017, Google introduced MobileNet V1, a
lightweight convolutional neural network designed
for use in embedded systems and mobile devices.
MobileNet V1 factorizes standard
convolution into a depthwise convolution followed
by a pointwise convolution, a method that
significantly reduces the number of parameters and
computational complexity, thereby lowering cost and
storage requirements. Unlike traditional convolution
approaches, MobileNet V1 also provides width-multiplier
and resolution-multiplier coefficients to adjust the
size and computational cost of the model. Its
lightweight architecture enables low latency and
minimal computational resource usage, making it
well suited to edge computing environments and
real-time applications.
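The parameter saving from this factorization can be checked with simple arithmetic. Ignoring biases, a standard k×k convolution needs k·k·C_in·C_out parameters, while the depthwise-plus-pointwise form needs k·k·C_in + C_in·C_out; the width multiplier α scales both channel counts. The channel sizes below are illustrative, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out, alpha=1.0):
    """Depthwise (k x k per input channel) plus pointwise (1 x 1) convolution;
    alpha is MobileNet's width multiplier applied to the channel counts."""
    ci, co = int(alpha * c_in), int(alpha * c_out)
    return k * k * ci + ci * co

standard = conv_params(3, 64, 128)                  # 73728
separable = depthwise_separable_params(3, 64, 128)  # 576 + 8192 = 8768
ratio = separable / standard                        # roughly 1/8 for this layer
```

For a 3×3 layer with 64 input and 128 output channels, the separable form uses roughly an eighth of the parameters, which is the source of MobileNet V1's low cost.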
2.3 Introduction of MobileNet V2
In 2018, Google introduced MobileNet V2, a
lightweight convolutional neural network designed