Application of CNNs in Feature Learning for Remote Sensing Data:
A Case Study on Land Cover Classification and Environmental
Change Detection
Zhaoyi Li
a
College of Food, Agricultural and Natural Resource Sciences, University of Minnesota, Twin Cities, Minneapolis, U.S.A.
Keywords: Remote Sensing, Feature Extraction, Environmental Monitoring, Land Use Assessment.
Abstract: This study reviews the training and feature analysis of convolutional neural networks (CNNs)
on remote sensing data, with a focus on environmental monitoring and land use assessment. It uses
high-resolution satellite imagery from the “Planet: Understanding the Amazon from Space” dataset to
explore ways of raising the accuracy of land cover classification and environmental change detection.
Drawing on the cited literature, the paper proposes an automatic CNN-based feature extraction method
that overcomes the limitations of traditional manual methods. The method comprises data preprocessing,
multi-scale feature fusion, classification, integration of an attention mechanism, and further
refinement with a residual network model. The experimental results show a marked increase in
classification accuracy, which reaches 93.5%, particularly in detecting deforested areas. These results
highlight the potential of the proposed approach for real-time environmental monitoring and land use
planning and point to directions for further research. Future work will focus on optimizing CNN models
to reduce computational complexity and on exploring multi-source data fusion to improve the
generalization and effectiveness of remote sensing applications.
1 INTRODUCTION
Convolutional neural networks (CNNs) are deep learning models designed to process data with a grid
structure, such as satellite images. In recent years, CNNs have been widely applied to the evaluation
of remote sensing data, which are usually high-dimensional and complex (Hu et al., 2015). Traditional
methods tend to rely heavily on expert knowledge, struggle to fully express features, and adapt poorly
to the diverse, non-linear nature of the data (Tuia et al., 2016). CNNs offer a way to learn features
automatically, capturing complex patterns in remote sensing data and yielding more accurate
classification and evaluation (Chen et al., 2016). This study therefore aims to comprehensively examine
and assess the application and development of CNNs in the study of remote sensing data characteristics.
It traces the transition from manual to automatic feature extraction and shows how this transition has
changed the analysis of remote sensing data. In addition, it aims to fill a gap in systematic reviews
of this field and to guide future research and practice.
a https://orcid.org/0009-0005-6999-786X
Remote sensing data, such as satellite and drone images, are widely used in environmental monitoring,
including land use analysis, disaster assessment, and other fields. These high-resolution data are
usually multispectral or hyperspectral and contain rich, high-precision spatial and spectral
information. Traditional analysis of remote sensing data relies mainly on manual feature extraction,
such as texture analysis, spectral indices, and shape features. Although these methods suit specific
tasks and datasets, they adapt poorly to the diversity and complexity of the data. In recent years,
with the advancement of deep learning, and particularly the successful deployment of CNNs, researchers
have begun to introduce these models into remote sensing data analysis (Zhang et al., 2016). CNNs are
effective at processing high-dimensional remote sensing data because of their powerful automatic
feature extraction and learning ability. At present, most studies apply CNNs to remote sensing image
classification, change detection, and target identification, where considerable advances have been
achieved. For example, some studies improve the classification precision and computational efficiency
of the model by modifying the network structure and introducing an attention mechanism (Hu et al.,
2018). Other studies combine multi-scale features and multi-source data to enhance the adaptability of
CNNs to complex scenarios (Ma et al., 2010). In general, the use of CNNs in the analysis of remote
sensing data has made considerable progress, but it still faces challenges such as the difficulty of
data annotation and the heavy consumption of computing resources. Exploring more efficient and accurate
deep learning models therefore remains a main goal of future research (Zhu et al., 2017).
Li, Z.
Application of CNNs in Feature Learning for Remote Sensing Data: A Case Study on Land Cover Classification and Environmental Change Detection.
DOI: 10.5220/0013517600004619
In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning (DAML 2024), pages 354-358
ISBN: 978-989-758-754-2
Copyright © 2025 by Paper published under CC license (CC BY-NC-ND 4.0)
The principal aim of this study is to examine the application of CNNs in feature learning for remote
sensing data and to investigate the core technologies and future directions in this domain. The study
begins with an introduction to key concepts and background knowledge on the use of CNNs in remote
sensing data analysis. It then introduces the application scenarios and advantages of CNNs in this
field. A comprehensive examination of the fundamental technologies underlying CNNs follows, covering
their network architectures, training methodologies, and enhancement strategies. After that, the
performance of these key technologies is demonstrated and evaluated on remote sensing data
classification and recognition tasks. By comparing different models, the study evaluates the merits and
drawbacks of different CNN techniques and reveals the shortcomings and challenges of current research.
Based on this analysis, the article discusses future directions for the development of CNNs in remote
sensing. Finally, it summarizes the research results, emphasizing the main contributions and outlining
prospects for future exploration in this area. The goal is to provide future researchers with a
comprehensive reference and guidance, and to promote further progress in refining the techniques used
in remote sensing data analysis.
2 METHODOLOGY
2.1 Dataset Description
The dataset, obtained from the Kaggle platform, is called "Planet: Understanding the Amazon from
Space" (Planet, 2017). It contains thousands of satellite images covering different areas of the
Amazon rainforest. Each image carries spectral information that includes visible and infrared
wavelengths. These data are mainly used to monitor environmental changes in the Amazon region,
especially for deforestation and land use change analysis. The images support a number of specific
applications, including land cover classification, ecosystem health assessment, and environmental
disaster monitoring. The high-resolution images in the dataset contain rich spatial and spectral
information, which makes feature extraction more difficult. CNNs enable researchers to extract
features automatically and perform classification tasks. By processing these data, researchers can
better understand the impact of human activities on the Amazon region and provide data support for
environmental protection policies.
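To make the notion of spectral information concrete, the following minimal sketch computes a hand-crafted spectral index of the kind that traditional methods rely on, the Normalized Difference Vegetation Index (NDVI). It assumes that a near-infrared band and a red band are available for each image; the function name `ndvi` and the reflectance values are hypothetical and are not taken from the dataset.

```python
def ndvi(nir, red, eps=1e-9):
    """Compute NDVI = (NIR - Red) / (NIR + Red) for two equally sized 2D bands."""
    return [
        [(n - r) / (n + r + eps) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]

# Hypothetical 2x2 reflectance patches with values in [0, 1] (illustrative only).
nir_band = [[0.60, 0.55], [0.20, 0.58]]
red_band = [[0.10, 0.12], [0.18, 0.09]]

index = ndvi(nir_band, red_band)

# Vegetation reflects strongly in the near-infrared, so vegetated pixels score high;
# the bottom-left pixel of this toy patch has similar NIR and red values and scores near zero.
for row in index:
    print([round(v, 2) for v in row])
```

A single index of this kind captures one fixed relationship between two bands, whereas a CNN learns its own combinations of bands and spatial context from the labelled images.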
2.2 Proposed Approach
This research uses the “Planet: Understanding the Amazon from Space” dataset from the Kaggle platform
and applies CNNs to feature extraction and remote sensing data analysis, with a focus on land cover
classification and environmental change detection. CNNs are particularly useful for processing complex
remote sensing data because of their strong feature extraction capability. Figure 1 illustrates the
overall workflow of this study, highlighting the essential stages: data processing, model training,
feature extraction, classification, and performance evaluation. The first step processes the data to
ensure that the input images are formatted appropriately for the CNN model. Once that foundation is
set, the CNN extracts meaningful features from the images. In the final stage, a classifier
categorizes and recognizes the features within the images, leading to the evaluation results that
inform this study's conclusions.
Figure 1: Research process (Picture credit: Original).
Figure 1 shows the entire research process, from data preprocessing to the evaluation of the final
classification results. It details how the Kaggle dataset is handled at each stage and provides clear
steps for the execution of the subsequent experiments.
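The preprocessing stage of this workflow could be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify its preprocessing, so per-band min-max scaling is assumed here, and the names `normalize_band` and `preprocess` as well as the raw values are hypothetical.

```python
def normalize_band(band):
    """Min-max scale one 2D band to the range [0, 1]."""
    values = [v for row in band for v in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant band carries no contrast; map it to zeros to avoid dividing by zero.
        return [[0.0 for _ in row] for row in band]
    return [[(v - lo) / span for v in row] for row in band]

def preprocess(image):
    """Normalize every band of an image given as a list of 2D bands."""
    return [normalize_band(band) for band in image]

# Hypothetical 2-band, 2x2 image with raw sensor counts on different scales.
raw = [
    [[100, 300], [200, 500]],
    [[1000, 1000], [3000, 2000]],
]
ready = preprocess(raw)
print(ready)
```

Scaling each band separately keeps bands with large raw values from dominating the early convolutional layers.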
2.2.1 Introduction to Basic Technologies
As the central technology of this study, CNNs play a significant role in processing image data: they
capture the spatial characteristics of images accurately and efficiently and avoid the shortcomings of
traditional manual feature extraction methods (Hu et al., 2015). The core structure of a CNN model
consists of convolutional layers, pooling layers, and fully connected layers, which together make the
model both powerful and effective. A convolutional layer captures local features in the image by
applying different convolution kernels and maps these features to higher-level representations. A
pooling layer reduces the dimensionality of the feature maps while preserving their key
characteristics, thus reducing computational complexity. The fully connected layer then classifies the
features extracted earlier.
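The convolution and pooling operations described above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in the experiments; the functions `conv2d` and `max_pool2d`, the toy image, and the hand-set edge kernel are all hypothetical, and in a trained CNN the kernel values would be learned.

```python
def conv2d(image, kernel):
    """Stride-1 cross-correlation without padding (the 'convolution' used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Toy 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-set kernel that responds to dark-to-bright transitions from left to right.
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, edge_kernel)  # 3x3 map, strongest where the edge lies
pooled = max_pool2d(feature_map, size=2)  # 1x1 map keeping the strongest response
print(feature_map)
print(pooled)
```

The pooled map is smaller than the feature map but still records that an edge was found, which is the sense in which pooling preserves key characteristics while reducing dimensionality.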
CNNs can be trained to detect and extract complex features in remote sensing images autonomously,
which improves the effectiveness of land cover classification and environmental monitoring tasks. In
this experiment, the CNN model is employed for automatic feature extraction and classification of the
remote sensing images in the "Planet: Understanding the Amazon from Space" dataset. This approach
streamlines the analysis and improves the ability to uncover valuable insights about the Amazon and
its intricate ecosystem. The implementation adopted in the experiment is as follows. First, the input
image is processed by multiple convolutional layers to extract multi-scale features. Then a pooling
layer compresses the dimensions of the feature map, and a fully connected layer classifies the
extracted features and produces the prediction.
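The final classification step could be sketched as below. The sketch assumes, for simplicity, a single label per image and a softmax output; the paper does not state the output layer it uses. The class names, weights, and pooled feature values are hypothetical placeholders for quantities that would be learned or computed upstream.

```python
import math

def flatten(feature_maps):
    """Turn a list of 2D feature maps into one flat feature vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]

def dense(vector, weights, biases):
    """Fully connected layer: one weighted sum (plus bias) per output class."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pooled feature maps (two channels, each 1x2).
pooled_maps = [[[2.0, 0.0]], [[0.0, 1.0]]]

# Hypothetical class names and weights (one weight row per class).
classes = ["forest", "cleared", "water"]
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
]
biases = [0.0, 0.0, 0.0]

features = flatten(pooled_maps)
probs = softmax(dense(features, weights, biases))
predicted = classes[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

For images that can carry several labels at once, an independent sigmoid per class would replace the softmax in this sketch.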
2.2.2 Mainstream Technology Model
The study focuses on improving the performance of CNNs. It introduces an attention mechanism, which
helps the model focus on key areas of the image and thereby enhances classification accuracy. Remote
sensing images often include a lot of irrelevant information, so the attention mechanism is needed to
concentrate on useful information by assigning different weights to each input feature.
In this set of experiments, the attention mechanism is incorporated into the convolutional layers of
the CNN model. The process is as follows: first, weight vectors are generated through a global average
pooling operation; these weight vectors are then applied to the feature map to enhance the feature
representation of key regions. This mechanism is particularly suitable for multispectral image
analysis, where it can help the model identify the most important feature regions in different bands
for classification tasks.
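The two steps just described, pooling each channel to a single value and then reweighting the feature map, can be sketched as follows. This is a simplified illustration and not the exact module used in the experiments: it assumes one sigmoid gate per channel with hypothetical gate parameters, whereas squeeze-and-excitation blocks (Hu et al., 2018) pass the pooled vector through small fully connected layers that mix channels before gating.

```python
import math

def global_average_pool(feature_maps):
    """Reduce each channel (a 2D map) to its mean value."""
    return [
        sum(v for row in fmap for v in row) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feature_maps, gate_weights, gate_biases):
    """Scale every channel by a gate in (0, 1) computed from its pooled value."""
    pooled = global_average_pool(feature_maps)
    gates = [
        sigmoid(w * p + b)
        for w, p, b in zip(gate_weights, pooled, gate_biases)
    ]
    reweighted = [
        [[g * v for v in row] for row in fmap]
        for g, fmap in zip(gates, feature_maps)
    ]
    return reweighted, gates

# Hypothetical feature maps: channel 0 responds strongly, channel 1 weakly.
maps = [
    [[4.0, 4.0], [4.0, 4.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
# Hypothetical gate parameters; in a trained model these would be learned.
reweighted, gates = channel_attention(maps, gate_weights=[1.0, 1.0], gate_biases=[-2.0, -2.0])
print([round(g, 3) for g in gates])
```

In this toy case the strongly responding channel receives a gate close to 0.88 and the weak one a gate close to 0.27, so the informative channel dominates the features passed to later layers.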
Figure 2: The basic structure of CNN with attention
mechanism (Picture credit: Original).
Figure 2 shows the basic structure of the convolutional neural network combined with the attention
mechanism, providing an intuitive reference for the technical implementation in the experiment.
2.2.3 Multi-Scale Feature Fusion
Remote sensing data usually contain rich multi-scale information, and different scales represent
different spatial characteristics. This study therefore also adopts multi-scale feature fusion to
better capture both the detailed and the global information in the image (Marmanis et al., 2016).
In this experiment, multi-scale feature fusion is realized by extracting image features with
convolution kernels of different sizes (see Figure 3). Smaller convolution kernels capture detailed
information, while larger ones help capture global context. By combining convolution operations at
varying scales, the model can extract useful features at multiple scales, which enhances the accuracy
of the classification task. This technique is especially advantageous when dealing with a complex
natural environment like the Amazon.
Figure 3: Structure of multi-scale feature fusion (Picture
credit: Original).
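A minimal sketch of multi-scale feature extraction with kernels of different sizes is given below. It assumes that the branches are fused by stacking their outputs as channels, which the paper does not state explicitly, and it uses fixed averaging kernels of sizes 1, 3, and 5 in place of learned ones. All function names are hypothetical.

```python
def conv2d_same(image, kernel):
    """Stride-1 cross-correlation with zero padding so the output matches the input size.

    Assumes an odd-sized square kernel.
    """
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:  # positions outside the image count as zero
                        acc += image[y][x] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out

def mean_kernel(size):
    """Averaging kernel, used here as a stand-in for a learned kernel."""
    return [[1.0 / (size * size)] * size for _ in range(size)]

def multi_scale_features(image, sizes=(1, 3, 5)):
    """Apply one kernel per scale and stack the results as channels."""
    return [conv2d_same(image, mean_kernel(s)) for s in sizes]

# Toy 5x5 image with a single bright pixel in the centre.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 225.0

fused = multi_scale_features(image)
for size, fmap in zip((1, 3, 5), fused):
    print(size, round(fmap[2][2], 2), round(fmap[0][0], 2))
```

The 1x1 branch keeps the bright pixel sharp at the centre, while only the 5x5 branch registers it from the corner of the image, which illustrates how small kernels retain detail and large kernels contribute context.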
2.2.4 Residual Network
To further improve the performance of the model as the network deepens, a residual network (ResNet)
structure is introduced in this study (see Figure 4). By introducing skip connections, residual
networks effectively mitigate the vanishing gradient problem in deep networks (Plaza et al., 2009).
The residual network allows the model to retain the original input information during feature
extraction, thereby reducing information loss and improving the model's performance in classification
tasks.
In this experiment, the residual network is implemented as follows: in each convolutional layer, the
output of the convolution is added to the original input in order to preserve the low-level feature
information. This structure allows the model to cope better with complex patterns in remote sensing
images, particularly when the images contain multiple layers of information. The introduction of the
residual network significantly improves the robustness and classification accuracy of the model
(Nalepa and Kawulok, 2019).
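The addition of the layer output to its input can be sketched in a few lines. The example works on a flat feature vector for brevity, and applying a ReLU after the addition is an assumption borrowed from the common ResNet design rather than a detail given in this paper. The `transform` argument is a hypothetical stand-in for the convolution.

```python
def relu(x):
    return max(0.0, x)

def residual_block(x, transform):
    """Return relu(F(x) + x), where F is the layer's transformation."""
    fx = transform(x)
    return [relu(f + xi) for f, xi in zip(fx, x)]

features = [1.0, 2.0]

# If the transformation contributes nothing, the input still passes through unchanged.
passthrough = residual_block(features, lambda v: [0.0] * len(v))

# Otherwise the block outputs the input plus the learned correction.
refined = residual_block(features, lambda v: [0.5 * a for a in v])

print(passthrough, refined)
```

Because the input is carried forward unchanged by the skip connection, gradients also have a direct path backward through the addition, which is why stacking many such blocks remains trainable.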
Through the combination of the above techniques,
this study constructs an efficient convolutional neural
network framework, which can automatically learn
complex features in remote sensing data, and
provides an effective tool for environmental change
monitoring and land use analysis.
Figure 4: Structure of residual network (Picture credit:
Original).
3 RESULTS AND DISCUSSION
3.1 Result Presentation
As shown in Table 1, the models differ in their performance on land cover classification and
environmental change detection. The single convolution kernel CNN reaches an accuracy of 85.2% and a
recall rate of 82.5%, but it performs poorly in complex scenes. The multi-scale feature fusion CNN
raises the accuracy to 90.1% and the recall rate to 88.7% by combining convolution kernels of
different sizes, at the cost of a longer computation time. After the attention mechanism is added, the
accuracy of the model improves further to 92.3%, especially in the identification of deforested areas.
The model combined with the residual network achieves the highest accuracy of 93.5% and effectively
prevents the vanishing gradient problem in the deep network, but its calculation time increases to 180
seconds.
Table 1: Performance of different models.
Model                                    Accuracy  Recall rate  Precision  F1-score  Calculation time (s)
Single convolution kernel CNN            85.20%    82.50%       83.00%     82.7      120
Multi-scale fusion CNN                   90.10%    88.70%       89.30%     89        150
CNN with attention mechanism             92.30%    91.20%       91.20%     90.8      160
CNN with attention mechanism and ResNet  93.50%    92.50%       92.50%     92.1      180
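The trade-off between accuracy and calculation time in Table 1 can be made explicit with a short calculation. The sketch below uses only the accuracy and time columns of the table; the function name `marginal_gains` is hypothetical.

```python
# (model, accuracy in %, calculation time in s), copied from Table 1.
results = [
    ("Single convolution kernel CNN", 85.2, 120),
    ("Multi-scale fusion CNN", 90.1, 150),
    ("CNN with attention mechanism", 92.3, 160),
    ("CNN with attention mechanism and ResNet", 93.5, 180),
]

def marginal_gains(rows):
    """Accuracy gain and extra time of each model relative to the previous row."""
    gains = []
    for (_, acc_prev, t_prev), (name, acc, t) in zip(rows, rows[1:]):
        gains.append((name, round(acc - acc_prev, 1), t - t_prev))
    return gains

for name, d_acc, d_time in marginal_gains(results):
    print(f"{name}: +{d_acc} points of accuracy for +{d_time} s")
```

Each added component yields a smaller accuracy gain than the one before it (4.9, 2.2, and 1.2 percentage points), while the total calculation time rises from 120 s to 180 s.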
3.2 Discussion
Each model has its advantages and disadvantages. A CNN removes the complexity of manual feature design
through automatic feature extraction, but a standard convolution kernel has difficulty capturing
multi-scale information. Multi-scale fusion enhances the ability of the model to capture information
at different scales but increases the computational cost. The attention mechanism further improves
classification and is especially suitable for complex scene analysis. The residual network addresses
the vanishing gradient problem of deep networks through skip connections, but its computational
complexity is high. Further research may seek to implement lightweight models to reduce the
computational overhead, or to enhance the generalization ability of models through the fusion of
multi-source data (Howard et al., 2017). Optimizing attention mechanisms is another possible
direction. These technologies have the potential to be widely used in tasks such as environmental
monitoring and disaster warning, but the problem of computational complexity must first be solved
(Brock et al., 2021).
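As an illustration of how a lightweight model reduces computational overhead, the following sketch compares the number of weights in a standard convolution with that of the depthwise separable convolution used in MobileNets (Howard et al., 2017). The layer sizes are hypothetical and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3x3 kernels, 64 input channels, 128 output channels.
k, c_in, c_out = 3, 64, 128
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print(standard, separable, round(separable / standard, 3))
```

For this layer the separable form needs roughly 12% of the weights of the standard convolution, a ratio equal to 1/c_out + 1/k².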
4 CONCLUSIONS
This study presents the application of CNNs to feature learning and evaluation of remote sensing data.
It places particular emphasis on improving the accuracy of land cover classification and environmental
change detection using high-resolution satellite images. It also proposes a CNN-based automatic
feature extraction method that addresses the limitations of traditional manual approaches. The model
pipeline includes key steps such as data preprocessing and multi-scale feature fusion, and it
incorporates attention mechanisms and residual networks. A series of comprehensive experiments was
conducted to assess the performance of the proposed method. The results show a significant improvement
in classification accuracy, reaching up to 93.5%, with particularly strong performance in deforestation
detection. Future research will focus on optimizing lightweight models to reduce computational
complexity and on integrating multi-source data to enhance the model's generalization capabilities.
Additionally, further optimization of attention mechanisms will be explored to enable more precise
image analysis, thereby improving the efficiency of environmental monitoring tasks.
REFERENCES
Brock, A., De, S., Smith, S. L., & Simonyan, K., 2021.
High-Performance Large-Scale Image Recognition
Without Normalization. arXiv preprint arXiv:2102.06171.
Chen, Y., Jiang, H., Li, C., Jia, X., & Ghamisi, P., 2016.
Deep Feature Extraction and Classification of
Hyperspectral Images Based on Convolutional Neural
Networks. IEEE Transactions on Geoscience and
Remote Sensing, 54(10), 6232–6251.
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang,
W., Weyand, T., Andreetto, M., & Adam, H., 2017.
MobileNets: Efficient Convolutional Neural Networks
for Mobile Vision Applications. arXiv preprint
arXiv:1704.04861.
Hu, J., Shen, L., & Sun, G., 2018. Squeeze-and-Excitation
Networks. IEEE/CVF Conference on Computer Vision
and Pattern Recognition, 7132–7141.
Hu, W., Huang, Y., Wei, L., Zhang, F., & Li, H., 2015.
Deep Convolutional Neural Networks for
Hyperspectral Image Classification. Journal of Sensors,
2015(1), 258619.
Ma, L., Crawford, M. M., & Tian, J., 2010. Local Manifold
Learning-Based k-Nearest-Neighbor for Hyperspectral
Image Classification. IEEE Transactions on
Geoscience and Remote Sensing, 48(11), 4099–4109.
Marmanis, D., Datcu, M., Esch, T., & Stilla, U., 2016. Deep
Learning Earth Observation Classification Using
ImageNet Pretrained Networks. IEEE Geoscience and
Remote Sensing Letters, 13(1), 105–109.
Nalepa, J., & Kawulok, M., 2019. Selecting training sets for
support vector machines: A review. Artificial
Intelligence Review, 52(2), 857–900.
Planet, 2017. Planet: Understanding the Amazon from
Space. Retrieved in 2024 from:
https://www.kaggle.com/competitions/planet-understanding-the-amazon-from-space
Plaza, A., Benediktsson, J. A., Boardman, J. W., Brazile, J.,
Bruzzone, L., Camps-Valls, G., Chanussot, J., Fauvel,
M., Gamba, P., Gualtieri, A., Marconcini, M., Tilton, J.
C., & Trianni, G., 2009. Recent advances in techniques
for hyperspectral image processing. Remote Sensing of
Environment, 113, S110–S122.
Tuia, D., Persello, C., & Bruzzone, L., 2016. Domain
Adaptation for the Classification of Remote Sensing
Data: An Overview of Recent Advances. IEEE
Geoscience and Remote Sensing Magazine, 4, 41–57.
Zhang, L., Zhang, L., & Du, B., 2016. Deep Learning for
Remote Sensing Data: A Technical Tutorial on the
State of the Art. IEEE Geoscience and Remote Sensing
Magazine, 4(2), 22–40.
Zhu, X.X., Tuia, D., Mou, L., Xia, G.S., Zhang, L., Xu, F.,
& Fraundorfer, F., 2017. Deep Learning in Remote
Sensing: A Comprehensive Review and List of
Resources. IEEE Geoscience and Remote Sensing
Magazine, 5(4), 8–36.