Application of CNNs in Feature Learning for Remote Sensing Data: A Case Study on Land Cover Classification and Environmental Change Detection

Zhaoyi Li (ORCID: https://orcid.org/0009-0005-6999-786X)

College of Food, Agricultural and Natural Resource Sciences, University of Minnesota, Twin Cities, Minneapolis, U.S.A.

Keywords: Remote Sensing, Feature Extraction, Environmental Monitoring, Land Use Assessment.

Abstract: This study reviews the current state of training and feature analysis with convolutional neural networks (CNNs) on remote sensing data, with a focus on environmental monitoring and land use assessment. It uses high-resolution satellite imagery from the “Planet: Understanding the Amazon from Space” dataset to look for ways of raising the accuracy of land cover classification and environmental change detection. Building on the cited literature, the paper proposes an automatic CNN-based feature extraction method that overcomes the limitations of traditional manual methods. The method comprises data preprocessing, multi-scale feature fusion, classification, the integration of an attention mechanism, and further refinement through a residual network model. The experimental results show a significant increase in classification accuracy, which reaches 93.5%, particularly in detecting deforested areas. These results highlight the potential of the proposed approach for real-time environmental monitoring and land use planning and point the way for further research. Future work will focus on optimizing CNN models to reduce computational complexity and on exploring multi-source data fusion to improve the generalization and effectiveness of remote sensing applications.

1 INTRODUCTION

Convolutional neural networks (CNNs) are deep learning models designed to process data with a grid structure, such as satellite images. In recent years, CNNs have been widely applied to the evaluation of remote sensing data, which is typically high-dimensional and complex (Hu et al., 2015). Traditional methods tend to rely heavily on expert knowledge, struggle to fully express features, and adapt poorly to the diverse, non-linear nature of the data (Tuia et al., 2016). The emergence of CNNs offers new hope for automatic feature learning, which can effectively capture complex patterns in remote sensing data and yield more accurate classification and evaluation (Chen et al., 2016). Therefore, this study aims to comprehensively examine and assess the application and development of CNNs in the study of remote sensing data characteristics. It traces the transition from manual to automatic feature extraction and shows how this transition has changed the analysis of remote sensing data. In addition, the study seeks to fill a gap in systematic reviews in this field and to guide future research and practice.

Remote sensing data, such as satellite and drone images, are widely used in environmental monitoring, land use analysis, disaster assessment, and other fields. These high-resolution data are usually multispectral or hyperspectral and contain rich, high-precision spatial and spectral information. Traditional analysis of remote sensing data relies mainly on manual feature extraction, such as texture analysis, spectral indices, and shape features. Although these methods suit specific tasks and data sets, they adapt poorly to the diversity and complexity of the data. In the last few decades, with the advancement of deep learning and particularly the successful deployment of CNNs, researchers have begun to introduce them into remote sensing data analysis (Zhang et al., 2016). CNNs are effective at processing high-dimensional remote sensing data because of their powerful automatic feature extraction and learning ability. At present, most studies apply CNNs to remote sensing image classification, change detection, and target identification, where considerable advances and outstanding results have been achieved. For example, some studies improve the classification precision and computational efficiency of the model by refining the network structure and introducing an attention mechanism (Hu et al., 2018). Other studies combine multi-scale features and multi-source data to enhance the adaptability of CNNs to complex scenarios (Ma et al., 2010). In general, the use of CNNs in remote sensing data analysis has made considerable progress, but it still faces challenges such as the difficulty of data annotation and the heavy consumption of computing resources. Exploring more efficient and accurate deep learning models therefore remains a main goal of future research (Zhu et al., 2017).

Li, Z.
Application of CNNs in Feature Learning for Remote Sensing Data: A Case Study on Land Cover Classification and Environmental Change Detection.
DOI: 10.5220/0013517600004619
In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning (DAML 2024), pages 354-358
ISBN: 978-989-758-754-2

The principal aim of this study is to examine the application of CNNs in feature learning for remote sensing data and to investigate the core technologies and future directions in this domain. The study begins with an introduction to key concepts and provides basic knowledge about the use of CNNs in remote sensing data analysis. It then introduces the application scenarios and advantages of CNNs in this field. A comprehensive examination of the fundamental technologies underlying CNNs follows, covering network architectures, training methodologies, and enhancement strategies. After that, the performance of these key technologies is demonstrated and evaluated on remote sensing data classification and recognition tasks. By comparing different models, the study evaluates the merits and demerits of different CNN techniques and reveals the shortcomings and challenges of current research. Based on this analysis, the article discusses future directions for the development of CNNs in remote sensing. Finally, the paper summarizes the research results, emphasizes its main contributions, and outlines prospects for future exploration in this area. The goal is to provide future researchers with a comprehensive reference and guidance, and to promote further progress in refining these techniques for remote sensing data analysis.

2 METHODOLOGY

2.1 Dataset Description

The data set used in this study is "Planet: Understanding the Amazon from Space" from the Kaggle platform (Planet, 2017), which contains thousands of satellite images covering different areas of the Amazon rainforest. Each image carries spectral information, including visible and infrared wavelengths. These data are mainly used to monitor environmental changes in the Amazon region, especially for deforestation and land use change analysis. The images serve a number of specific applications, including land cover classification, ecosystem health assessment, and environmental disaster monitoring. The high-resolution images in the dataset contain rich spatial and spectral information, which makes feature extraction more difficult. CNNs enable researchers to extract features automatically and perform classification tasks. By processing these data, researchers can better understand the impact of human activities on the Amazon region and provide data support for environmental protection policies.

2.2 Proposed Approach

This research uses the “Planet: Understanding the Amazon from Space” dataset from the Kaggle platform and applies CNNs to feature extraction and remote sensing data analysis, with a focus on land cover classification and environmental change detection. CNNs are particularly useful for processing complex remote sensing data because of their advanced feature extraction capability. Figure 1 illustrates the overall workflow of this study, highlighting essential stages such as data processing, model training, feature extraction, classification, and performance evaluation. The first step processes the data to ensure that the input images are formatted appropriately for the CNN model. Once that foundation is set, the CNN extracts meaningful features from the images. In the final stage, a classifier categorizes and recognizes the features within the images, leading to the evaluation results on which the conclusions of this study rest.


Figure 1: Research process (Picture credit: Original).

Figure 1 shows the entire research process from data preprocessing to the evaluation of the final classification results. The process details how the Kaggle dataset is handled at each stage and provides clear steps for the execution of subsequent experiments.
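The paper does not specify how the input images are formatted for the CNN. As one plausible illustration of the data processing stage, the sketch below rescales each spectral band of an image to the range [0, 1]. The function names, the band layout, and the pixel values are assumptions made for this example and are not taken from the study.

```python
# Sketch only: per-band min-max scaling of one image before it enters a CNN.
# An image is a list of bands; each band is a list of rows of pixel values.

def scale_band(band):
    """Rescale one band to the range [0, 1]."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant band carries no contrast; map it to zeros.
        return [[0.0 for _ in row] for row in band]
    return [[(v - lo) / span for v in row] for row in band]


def preprocess(image):
    """Scale every band of a multi-band image independently."""
    return [scale_band(band) for band in image]


# Hypothetical 2-band, 2x2 image (for example a visible and an infrared band).
image = [
    [[0, 50], [100, 200]],
    [[1000, 1000], [3000, 5000]],
]
scaled = preprocess(image)
```

Scaling each band separately keeps bands with very different value ranges, such as visible and infrared, on a comparable footing.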

2.2.1 Introduction to Basic Technologies

As the central technology of this study, CNNs play a significant role in processing image data: they capture spatial characteristics in images accurately and efficiently and avoid the shortcomings of traditional manual feature extraction methods (Hu et al., 2015). The core structure of a CNN consists of convolutional layers, pooling layers, and fully connected layers, which make the model both powerful and effective. The convolutional layer captures local features in the image by applying different convolution kernels and maps these features to higher-level representations. The pooling layer reduces the dimensionality of the feature maps while preserving their key characteristics, thus reducing computational complexity. The fully connected layer then classifies the features extracted earlier.
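To make the roles of the convolutional and pooling layers concrete, the following minimal sketch applies one hand-written kernel to a toy single-band patch and then pools the result. The kernel, the patch, and the function names are illustrative assumptions; in a trained CNN the kernel values are learned rather than fixed.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation with stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [
        [
            max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
            for j in range(0, len(fmap[0]) - 1, 2)
        ]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Toy 5x5 patch: dark on the left, bright on the right.
patch = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds where brightness increases left to right.
edge_kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(patch, edge_kernel)   # 4x4 map of local edge responses
pooled = max_pool2x2(feature_map)          # 2x2 map keeping the strongest responses
```

The feature map is non-zero only where the dark-to-bright boundary lies, and pooling halves each spatial dimension while keeping that response.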

CNNs, as a learning model, can be trained to autonomously detect and extract complex features in remote sensing images and thereby improve the effectiveness of land cover classification and environmental monitoring tasks. In this experiment, the CNN model is employed for automatic feature extraction and classification of the remote sensing images in the "Planet: Understanding the Amazon from Space" dataset. This method streamlines the analysis and improves the ability to uncover valuable insights about the Amazon and its intricate ecosystem. The implementation adopted in the experiment is as follows: first, the input image is processed by multiple convolutional layers to extract multi-scale features. Then, a pooling layer compresses the dimensions of the feature maps, and a fully connected layer classifies the extracted features and produces the prediction.
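The last step of this pipeline, in which a fully connected layer turns pooled feature maps into a prediction, can be sketched as follows. The weights, class names, and the use of a softmax output are assumptions for illustration; the paper does not state its output layer, and a multi-label setup would typically use one sigmoid per class instead.

```python
import math


def flatten(feature_maps):
    """Concatenate all feature maps into one feature vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Convert class scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pooled feature maps (one channel, 2x2) and hand-picked weights.
feature_maps = [[[2.0, 0.0], [2.0, 0.0]]]
classes = ["forest", "cleared"]            # illustrative labels only
weights = [[0.5, 0.0, 0.5, 0.0],
           [0.0, 0.5, 0.0, 0.5]]
biases = [0.0, 0.0]

probs = softmax(dense(flatten(feature_maps), weights, biases))
predicted = classes[probs.index(max(probs))]
```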

2.2.2 Mainstream Technology Model

The study focused on improving the performance of CNNs. It introduced an attention mechanism, which helps the model focus on key areas in the image and thereby enhances classification accuracy. Remote sensing images often include a lot of irrelevant information, so the attention mechanism is needed to concentrate on useful information by assigning different weights to each input feature.

In this set of experiments, the attention mechanism is incorporated into the convolutional layers of the CNN model. The process is as follows: first, weight vectors are generated through a global average pooling operation; these weights are then applied to the feature maps to enhance the feature representation of key regions. This mechanism is particularly suitable for multispectral image analysis, where it can help the model identify the most important feature regions in different bands for classification tasks.
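The two steps described above, pooling each feature map to a single value and then reweighting the maps, can be sketched as a simplified channel attention gate. This is an assumption-laden illustration rather than the architecture used in the study: the gate here is a single linear layer followed by a sigmoid, whereas the squeeze-and-excitation block of Hu et al. (2018) uses two fully connected layers. All weights and values are hypothetical.

```python
import math


def global_average_pool(feature_maps):
    """Squeeze step: reduce each channel to its mean value."""
    return [
        sum(v for row in fmap for v in row) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(feature_maps, gate_weights, gate_biases):
    """Compute one weight in (0, 1) per channel and rescale that channel by it."""
    squeezed = global_average_pool(feature_maps)
    weights = [
        sigmoid(sum(w * s for w, s in zip(row, squeezed)) + b)
        for row, b in zip(gate_weights, gate_biases)
    ]
    reweighted = [
        [[w * v for v in row] for row in fmap]
        for fmap, w in zip(feature_maps, weights)
    ]
    return weights, reweighted


# Two hypothetical channels: a strongly activated one and a weak one.
feature_maps = [
    [[4.0, 4.0], [4.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]   # illustrative, normally learned
gate_biases = [-2.0, -2.0]

weights, reweighted = channel_attention(feature_maps, gate_weights, gate_biases)
```

With these toy values the strongly activated channel receives the larger weight, so its features dominate what later layers see.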

Figure 2: The basic structure of CNN with attention mechanism (Picture credit: Original).

Figure 2 exhibits the basic structure of the

convolutional neural network combined with the

attention mechanism, which provides an intuitive

reference for the technical implementation in the

experiment.

2.2.3 Multi-Scale Feature Fusion

Remote sensing data usually contain rich multi-scale information, and different scales represent different spatial characteristics. Therefore, this study also adopts multi-scale feature fusion to better capture both the detailed and the global information in the image (Marmanis et al., 2016).


In this experiment, multi-scale feature fusion is realized by extracting image features with convolution kernels of different sizes (see Figure 3). Smaller convolution kernels capture detailed information, while larger ones help capture global context. By combining convolution operations at varying scales, the model can extract useful features at multiple scales, thereby enhancing the accuracy of the classification task. This technique is especially advantageous when dealing with a complex natural environment like the Amazon.
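The effect of kernel size can be illustrated with a small sketch that runs kernels of two sizes over the same image and stacks the outputs as separate channels. The averaging kernels stand in for learned kernels, and the sizes 3 and 5, the function names, and the test image are assumptions; the paper does not list the kernel sizes it used.

```python
def conv2d_same(image, kernel):
    """Stride-1 cross-correlation with zero padding; output size equals input size.

    The kernel is assumed to be square with an odd side length.
    """
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    y, x = i + a - pad, j + b - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


def mean_kernel(k):
    """k x k averaging kernel, used here as a stand-in for a learned kernel."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


def multi_scale_features(image, kernel_sizes=(3, 5)):
    """Apply one kernel per scale and stack the results along the channel axis."""
    return [conv2d_same(image, mean_kernel(k)) for k in kernel_sizes]


# 5x5 image with a single bright pixel in the centre.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 9.0

features = multi_scale_features(image)     # two channels, each 5x5
small_scale, large_scale = features
```

At the corner pixel the 3x3 channel is zero while the 5x5 channel still registers the bright centre, which is the sense in which larger kernels carry wider context.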

Figure 3: Structure of multi-scale feature fusion (Picture credit: Original).

2.2.4 Residual Network

To further improve the performance of the model as the network deepens, the residual network (ResNet) structure is introduced in this study (see Figure 4). By introducing skip connections, residual networks effectively address the vanishing gradient problem in deep networks (Plaza et al., 2009). The residual network allows the model to retain the original input information during feature extraction, thereby reducing information loss and improving the model's performance in classification tasks.

In this experiment, the residual network is implemented as follows: in each convolutional layer, the output of the convolution is added to the original input in order to preserve the low-level feature information. This structure enables the model to better cope with complex patterns in remote sensing images, particularly when the images contain multiple layers of information. The introduction of the residual network significantly improves the robustness and classification accuracy of the model (Nalepa and Kawulok, 2019).
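The addition of the layer output to its input can be written as y = ReLU(F(x) + x), where F stands for the convolutional transform. The sketch below uses two trivial stand-ins for F to show the behaviour of the skip connection. The placement of the ReLU after the addition and all function names are assumptions for illustration, not details confirmed by the paper.

```python
def relu_map(fmap):
    """Element-wise ReLU on a 2-D feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]


def residual_block(x, transform):
    """Return relu(transform(x) + x); transform must preserve the shape of x."""
    fx = transform(x)
    summed = [
        [a + b for a, b in zip(row_f, row_x)]
        for row_f, row_x in zip(fx, x)
    ]
    return relu_map(summed)


def zero_transform(fmap):
    """Stand-in for a layer that has learned nothing (all-zero output)."""
    return [[0.0 for _ in row] for row in fmap]


def halve(fmap):
    """Stand-in for a learned transform; a real block would use convolutions."""
    return [[0.5 * v for v in row] for row in fmap]


x = [[1.0, 2.0], [3.0, 4.0]]
passed_through = residual_block(x, zero_transform)  # input survives unchanged
refined = residual_block(x, halve)                  # input plus learned residual
```

Even when the transform contributes nothing, the input passes through intact, which is why stacking such blocks does not erase low-level information.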

Through the combination of the above techniques,

this study constructs an efficient convolutional neural

network framework, which can automatically learn

complex features in remote sensing data, and

provides an effective tool for environmental change

monitoring and land use analysis.

Figure 4: Structure of residual network (Picture credit: Original).

3 RESULTS AND DISCUSSION

3.1 Result Presentation

As shown in Table 1, the models differ in their performance on land cover classification and environmental change detection. The single convolution kernel CNN reaches an accuracy of 85.2% and a recall rate of 82.5%, but performs weakly in complex scenes. The multi-scale feature fusion CNN raises the accuracy to 90.1% and the recall rate to 88.7% by combining convolution kernels of different sizes, at the cost of a longer computation time (150 seconds versus 120 seconds). After the attention mechanism is added, the accuracy improves further to 92.3%, especially in the identification of deforestation areas. The model combined with the residual network achieves the highest accuracy of 93.5% and effectively prevents the vanishing gradient problem in the deep network, but the computation time increases to 180 seconds.

Table 1: Performance of different models.

Model                                     Accuracy   Recall rate   Precision   F1-score   Calculation time (s)
Single convolution kernel CNN             85.20%     82.50%        83.00%      82.7       120
Multi-scale fusion CNN                    90.10%     88.70%        89.30%      89         150
CNN with attention mechanism              92.30%     91.20%        91.20%      90.8       160
CNN with attention mechanism and ResNet   93.50%     92.50%        92.50%      92.1       180
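For readers who want to relate the columns of Table 1 to one another, the F1-score is the harmonic mean of precision and recall. The short sketch below applies that standard formula to the printed precision and recall of the first two rows of Table 1. Because it starts from rounded percentages, it is only a consistency check and not a reproduction of the experiments.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) as printed in the first two rows of Table 1.
rows = {
    "Single convolution kernel CNN": (83.0, 82.5),
    "Multi-scale fusion CNN": (89.3, 88.7),
}

f1 = {name: round(f1_score(p, r), 1) for name, (p, r) in rows.items()}
```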


3.2 Discussion

Each model has its advantages and disadvantages. A CNN removes the complexity of manual feature design through automatic feature extraction, but a standard convolution kernel has difficulty capturing multi-scale information. Multi-scale fusion enhances the ability of the model to capture information at different scales, but increases the computational cost. The attention mechanism further improves the classification results and is especially suitable for complex scene analysis. The residual network solves the vanishing gradient problem of deep networks through skip connections, but its computational complexity is high. Further research may implement lightweight models to reduce the computational overhead, or enhance the generalization ability of models through the fusion of multi-source data (Howard et al., 2017). Optimizing attention mechanisms is another possible direction. These technologies have the potential to be widely used in tasks such as environmental monitoring and disaster warning, but the problem of computational complexity must be solved first (Brock et al., 2021).
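The trade-off between accuracy and computation discussed here can be quantified directly from the accuracy and calculation time columns of Table 1. The sketch below expresses each model's accuracy gain in percentage points and its extra computation time in percent, both relative to the single convolution kernel CNN. The choice of baseline and the variable names are assumptions made for this illustration.

```python
# (model, accuracy in %, calculation time in s) as reported in Table 1.
results = [
    ("Single convolution kernel CNN", 85.2, 120),
    ("Multi-scale fusion CNN", 90.1, 150),
    ("CNN with attention mechanism", 92.3, 160),
    ("CNN with attention mechanism and ResNet", 93.5, 180),
]

_, base_acc, base_time = results[0]

# For each later model: accuracy gain (points) and extra time (%) over the baseline.
tradeoffs = [
    (name,
     round(acc - base_acc, 1),
     round((time - base_time) / base_time * 100, 1))
    for name, acc, time in results[1:]
]

for name, gain, extra_time in tradeoffs:
    print(f"{name}: +{gain} points accuracy, +{extra_time}% time")
```

On these figures, the full model gains 8.3 percentage points of accuracy over the baseline for 50% more computation time.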

4 CONCLUSIONS

This study presents the application of CNNs to feature learning and evaluation of remote sensing data. It places particular emphasis on improving the accuracy of land cover classification and environmental change detection using high-resolution satellite images. It also proposes a CNN-based automatic feature extraction method that addresses the limitations of traditional manual approaches. The model pipeline includes key steps such as data preprocessing and multi-scale feature fusion, and incorporates attention mechanisms and residual networks. A series of experiments was conducted to assess the performance of the proposed method. They show a significant improvement in classification accuracy, with results reaching up to 93.5% and particularly strong performance in deforestation detection. Future research will focus on optimizing lightweight models to reduce computational complexity and on integrating multi-source data to enhance the model's generalization capabilities. Additionally, further optimization of attention mechanisms will be explored to enable more precise image analysis, thereby improving the efficiency of environmental monitoring tasks.

REFERENCES

Brock, A., De, S., Smith, S. L., & Simonyan, K., 2021. High-Performance Large-Scale Image Recognition Without Normalization. arXiv preprint arXiv:2102.06171.

Chen, Y., Jiang, H., Li, C., Jia, X., & Ghamisi, P., 2016.

Deep Feature Extraction and Classification of

Hyperspectral Images Based on Convolutional Neural

Networks. IEEE Transactions on Geoscience and

Remote Sensing, 54(10), 6232–6251.

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., & Adam, H., 2017. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv preprint arXiv:1704.04861.

Hu, J., Shen, L., & Sun, G., 2018. Squeeze-and-Excitation

Networks. IEEE/CVF Conference on Computer Vision

and Pattern Recognition, 7132–7141.

Hu, W., Huang, Y., Wei, L., Zhang, F., & Li, H., 2015.

Deep Convolutional Neural Networks for

Hyperspectral Image Classification. Journal of Sensors,

2015(1), 258619.

Ma, L., Crawford, M. M., & Tian, J., 2010. Local Manifold Learning-Based k-Nearest-Neighbor for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing, 48(11), 4099–4109.

Marmanis, D., Datcu, M., Esch, T., & Stilla, U., 2016. Deep

Learning Earth Observation Classification Using

ImageNet Pretrained Networks. IEEE Geoscience and

Remote Sensing Letters, 13(1), 105–109.

Nalepa, J., & Kawulok, M., 2019. Selecting training sets for

support vector machines: A review. Artificial

Intelligence Review, 52(2), 857–900.

Planet, 2017. Planet: Understanding the Amazon from Space. Retrieved in 2024 from: https://www.kaggle.com/competitions/planet-understanding-the-amazon-from-space

Plaza, A., Benediktsson, J. A., Boardman, J. W., Brazile, J.,

Bruzzone, L., Camps-Valls, G., Chanussot, J., Fauvel,

M., Gamba, P., Gualtieri, A., Marconcini, M., Tilton, J.

C., & Trianni, G., 2009. Recent advances in techniques

for hyperspectral image processing. Remote Sensing of

Environment, 113, S110–S122.

Tuia, D., Persello, C., & Bruzzone, L., 2016. Domain

Adaptation for the Classification of Remote Sensing

Data: An Overview of Recent Advances. IEEE

Geoscience and Remote Sensing Magazine, 4, 41–57.

Zhang, L., Zhang, L., & Du, B., 2016. Deep Learning for

Remote Sensing Data: A Technical Tutorial on the

State of the Art. IEEE Geoscience and Remote Sensing

Magazine, 4(2), 22–40.

Zhu, X.X., Tuia, D., Mou, L., Xia, G.S., Zhang, L., Xu, F.,

& Fraundorfer, F., 2017. Deep Learning in Remote

Sensing: A Comprehensive Review and List of

Resources. IEEE Geoscience and Remote Sensing

Magazine, 5(4), 8–36.
