model can process large-scale graph data more quickly and effectively. Although the main framework and structure of the graph can be preserved in this way, the pooling operation is essentially a down-sampling process, so some information is inevitably lost. Moreover, early pooling layers did not weight the features of the graph by their importance, so important features could be discarded, which in turn disrupts the subsequent steps. The accuracy and performance of the model may suffer in particular when the graph data has a rich, complex and detailed structure. In addition, owing to the technical limitations of the time, early pooling operations often focused only on aggregating features within local regions and had a limited ability to capture the global structure of the graph. For tasks that require global information to make decisions, the results could be biased because the graph could not be processed and analysed as a whole.
For this reason, the model was optimized and ASAPooling was introduced to address the problem of information loss. Its adaptive structure learns a soft assignment matrix that allocates nodes to different clusters for pooling, which retains the structural and feature information of the graph more effectively and reduces information loss. The DiffPool method likewise preserves important nodes and their neighbourhoods through a clustering algorithm to limit the loss of information at important nodes, but the overall effect of ASAPooling is more accurate. To address the analysis of global structure, the EigenPooling algorithm was proposed: it uses the Laplacian matrix to pool the features that carry the global structural information of a graph, splits the graph into subgraphs in a principled way, and extracts features and learns representations of the global information through the analysis of each subgraph (Ranjan, Sanyal, Talukdar, 2022).
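To make the soft-assignment idea concrete, the following is a minimal sketch in plain PyTorch of how a learned assignment matrix coarsens a graph; the tensor sizes are illustrative, and the sketch is not the exact formulation of ASAPooling, DiffPool or EigenPooling.

```python
import torch

def soft_cluster_pool(x, adj, assign_logits):
    """Coarsen a graph with a learned soft assignment matrix.

    x:             [N, F] node feature matrix
    adj:           [N, N] dense adjacency matrix
    assign_logits: [N, K] unnormalised scores assigning N nodes to K clusters
    """
    s = torch.softmax(assign_logits, dim=-1)   # soft assignment S, rows sum to 1
    x_pooled = s.t() @ x                       # cluster features  S^T X  -> [K, F]
    adj_pooled = s.t() @ adj @ s               # cluster adjacency S^T A S -> [K, K]
    return x_pooled, adj_pooled

# Toy usage: pool a 6-node graph into 2 clusters.
x = torch.randn(6, 8)
adj = (torch.rand(6, 6) > 0.5).float()
assign_logits = torch.randn(6, 2)              # in practice produced by a GNN layer
x_p, adj_p = soft_cluster_pool(x, adj, assign_logits)
print(x_p.shape, adj_p.shape)                  # torch.Size([2, 8]) torch.Size([2, 2])
```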
4.2.3 Fully Connected Layer
The fully connected layer in a convolutional neural network is very similar to the hidden layer in a traditional feedforward neural network in both function and structure. The fully connected layers sit at the end of the hidden part of the convolutional neural network, and their signal flow is comparatively simple: they pass signals only to other fully connected layers. From the perspective of representation learning, the convolutional and pooling layers of the network are mainly responsible for extracting features from the input data. In contrast, the core function of the fully connected layer is to combine the features extracted by the preceding convolution and pooling layers to generate the output. In other words, the fully connected layer does not itself focus on feature extraction but on how to integrate the features that have already been extracted; after this series of operations, the output of the entire network is finally produced. For example, a 5×5×16 feature map has 5 pixels along each of its height and width and 16 channels. Global mean pooling processes each of the 16 channels separately: it is equivalent to mean pooling with a 5×5 window, a stride of 5 and no padding, and it returns a vector of 16 values, one per channel. When the fully connected layer receives the features extracted by the preceding convolution and pooling layers, it fuses the features of all nodes and takes the global information into account to provide a comprehensive feature representation for the final classification task (Alzubaidi, Zhang, Humaidi, 2021; Sun, Xue, Zhang, 2019).
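As an illustration of the example above, the following sketch (in PyTorch, with a hypothetical 10-class task) shows global mean pooling of a 5×5×16 feature map followed by a fully connected layer.

```python
import torch
import torch.nn as nn

# Sketch of the head described above, assuming a 5x5x16 feature map
# and a hypothetical 10-class classification task.
feature_map = torch.randn(1, 16, 5, 5)          # [batch, channels, height, width]

# Global mean pooling: one value per channel.
gap = nn.AdaptiveAvgPool2d(1)
pooled = gap(feature_map).flatten(1)            # -> [1, 16]

# The fully connected layer fuses the 16 channel features into class scores.
fc = nn.Linear(16, 10)
logits = fc(pooled)                             # -> [1, 10]

# Equivalent explicit form: 5x5 mean pooling, stride 5, no padding.
same = nn.AvgPool2d(kernel_size=5, stride=5)(feature_map).flatten(1)
assert torch.allclose(pooled, same)
```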
Because of its simple structure and broad applicability, this layer is easy to combine and integrate with other types of neural network layers or models, and it also integrates readily with other machine learning or deep learning models to extend the model's functionality and range of application. However, the fully connected layer has to receive a large amount of information from the convolutional and pooling layers, which consumes considerable time and resources during model training and inference. If too many fully connected layers are stacked, the number of parameters can become excessive and the fully connected part will overfit the training data, which degrades the learning performance of the entire convolutional network.
To reduce the number of parameters that need to be computed, Low-rank Approximation was proposed: the weight matrix of the fully connected layer is decomposed into the product of two low-rank matrices, so that far fewer parameters approximate the original weight matrix, thereby reducing both the computation and the parameter count of the model.
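A minimal sketch of this idea is shown below, with illustrative layer sizes and rank; the truncated SVD used here is only one of several ways to obtain the two low-rank factors.

```python
import torch

# Approximate the weight matrix W of a fully connected layer by the
# product of two thin matrices. Sizes and rank r are illustrative.
m, n, r = 512, 1024, 32
W = torch.randn(m, n)                     # original FC weights: m*n parameters

U, S, Vh = torch.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]                      # [m, r]
B = Vh[:r, :]                             # [r, n]
W_approx = A @ B                          # rank-r approximation of W

# Parameter count drops from m*n to r*(m + n).
print(m * n, r * (m + n))                 # 524288 vs 49152
print(torch.linalg.norm(W - W_approx))    # approximation error
```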
Others proposed the Sparse FC method, which builds the fully connected layer with sparse connections so that each neuron connects to only a subset of the neurons in the previous layer; this also reduces the number of parameters and the demand on resources. However, although the latter method is simple, it sacrifices accuracy to