2 METHOD AND DATA
2.1 ANN
ANN is an algorithm for distributed, parallel information processing inspired by the structure and function of biological neural networks. It mainly consists of an input layer, one or more hidden layers, and an output layer. Adjacent layers are connected through neurons, which transform the signals passed between them. Many definitions of a neuron exist; this paper employs the most commonly used one, the McCulloch-Pitts model. Under this definition, each neuron in a layer assigns a specific weight to each neuron in the preceding layer. As signals are transmitted between layers, the weighted sum of the signals from the preceding layer is transformed nonlinearly by an activation function, which yields the neuron's output.
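In symbols (our notation, stating the standard McCulloch-Pitts formulation): with incoming signals x_1, ..., x_n, weights w_1, ..., w_n, bias b, and activation function \varphi, a neuron computes

    y = \varphi\left( \sum_{i=1}^{n} w_i x_i + b \right).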
ANN is an algorithm applicable to credit card fraud detection tasks. Rizki, Surjandari, and Wayasti (2017) applied ANN and SVM to detect financial fraud in Indonesian listed companies; their results show that ANN achieves 90.97% precision on data sets without feature selection, higher than that of SVM. Lin, Chiu, Huang, and Yen (2015) applied ANN, logistic regression, and CART to identify fraud in financial statements; ANN outperformed all the other algorithms, approaching 92.8% precision on the test set. Sahin and Duman (2011) found that ANN performs better than logistic regression. In conclusion, ANN performs well in financial fraud detection, indicating that it is a viable option for CCFD. The ANN training process is illustrated in Figure 1.

Figure 1: Flowchart of ANN Training.
The five procedures of ANN training are forward propagation, error calculation, backward propagation, gradient descent, and iterative updating. To begin with, a prediction is obtained from the ANN by feeding in a set of input data. After that, the loss function is evaluated from the predicted and actual values. The third step is backward propagation, which propagates the gradient back through the layers to obtain the partial derivatives of the loss function with respect to the weights and biases of each layer. Then, an optimisation algorithm such as gradient descent uses these gradients to modify the weights and biases so as to minimise the loss function. The process is repeated until the termination condition is met.
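As an illustrative sketch of these five steps (not the configuration used in this paper; the network size, learning rate, and synthetic data below are placeholder assumptions), the training loop can be written in PyTorch as follows:

    import torch
    import torch.nn as nn

    # Placeholder network and data: 30 input features, binary fraud label.
    model = nn.Sequential(nn.Linear(30, 16), nn.ReLU(), nn.Linear(16, 1))
    loss_fn = nn.BCEWithLogitsLoss()
    optimiser = torch.optim.SGD(model.parameters(), lr=0.01)

    X = torch.randn(256, 30)                   # assumed feature matrix
    y = torch.randint(0, 2, (256, 1)).float()  # assumed 0/1 labels

    for epoch in range(100):                   # 5. iterative update
        pred = model(X)                        # 1. forward propagation
        loss = loss_fn(pred, y)                # 2. error (loss) calculation
        optimiser.zero_grad()
        loss.backward()                        # 3. backward propagation
        optimiser.step()                       # 4. gradient-descent step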
2.2 XGBoost
Chen and Guestrin developed the XGBoost algorithm
in 2016. XGBoost is an ensemble learning algorithm
based on boosting: multiple decision trees are ensembled to construct a strong learner and reduce error. During training, each successive weak learner is fitted to the residual errors of the current model so as to minimise the overall error. Once the error falls below a specified threshold, the outputs of all the weak learners are summed to give the final prediction. Compared with the traditional Gradient Boosting Decision Tree (GBDT) algorithm, XGBoost improves significantly in many respects, especially in the objective function. The objective function of GBDT usually contains only the loss function, approximated by a first-order Taylor expansion. XGBoost improves the objective function in two ways. First, regularisation terms are added as a penalty on model complexity to prevent overfitting, enabling the model to maintain high predictive accuracy and good generalisation ability. Second, a second-order Taylor expansion is used to approximate the loss function, describing changes in the loss more precisely, capturing curvature information about the loss surface, and yielding a more robust model.
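Concretely, in the notation of Chen and Guestrin (2016), the regularised objective sums a loss term over the n training examples and a complexity penalty over the K trees, and the objective at iteration t is approximated to second order (constant terms omitted):

    \mathcal{L} = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^2,

    \mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t),

where f_t is the tree added at step t, g_i and h_i are the first and second derivatives of the loss with respect to the previous prediction, T is the number of leaves, and w is the vector of leaf weights.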
XGBoost performs well in many classification problems, as numerous researchers have confirmed.
Priscilla and Prabha (2020) obtained OXGBoost by
optimising the XGBoost model, discovering that the
new model demonstrates comparatively high
precision when dealing with imbalanced data. Liew,
Hameed, and Clos (2021) combined deep-learning
feature selection methods with the XGBoost classifier
and applied them to the breast cancer classification
task to identify cancerous cells. The result shows that
XGBoost's performance is remarkable. Hajek, Abedin, and Sivarajah (2023) presented a fraud identification framework based on XGBoost and compared it with many other advanced machine-learning methods, finding that XGBoost performs better than the other machine-learning and unsupervised techniques. In summary, XGBoost performs well in classification and fraud detection, indicating that it is likewise a viable option for CCFD.
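As an illustration only, the following minimal sketch applies XGBoost to an imbalanced binary classification task using the xgboost Python package (the synthetic data, parameter values, and variable names are assumptions for demonstration, not this paper's experimental setup):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # Synthetic imbalanced data standing in for card transactions (assumption):
    # 98% legitimate, 2% fraudulent.
    X, y = make_classification(n_samples=10_000, n_features=30,
                               weights=[0.98, 0.02], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # reg_lambda is the L2 regularisation term discussed above;
    # scale_pos_weight upweights the rare positive (fraud) class.
    clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                        reg_lambda=1.0, scale_pos_weight=49)
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))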