This tool consists of adjusting an initial model to
improve its efficiency. The main objective is to keep
the correct results and to improve the results that
were not successful by correcting them in subsequent
models. It is worth mentioning that this approach
combines several classifiers with low accuracy to
create a more efficient model (Carmo, 2021).
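The idea described above, in which each new model corrects the
errors left by the previous ones, can be illustrated with a
minimal sketch. It assumes squared-error gradient boosting for
regression with one-split decision stumps as the weak learners,
which is not necessarily the variant used by Carmo. The function
names, the learning rate, and the distance and fuel figures are
hypothetical and serve only as illustration.

```python
def average(values):
    return sum(values) / len(values)


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    # Candidate thresholds: every distinct x except the largest, so both
    # sides of the split are non-empty (assumes >= 2 distinct x values).
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = average(left), average(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_rounds=50, rate=0.3):
    """Fit weak stumps one after another, each on the current residuals."""
    base = average(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the part of the target the ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return {"base": base, "rate": rate, "stumps": stumps}


def predict(model, x):
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["rate"] * correction


def mse(model, xs, ys):
    return average([(y - predict(model, x)) ** 2 for x, y in zip(xs, ys)])


# Hypothetical data: trip distance (arbitrary units) and fuel consumed.
distance = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
fuel = [12.0, 15.0, 21.0, 30.0, 38.0, 52.0, 61.0, 80.0, 95.0, 118.0]

weak = boost(distance, fuel, n_rounds=1)     # a single low-accuracy learner
strong = boost(distance, fuel, n_rounds=50)  # many weak learners combined
print(mse(weak, distance, fuel), mse(strong, distance, fuel))
```

On the training data, the combined model has a lower error than
the single weak learner, since every round fits what the previous
rounds left unexplained.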
Similarities can be identified between the predictor
variables used in the previously cited study and those
of the present research, since the variables analyzed
by that author were: the size of the vessel, the engine
power (in HP), the place of origin and destination, the
departure and arrival dates, the amount of fuel at
departure and arrival, and the miles traveled. The
evaluation of the metrics of the developed model was
simpler, with an RMSE of 16.71 and an R² of 0.924,
against an RMSE of 32.99 and an R² of 0.892 for the
current model analyzed.
At the end of the work, the author did not present
a final equation; however, the work reinforces the
weight of the variables analyzed and the influence
that the analysis of the algorithms had on the fuel
outcome.
Another way of verifying the applicability of the
model is backtesting, which consists of an analysis of
a series of pre-existing data. This test can identify the
behavior of the information and is fundamental for
predicting trends in the sample in question (Vezeris et
al., 2018).
This technique is one of the main tools for outlining
strategies in the financial market or in logistical
analyses, helping to select the best decisions
(Bailey et al., 2016).
A study by Takahashi et al. developed a backtest
simulating financial return scenarios for four
strategies adopted over 10 years by 34 different
companies. The strategies were linked to the grace
period of the titles acquired (Takahashi et al., 2021).
Through the backtesting carried out, it was
possible to identify the profitability of each strategy
and analyze its behavior within the historical series.
Long-term strategies presented an average annual
return of 12.91% against 4.83%, showing the
importance of this test for validating developed models.
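The backtesting procedure discussed above can be sketched as a
walk-forward evaluation: at each point of the historical series,
a forecast is produced from past values only and then compared
with what actually happened. This is a minimal sketch under
stated assumptions. The two forecasting rules, the function names,
and the fuel figures are hypothetical and do not come from the
cited studies.

```python
def last_value_forecast(history):
    """Naive rule: the next value equals the most recent one."""
    return history[-1]


def moving_average_forecast(history, window=3):
    """Naive rule: the next value equals the mean of the last `window`."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def backtest(series, forecast, min_history=3):
    """Walk forward through the series and return the mean absolute error.

    At step t the forecast only sees series[:t], so no future
    information leaks into the prediction.
    """
    errors = []
    for t in range(min_history, len(series)):
        prediction = forecast(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


# Hypothetical historical series: fuel consumed on consecutive trips.
fuel_per_trip = [410.0, 395.0, 420.0, 405.0, 430.0, 415.0, 440.0, 425.0]

for name, rule in [("last value", last_value_forecast),
                   ("moving average", moving_average_forecast)]:
    print(name, backtest(fuel_per_trip, rule))
```

Comparing the error of each rule over the same historical series
is what allows competing strategies or models to be ranked before
they are used on new data.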
One of the main products of automated analysis is
cost-effectiveness. In our study, the difference in
waste between the models translates into savings of
$1,435.00 per trip made (Table 7).
Duarte et al. conducted research to calibrate
inertial instruments, such as gyroscopes, using MLR.
The results brought efficiency to the navigation
system, improving the reading of results and,
consequently, the distribution of signals. This
efficiency reduces direct and indirect costs
(Duarte et al., 2020).
Another interesting work, written by Shaker and
Sureshbabu, focuses on predicting the weather seasons
in India, a region characterized by great
unpredictability in natural phenomena. This
unpredictability contributes to poor resource
management and poor decision-making regarding
calamities for farmers in the region. The model
developed was able to surpass all existing ones and
brought greater economic efficiency to the population,
since it allows a more efficient allocation of
financial resources (Shaker, Sureshbabu, 2020).
A study that presents great similarities with ours
is the research by Hu et al. on the fuel consumption
of a marine vessel en route, also using machine
learning. Due to the characteristics of this type of
navigation, the authors considered variables such as
wind speed, wave height, fuel readings recorded in
real time every 15 minutes, the vessel's draft, and the
direction of the currents (which can run in any
direction, not only with or against the vessel, as
they do in rivers). To carry out this analysis, they
used Neural Networks and Gaussian Process Regression.
Both aim to analyze a set of data, carry out proper
training, and make predictions on that data
(Hu et al., 2019).
The metrics evaluated were MSE, RMSE, MAE,
and R². Through these, the authors compare the
results across different samples, showing how they
evolve with a broader set of data, much like a
backtest.
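For reference, the four metrics named above can be computed from
the observed and predicted values as in the sketch below. It uses
the standard definitions (R² as one minus the ratio between the
residual and the total sum of squares). The function name and the
sample values are hypothetical and are not data from Hu et al.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n      # mean squared error
    rmse = math.sqrt(mse)                     # same unit as the target
    mae = sum(abs(e) for e in errors) / n     # mean absolute error

    # R² compares the model against always predicting the mean of y_true.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                # assumes y_true is not constant

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# Hypothetical observed and predicted fuel consumption values.
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [105.0, 118.0, 137.0, 165.0]
print(regression_metrics(observed, predicted))
```

RMSE and MAE are expressed in the unit of the predicted variable,
whereas R² is dimensionless, which is why R² is the more
convenient figure when comparing studies with different targets.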
By way of comparison, the present study was able to
demonstrate that R² improved significantly as the
amount of data fed into the machine learning
algorithm increased, rising from 0.782 to 0.870.
Considering the above study, it can be seen that
the authors achieved an R² greater than 0.98 in both
models, the result of a much more detailed historical
record, with readings every 15 minutes. However, it is
worth remembering that their study has a different
scope, as it concerns maritime navigation, although it
points in the same direction as the basis of this
work.
Another study is Reis’ linear regression analysis,
which sought to identify the attitudinal factors that
influence the purchase of remanufactured products.
The scope of his work was to develop a relationship
between the independent variables, represented by
attitudes, and the dependent variable, which is the
acquisition of this type of product.
The identification of the factors took place through a
thorough literature review, as was the case in this
study, searching for references that had already