focused on generic market pricing, leaving critical gaps
in sector-specific prediction (Kim & Kang, 2019). In most
cases, statistical methods cannot capture the complex,
non-linear relationships between market variables and
stock prices (Zhong & Enke, 2017).
In addition, existing models fail to include industry-
specific  knowledge,  a  vital  factor  for  accurate 
prediction  in  the  manufacturing  sector  (Chandola, 
Banerjee, & Kumar, 2009; Oztekin et al., 2016). 
Previous research also offers limited model
interpretability and practical applicability (Hsu
et al., 2026; Sezer, Gudelek, & Ozbayoglu, 2020).
Although some researchers achieve high accuracy in
controlled environments, the same cannot be said for
real-world performance, because these studies tend to
give inadequate consideration to market microstructure
and industry-specific factors (Khaidem, Saha & Dey,
2016). Moreover, the absence of a robust evaluation
framework that covers both prediction accuracy and
model stability has inhibited progress in developing
reliable prediction systems (Kumar et al., 2016).
This study investigates stock price prediction within
the electrical manufacturing sector using a multi-stage
framework designed to improve the accuracy and stability
of the resulting model. The framework comprises several
major components: data preprocessing, feature
engineering, model training, performance evaluation,
and predictive analysis. Within it, three machine
learning models, Random Forest (RF), XGBoost, and
Gradient Boosting Decision Trees (GBDT), are compared
on R² values, volatility, and trend stability. The
results show that RF outperforms the other models in
prediction accuracy and robustness and can serve as a
preferred choice for both short- and long-term
prediction.
2  METHODOLOGY 
2.1  Data Collection and Sources 
Following Lin et al. (2018), data collection relies on
two major primary data sources. Historical trading data
were obtained from Yahoo Finance and validated against
market information, covering a total of 1,904 trading
days from 2016 to 2025. The data also include 12 main
price variables, volume measures, and technical
features selected based on the recommendations of
Nguyen and Lee (2017).
2.2  Data Preprocessing and Feature 
Engineering 
All source data are obtained from different sources and
combined into a single database that covers 1,904
trading days in one unified format. Yahoo Finance and
Eastern’s finance were used to obtain the trading
history of the company. The raw variables comprise
thirteen price metrics, volume indicators consisting of
total value weight and quantities, and thirteen
technical features. Missing-value imputation, outlier
detection, and series validation were carried out as
part of robust data cleaning.
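The cleaning steps named above (missing-value imputation, outlier detection, and series validation) can be sketched as follows. This is a minimal illustration in Python, not the procedure used in this study: forward filling, the median-absolute-deviation (MAD) rule, its threshold of 5, and all function names are assumptions introduced for the example.

```python
from datetime import date
from statistics import median


def forward_fill(values):
    """Replace None entries with the most recent observed value (assumed imputation rule)."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last  # stays None if the series starts with a gap
        filled.append(v)
        last = v
    return filled


def flag_outliers(values, threshold=5.0):
    """Flag points whose distance from the median exceeds threshold * MAD (assumed rule)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / mad > threshold for v in values]


def dates_strictly_increasing(dates):
    """Series validation: no duplicated or out-of-order trading days."""
    return all(a < b for a, b in zip(dates, dates[1:]))


# Hypothetical closing prices with one gap and one suspicious spike.
closes = [10.0, None, 10.2, 10.1, 50.0, 10.3]
days = [date(2024, 1, d) for d in (2, 3, 4, 5, 8, 9)]

filled = forward_fill(closes)      # impute before checking for outliers
flags = flag_outliers(filled)      # True marks a candidate outlier
ordered = dates_strictly_increasing(days)
```

Flagged points are only candidates for review; whether to drop, cap, or keep them is a separate decision that the sketch leaves open.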
The workflow is organized into five key stages: 
(1) Data Preprocessing, where raw stock price data is 
collected,  cleaned,  normalized,  and  structured  into 
time-series formats to ensure sequential consistency; 
(2)  Feature  Engineering,  which  extracts  relevant 
financial  indicators  and  technical  features  while 
employing  dimensionality  reduction  techniques  to 
enhance  computational  efficiency;  (3)  Model 
Training,  involving  the  training  of  Random  Forest 
(RF),  XGBoost,  and  Gradient  Boosting  Decision 
Trees  (GBDT)  models  on  historical  data,  with 
hyperparameter  tuning  to  optimize  predictive 
performance;  (4)  Performance  Evaluation,  where 
models are assessed using metrics such as R², mean 
absolute  error  (MAE),  and  trend  stability  (S), 
alongside  volatility  tracking  and  error  distribution 
analysis  to  determine  reliability;  and  (5)  Predictive 
Analysis,  where  the  best-performing  model  (RF)  is 
applied  to  generate  short-term  and  long-term 
forecasts, with trend-capture rate analysis confirming 
its robustness. Beyond this, the structure preserves
the integrity of the data while providing more accurate
and more stable predictions for the majority of
financial forecasting applications.
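The evaluation stage can be illustrated with a short Python sketch. R² and MAE follow their standard definitions. The text does not define the trend stability measure (S), so `direction_agreement` below is a hypothetical stand-in that reports the share of day-to-day moves whose direction the model matches. The chronological split and its 80% training fraction are likewise assumptions.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split ordered rows without shuffling, so the test set lies strictly after the training set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def direction_agreement(y_true, y_pred):
    """Hypothetical proxy for trend stability: fraction of consecutive moves
    where actual and predicted series both rise or both do not rise."""
    true_moves = [b - a for a, b in zip(y_true, y_true[1:])]
    pred_moves = [b - a for a, b in zip(y_pred, y_pred[1:])]
    hits = sum(1 for t, p in zip(true_moves, pred_moves) if (t > 0) == (p > 0))
    return hits / len(true_moves)


# Illustrative values only, not results from the study.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.5, 2.0, 4.0]

r2 = r_squared(actual, predicted)
mae = mean_absolute_error(actual, predicted)
agreement = direction_agreement(actual, predicted)
train, test = chronological_split(list(range(10)))
```

Keeping the split chronological avoids training on observations that come after the test period, which matters for sequential data such as prices.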
2.3  Feature Construction 
Feature engineering has three main components:
technical analysis, fundamental indicators, and
temporal features. The technical features comprise
conventional price metrics together with
sector-specific anomalies. The feature set advances the
state of the art by integrating supply chain dynamics,
knowledge spillovers, and supply chain performance with
electricity-generating sources and power transmission.
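To make the technical and temporal components concrete, the following Python sketch derives a few generic features from a price series and its dates. The specific features (a trailing moving average, a lagged relative change, and calendar fields) and all function names are assumptions chosen for illustration; the sector-specific and fundamental features are not specified in enough detail here to reproduce.

```python
from datetime import date


def simple_moving_average(prices, window):
    """Trailing mean over `window` days; None until enough history exists."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def relative_change(prices, lag):
    """Relative price change over `lag` days; None for the first `lag` rows."""
    return [
        None if i < lag else prices[i] / prices[i - lag] - 1.0
        for i in range(len(prices))
    ]


def temporal_features(day):
    """Calendar fields derived from a trading date (weekday: Monday is 0)."""
    return {
        "weekday": day.weekday(),
        "month": day.month,
        "quarter": (day.month - 1) // 3 + 1,
    }


# Hypothetical closing prices for four consecutive trading days.
prices = [10.0, 11.0, 12.0, 13.0]
sma_2 = simple_moving_average(prices, window=2)
change_1 = relative_change(prices, lag=1)
calendar = temporal_features(date(2024, 1, 2))
```

Both price-based features use only past and current values, so they can be computed for each trading day without looking ahead.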