to higher error (27.21%). MLR employs a lag-1
feature plus one-hot socio-demographics; it captures
level differences but remains above the best
ARIMAX (23.40% vs. 13.21%). Figures 7–8 display
the MLR and SES one-step predictions relative to
actual counts, and Table 3 summarizes single-step
accuracy across all contenders. These patterns match
prior reports where exponential smoothing is a strong
low-complexity benchmark but is surpassed when
well-aligned exogenous structure is available (Chen,
2022; James & Weese, 2022).
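As a minimal sketch of how the one-step SES forecasts and the MAPE figures quoted above could be computed, the following assumes a level initialised at the first observation and a fixed smoothing parameter. The function names, the value of alpha, and the sample counts are illustrative assumptions, not the study's data or implementation.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("series must have equal length")
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


def ses_one_step(series, alpha):
    """One-step-ahead SES forecasts for series[1:].

    The level starts at series[0] (an assumed initialisation) and is
    updated as level = alpha * y + (1 - alpha) * level.
    """
    level = series[0]
    forecasts = []
    for y in series[1:]:
        forecasts.append(level)  # forecast issued before y is observed
        level = alpha * y + (1 - alpha) * level
    return forecasts


# Hypothetical semester counts, for illustration only.
counts = [100, 110, 120, 130]
preds = ses_one_step(counts, alpha=0.5)
error = mape(counts[1:], preds)
```

On a steadily rising series such as this one, SES lags behind the trend, which is one reason a model with well-aligned exogenous regressors can achieve a lower MAPE.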
Figure 9: ARIMAX forecasting using Mother’s Education
as External Variable (MAPE: 13.21%).
Comparisons with related studies reinforce these
conclusions. In applied forecasting benchmarks (e.g.,
M5), models that exploit exogenous information and
cross-series structure tend to outperform purely
univariate baselines, which supports our ARIMAX
findings and the use of AIC/BIC for parsimony
(Makridakis et al., 2022; see also the M5 special issue
overview by the same authors).
Figure 1 and Figure 2 situate our series characteristics
and ARIMA baseline, respectively, within the
Box-Jenkins framework of ARIMA identification and
diagnostics (Box et al., 2015; Hyndman &
Athanasopoulos, 2021). For education planning,
demography-linked approaches (e.g., STEP)
emphasize coherent transitions and cohort structure,
which we identify as future exogenous candidates to
test alongside policy and macro variables (Xiang et al.,
2023; Bowman, 2021).
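The AIC/BIC parsimony criterion mentioned above can be illustrated with a small sketch. It uses the Gaussian least-squares form, AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n), which omits additive constants shared by models fitted to the same data. The function name and the residual values are hypothetical, and the values a fitting library reports may differ by those constants.

```python
import math


def gaussian_aic_bic(residuals, n_params):
    """AIC and BIC from residuals, Gaussian least-squares form.

    Additive constants common to all models on the same data are
    dropped, so only differences between models are meaningful.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    fit = n * math.log(rss / n)
    aic = fit + 2 * n_params
    bic = fit + n_params * math.log(n)
    return aic, bic


# Hypothetical residuals from a model with two estimated parameters.
resid = [1.0, -1.0, 1.0, -1.0]
aic, bic = gaussian_aic_bic(resid, n_params=2)
```

Lower values are preferred under both criteria. For very short series the BIC penalty per parameter, ln(n), can be smaller than the AIC penalty of 2, so the two criteria need not rank candidate models identically.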
Despite these contributions, limitations remain.
The exogenous set is restricted to socio-demographics
observed at application; macroeconomic indicators,
tuition/fee schedules, and scholarship budgets were
unavailable. The short history (11 semesters)
constrains the power of multi-lag analyses and the
exploration of seasonal dynamics. Future work should
extend the observation window, broaden the exogenous
sources, and benchmark hybrid/ensemble approaches
(e.g., ARIMA-LSTM) while preserving interpretability
for administrators (Jain et al., 2024; Wang et al., 2024).
Figure 10 (Appendix) may report residual diagnostics
(Ljung-Box) to accompany the Table 2 comparisons and
to check that no autocorrelation remains in the
residuals (Hyndman & Athanasopoulos, 2021).
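The Ljung-Box check referred to above can be sketched as follows. The statistic is Q = n(n + 2) Σ_{k=1..h} r_k² / (n − k), where r_k is the lag-k sample autocorrelation of the residuals. This is an illustrative pure-Python version with hypothetical residuals. In practice a library routine such as `acorr_ljungbox` in `statsmodels.stats.diagnostic` would normally be used, and the reference chi-square degrees of freedom are usually reduced by the number of fitted ARMA parameters.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((r - mean) ** 2 for r in residuals)
    num = sum(
        (residuals[t] - mean) * (residuals[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


# Hypothetical, strongly alternating residuals (illustration only).
resid = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q = ljung_box_q(resid, max_lag=1)
# Compare q with a chi-square critical value (3.841 for 1 degree of
# freedom at the 5% level); a larger q suggests remaining autocorrelation.
```

With a history as short as 11 semesters the test has little power, so a non-significant Q should be read as weak evidence rather than proof that the residuals are white noise.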
REFERENCES
As’Ad, M., Wibowo, S. S. and Sophia, E. (2017).
Forecasting student enrolment with ARIMA. Jurnal
Informatika Merdeka Pasuruan, 2(3).
Fang, X., Zhang, Q. and Wu, Y. (2017). Student enrolment
prediction model based on data mining. IEEE
International Conference on Computational Science
and Engineering.
Hyndman, R. J. and Athanasopoulos, G. (2021).
Forecasting: Principles and Practice (3rd ed.). OTexts.
Box, G. E. P., Jenkins, G. M., Reinsel, G. C. and Ljung, G.
M. (2015). Time Series Analysis: Forecasting and
Control (5th ed.). Wiley.
Makridakis, S., Spiliotis, E. and Assimakopoulos, V.
(2022). M5 accuracy competition: Results, findings,
and conclusions. International Journal of Forecasting,
38(4), 1346–1364.
Makridakis, S., Spiliotis, E. and Assimakopoulos, V.
(2022). M5 competition special issue: Background and
organization. International Journal of Forecasting,
38(4).
Chen, Q. (2022). A comparative study on the forecast
models of the enrollment proportion of general and
vocational education. International Education Studies,
15(6), 109–126.
James, F. and Weese, J. (2022). Neural network-based
forecasting of student enrollment with exponential
smoothing baseline. ASEE Annual Conference &
Exposition.
Xiang, L. et al. (2023). The School Transition Estimation
and Projection (STEP) model. Population, Space and
Place, 29(8), e2681.
Bowman, R. A. (2021). Student trajectories for enrollment
forecasting, management, and planning (AIR
Professional File No. 153). Association for Institutional
Research.
Statsmodels documentation. (2024). Simple exponential
smoothing (SES) and ETS. Statsmodels.
Jain, S. et al. (2024). A novel ensemble ARIMA–LSTM
approach for time-series forecasting. PLOS ONE,
19(6), e0303103.
Wang, B. et al. (2024). ARIMA–LSTM for time-series
prediction (methodology article). BMC / PMC article.
Deogratias, E. (2024). Forecasting students’ enrolment in
Tanzania government primary schools (2021–2035).
International Journal of Computing and Informatics.
Loder, A. K. F. (2025). Predicting the number of “active”
students for funding management. Journal of Student
Financial Aid.
Little, R. J. A. and Rubin, D. B. (2019). Statistical Analysis
with Missing Data (3rd ed.). Wiley.
Schafer, J. L. and Graham, J. W. (2002). Missing data: Our
view of the state of the art. Psychological Methods,
7(2), 147–177.