fields ranging from traffic forecasting to agricultural
yield modeling, demonstrating superior performance
in handling high-dimensional data and non-linear
patterns (Liu and Wu, 2017). In the smartphone
context, RF has been used to rank features by
importance, for example identifying camera resolution and
processor speed as key drivers of price variation
while mitigating overfitting risks (Smith et al., 2013).
Despite these advancements, several research
gaps persist. Most studies focus on single markets,
overlooking how consumer priorities for hardware
features differ across regions, for example price
sensitivity to RAM in emerging markets versus a
premium placed on camera quality in developed
economies (Hengl et al., 2018). Additionally, the
comparative utility of MLR and RF in a hybrid
modeling framework remains underexplored,
limiting insights into how linear and non-linear
approaches can complement each other. Existing
literature also often neglects to analyze interactions
between multi-faceted features (e.g., battery capacity
and screen size), which collectively influence pricing
strategies in ways that linear models cannot capture
(Grömping, 2009; Kalaivani et al., 2021).
This study addresses these gaps by systematically
comparing MLR and RF models using a
comprehensive dataset of 930 smartphone models
across five regions (China, USA, Pakistan, India,
Dubai). The research integrates MLR’s
interpretability with RF’s capability to handle
complex interactions, aiming to identify key drivers
of price variation, evaluate model performance in
capturing regional market nuances, and provide
data-driven guidance for feature optimization and pricing
strategies. By leveraging both methodologies, the
study bridges traditional econometric approaches
with modern machine learning to offer a more holistic
understanding of market dynamics.
The significance of this work lies in its dual
contributions to theory and practice. Theoretically, it
advances understanding of how hybrid modeling can
enhance predictive accuracy in technology markets,
where feature interactions and regional variations are
prevalent (Smith et al., 2013). Practically, the study
offers manufacturers actionable insights into regional
preferences, such as prioritizing camera upgrades in
premium markets or optimizing battery capacity in
cost-sensitive regions, and uses metrics such as Root Mean
Squared Error (RMSE) to validate model robustness
(Wang et al., 2018). Its novelty resides in the
integration of multi-regional data, systematic
comparison of MLR and RF, and focus on feature
interactions, which have been understudied in prior
research (Speiser et al., 2019).
Guided by these objectives, the study addresses
three key research questions: What linear
relationships exist between hardware features and
prices in diverse markets? How effectively can RF
models capture non-linear patterns and regional
nuances? And which model demonstrates superior
generalizability across different market conditions?
To answer these, the research employs MLR to assess
linear associations and RF to model non-linear
interactions, using an 80% training–20% test dataset
split and feature importance analysis. This approach
rigorously evaluates both models’ strengths in
capturing market dynamics, from linear trends in
RAM and screen size to non-linear synergies between
camera quality and processor performance (Liu and
Wu, 2017).
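The evaluation protocol described above (an 80%/20% split, an MLR fit, an RF fit, and feature importance analysis) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the data are synthetic stand-ins, the snake_case feature names, the price formula, and the hyperparameters (`n_estimators`, `random_state`) are all hypothetical, and scikit-learn is assumed as the modeling library.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study's dataset): 930 rows, 6 features.
rng = np.random.default_rng(0)
feature_names = [
    "front_camera", "back_camera", "processor",
    "mobile_weight", "screen_size", "battery_capacity",
]  # hypothetical names
X = rng.uniform(0.0, 1.0, size=(930, len(feature_names)))

# Hypothetical price: two linear terms, one interaction term, and noise.
y = (
    300.0 * X[:, 1]
    + 200.0 * X[:, 2]
    + 400.0 * X[:, 1] * X[:, 2]
    + rng.normal(0.0, 20.0, size=930)
)

# 80% training / 20% test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "MLR": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "r2": float(r2_score(y_test, pred)),
    }

# Impurity-based feature importances from the fitted random forest.
importances = dict(zip(feature_names, models["RF"].feature_importances_))

for name, s in scores.items():
    print(f"{name}: RMSE={s['rmse']:.2f}, R2={s['r2']:.3f}")
```

Fixing `random_state` makes the split and the forest reproducible, so both models are scored on the same held-out rows.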
The results provide data-driven support for
manufacturers optimizing regional pricing
strategies (e.g., emphasizing RAM cost-effectiveness
in emerging markets and camera innovation in
premium segments). They also offer a methodological
reference for academia, using comparative metrics
such as R² and RMSE to validate the synergistic
value of traditional econometrics and machine
learning in technology market analysis. Future
research could extend the analysis to multi-year
data to explore the impact of technology
iteration cycles on model stability, or incorporate
unstructured data such as consumer sentiment
to reveal the driving forces behind market trends
more comprehensively. The analytical
framework established in this study is expected to
help the smart hardware industry form a
closed loop of data insight, strategy optimization,
and market validation, enabling enterprises to achieve
precise product positioning and resource allocation in
rapidly evolving global competition.
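For reference, the two comparative metrics named above, RMSE and R², have the following standard definitions. The notation is an assumption introduced here, not taken from the study: y_i is the observed price of phone i in the test set, \hat{y}_i is the model's predicted price, \bar{y} is the mean observed price, and n is the number of test samples.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
               {\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
```

A lower RMSE indicates smaller typical prediction errors in the units of price, while an R² closer to 1 indicates that the model explains a larger share of the variance in price.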
2 METHODOLOGY
2.1 Data Source
The data for this study were obtained from Kaggle to
explore the factors that influence phone prices. The
dataset contains 930 samples. The analysis concentrates
on six factors: “Front camera”, “Back camera”, “Processor”,
“Mobile weight”, “Screen size” and “Battery
capacity”.
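Restricting the dataset to the six factors listed above might look like the sketch below. It is illustrative only: the column headers are assumed to match the factor names as written here, the `Price` target column and the `select_model_columns` helper are hypothetical, and the two toy rows are invented values that stand in for the Kaggle file, whose name and exact schema are not given.

```python
import pandas as pd

# Assumed column headers; the actual Kaggle file may label them differently.
FEATURES = [
    "Front camera", "Back camera", "Processor",
    "Mobile weight", "Screen size", "Battery capacity",
]
TARGET = "Price"  # hypothetical name for the price column


def select_model_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return only the six factor columns plus the target, in a fixed order."""
    columns = FEATURES + [TARGET]
    missing = [c for c in columns if c not in df.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return df[columns]


# Toy rows with invented values, used only to exercise the helper.
toy = pd.DataFrame({
    "Brand": ["A", "B"],
    "Front camera": [12, 8],
    "Back camera": [48, 13],
    "Processor": ["chip-x", "chip-y"],
    "Mobile weight": [174, 190],
    "Screen size": [6.1, 6.5],
    "Battery capacity": [3600, 5000],
    "Price": [799, 199],
})
model_frame = select_model_columns(toy)
print(model_frame.shape)
```

Raising an error on missing columns makes a schema mismatch visible immediately instead of silently fitting on fewer factors.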
2.2 Multiple Linear Regression
The Multiple Linear Regression (MLR) model is used
in this study to predict phone prices.