
Figure 4: Heatmap result of the Simulation Experiment.
Discussion: The simulation test demonstrated that
our web-based application scales to seven simultane-
ously moving subjects without degradation. Despite
the persistent false-positive “floor face” in roughly
50% of the frames, facial recognition recall remained
perfect and overall classification remained robust
(F1=95.6%). Body tracking and heatmap generation
operated nearly flawlessly (F1=97.3%), confirming the
system’s reliability under high-density, multi-person
scenarios.
6 CONCLUSIONS
This work presents a web-based application for fash-
ion retail integrating crowd detection and emotion
analysis using YOLO-based body/face detection and
CNN emotion recognition. The system generates
heatmaps of customer movement and quantifies emo-
tional states. Simulated experiments with seven ac-
tors achieved strong performance: facial recogni-
tion (91.6% precision, 100% recall, 95.6% F1), body
tracking (99.5% precision, 95.1% recall, 97.3% F1),
and emotion detection (98.5% F1 for “Happy,” 94.0%
for “Sad,” 91.0% for “Neutral”). Results confirm
reliable positive emotion detection in retail contexts.
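As a quick consistency check on the figures above, the following minimal sketch recomputes F1 as the harmonic mean of the reported precision and recall. The function name `f1_score` and the dictionary layout are illustrative assumptions, not part of the described system; only the precision and recall values come from the text.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the text, converted from percentages.
reported = {
    "facial recognition": (0.916, 1.000),
    "body tracking": (0.995, 0.951),
}

# Hypothetical helper structure: F1 per task, computed from the reported values.
f1_by_task = {task: f1_score(p, r) for task, (p, r) in reported.items()}

for task, f1 in f1_by_task.items():
    print(f"{task}: F1 = {f1:.1%}")
```

Rounded to one decimal place, both values agree with the reported F1 scores (95.6% and 97.3%).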
Analysis of the repository videos revealed that promotional
footage predominantly shows “Happy” expressions,
which may not reflect genuine customer reactions.
In contrast, the simulations suggested that customers are
typically neutral or negative in authentic interactions,
which makes “Happy” detections more significant.
Heatmaps consistently identified high-density zones
across both data sources, validating their utility for
optimizing store layouts and product placement.
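To illustrate how a movement heatmap of this kind can expose high-density zones, the sketch below bins tracked body positions into a coarse grid and reports the busiest cell. This section does not specify how the heatmaps are computed, so the approach (counting bounding-box centers per grid cell), the function names `accumulate_heatmap` and `densest_cell`, the frame size, and the sample coordinates are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def accumulate_heatmap(
    centers: Sequence[Tuple[float, float]],
    frame_w: int,
    frame_h: int,
    cols: int,
    rows: int,
) -> List[List[int]]:
    """Count detections per grid cell; centers are (x, y) pixel coordinates."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in centers:
        # Skip detections that fall outside the frame.
        if not (0 <= x < frame_w and 0 <= y < frame_h):
            continue
        col = int(x * cols / frame_w)
        row = int(y * rows / frame_h)
        grid[row][col] += 1
    return grid


def densest_cell(grid: List[List[int]]) -> Tuple[int, int]:
    """Return (row, col) of the cell with the highest count (first one on ties)."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Hypothetical bounding-box centers in a 640x480 frame (not data from the study).
sample_centers = [(100, 100), (120, 90), (130, 110), (500, 400), (700, 50)]
heatmap = accumulate_heatmap(sample_centers, frame_w=640, frame_h=480, cols=4, rows=3)
hot_row, hot_col = densest_cell(heatmap)
print(f"Densest cell: row {hot_row}, col {hot_col}, count {heatmap[hot_row][hot_col]}")
```

In a store setting, each grid cell would correspond to a floor area, so the cell with the highest count marks a candidate zone for layout or product placement decisions.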
Key limitations include the reliance on
staged or simulated videos, which prevented evaluation in
uncontrolled retail environments, and the need for
further validation under varying environmental factors (lighting,
camera angles) in operational stores.
6.1 Lessons Learned
Simulated environments, while useful for validation,
lack the complexity of real-world retail scenarios,
including spontaneous customer behavior. Negative
emotions proved challenging to detect due to sub-
tle facial expressions, indicating a need for enriched
training data.
Future work should prioritize real-world deploy-
ment with retail partners to collect authentic cus-
tomer data (with strict privacy safeguards), incorpo-
rate audio cues for deeper satisfaction insights, and
develop real-time analytics for dynamic store man-
agement. This research demonstrates the feasibility
of integrated crowd and emotion analysis for fash-
ion retail, providing objective data to drive decisions
in store layout optimization, resource allocation, and
marketing strategies. With continued development
and validation, this approach can become essential for
fashion retailers seeking to enhance customer satis-
faction and competitive advantage.
REFERENCES
Babar, M. J., Husnain, M., Missen, M. M. S., Samad, A.,
Nasir, M., and Khan, A. K. N. (2023). Crowd count-
ing and density estimation using deep network-a com-
prehensive survey. TechRxiv.
Batch, A., Ji, Y., Fan, M., Zhao, J., and Elmqvist, N.
(2023). uxsense: Supporting user experience anal-
ysis with visualization and computer vision. IEEE
Transactions on Visualization and Computer Graph-
ics, 29(5):1923–1936.
Cao, Z., Lyu, L., Qi, R., and Wang, J. (2024). Crowdunet:
Segmentation assisted u-shaped crowd counting net-
work. Neurocomputing, 601:128215.
Fiedler, F. (1967). A theory of leadership effectiveness.
Journal of the Academy of Marketing Science, pages
33–44.
Ganesan, S. (2023). Deep learning model for identification
of customers satisfaction in business. Journal of Au-
tonomous Intelligence, 7(1).
Grewal, D., Baker, J., Levy, M., and Voss, G. (2003). The
effects of wait expectations and store atmosphere eval-
uations on patronage intentions in service-intensive
retail stores. Journal of Retailing, 79(4):259–268.
Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., and Lew,
M. S. (2016). Deep learning for visual understanding:
A review. Neurocomputing, 187:27–48.
Jähne, B., Haussecker, H., and Geißler, P. (1999). Handbook
of Computer Vision and Applications. Academic
Press.
Kelleher, J. D. (2019). Deep Learning. MIT Press Essential
Knowledge series. MIT Press, Cambridge, MA.
Kitchenham, B. and Charters, S. (2007). Guidelines for
performing systematic literature reviews in software
WEBIST 2025 - 21st International Conference on Web Information Systems and Technologies