A Big Data Analytics System for Predicting Suicidal Ideation in Real-Time Based on Social Media Streaming Data

Mohamed A. Allayla, Serkan Ayvaz

2025

Abstract

Online social media platforms have become integral to society and daily routines. Every day, users worldwide spend a couple of hours on such platforms, expressing their sentiments and emotional states and contacting each other. Analyzing the large volumes of data these platforms produce can provide insight into public sentiment and help detect users' mental health status. Early identification of these health risks may assist in preventing or reducing suicidal ideation and potentially save people’s lives. Traditional techniques have become ineffective at processing such streams and large-scale datasets. Therefore, this paper proposes a new methodology based on a big data architecture to predict suicidal ideation from social media content. The proposed approach analyzes social media data in two phases: batch processing and real-time streaming prediction. The batch dataset was collected from Reddit and used for model building and training, while the streaming data was extracted using the Twitter streaming API and used for real-time prediction. After the raw data was preprocessed, the extracted features were fed to multiple Apache Spark ML classifiers: Naive Bayes (NB), Logistic Regression (LR), LinearSVC, Decision Tree (DT), Random Forest (RF), and Multilayer Perceptron (MLP). We conducted experiments with several feature-extraction techniques under different testing scenarios. In the batch processing phase, the (Unigram + Bigram) + CV-IDF features combined with the MLP classifier gave the best performance for classifying suicidal ideation, with an accuracy of 93.47%; this configuration was then applied in the real-time streaming prediction phase.
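
To make the feature-extraction step concrete, the following is a minimal, library-free Python sketch of unigram + bigram term counting followed by IDF weighting. It is an illustration only, not the authors' implementation: the helper names (`ngrams`, `unigram_bigram_terms`, `fit_idf`, `transform`), the whitespace tokenizer, and the three sample posts are all assumptions introduced here. The smoothed formula log((N + 1) / (df + 1)) is the one documented for Spark ML's IDF estimator; the paper's actual preprocessing and vocabulary settings are not described in the abstract.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unigram_bigram_terms(text):
    """Lowercase, split on whitespace (an assumed tokenizer), add bigrams."""
    tokens = text.lower().split()
    return tokens + ngrams(tokens, 2)


def fit_idf(docs_terms):
    """Compute smoothed IDF weights: log((N + 1) / (df + 1))."""
    n_docs = len(docs_terms)
    doc_freq = Counter()
    for terms in docs_terms:
        doc_freq.update(set(terms))  # count each term once per document
    return {t: math.log((n_docs + 1) / (df + 1)) for t, df in doc_freq.items()}


def transform(terms, idf):
    """Map a document's terms to a sparse {term: count * idf} vector."""
    counts = Counter(terms)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


# Hypothetical sample posts, used only to exercise the functions.
posts = [
    "i feel hopeless tonight",
    "i feel great today",
    "great game tonight",
]
docs_terms = [unigram_bigram_terms(p) for p in posts]
idf = fit_idf(docs_terms)
features = [transform(terms, idf) for terms in docs_terms]
```

Terms that occur in fewer documents, such as `hopeless` or the bigram `feel hopeless`, receive a larger weight than terms shared across posts, such as `i feel`. In a Spark ML pipeline the resulting sparse vectors would be the input to a classifier such as the MLP mentioned above.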


Paper Citation


in Harvard Style

Allayla M. and Ayvaz S. (2025). A Big Data Analytics System for Predicting Suicidal Ideation in Real-Time Based on Social Media Streaming Data. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0, SciTePress, pages 132-143. DOI: 10.5220/0013567800003967


in Bibtex Style

@conference{data25,
author={Mohamed Allayla and Serkan Ayvaz},
title={A Big Data Analytics System for Predicting Suicidal Ideation in Real-Time Based on Social Media Streaming Data},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={132-143},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013567800003967},
isbn={978-989-758-758-0},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - A Big Data Analytics System for Predicting Suicidal Ideation in Real-Time Based on Social Media Streaming Data
SN - 978-989-758-758-0
AU - Allayla M.
AU - Ayvaz S.
PY - 2025
SP - 132
EP - 143
DO - 10.5220/0013567800003967
PB - SciTePress