Exploring the Impact of Key Factors on the Accuracy of a Keras Machine Learning Model for Text Classification
Xiwen Jiang
2024
Abstract
Text classification and emotion classification are central to modern machine learning applications, particularly on social media platforms, where they support information organization, sentiment analysis, and user targeting for advertising and content creation. Using TensorFlow Keras models, this research examines how two key training parameters, the number of epochs and the batch size, affect model accuracy on a large dataset of over 3.8 million Reddit posts. Labels for the dataset were generated with TextBlob, turning an unsupervised problem into a supervised one. The model consists of four layers for binary text classification: an embedding layer, a GlobalAveragePooling1D layer, a dense layer with ReLU activation, and a dense layer with sigmoid activation. The study tests three epoch values (10, 20, and 30) and three batch sizes (32, 64, and 128), running each experiment multiple times with different random seeds to ensure robustness. The findings indicate that increasing the number of epochs generally improves accuracy, although returns diminish and overfitting can occur when the epoch count is excessive. Batch size plays a more complex role: larger batches can hinder the model's ability to capture detailed patterns. The results highlight the trade-offs between computational efficiency and prediction performance and offer practical guidance for optimizing machine learning models in text classification tasks.
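The setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's code. The four layer types and the epoch and batch-size values come from the abstract. The vocabulary size, embedding width, hidden-unit count, optimizer, loss, seed values, and the choice of a full grid over both parameters are all assumptions, as are the helper names `build_model`, `experiment_grid`, and `mean_accuracy_by_setting`.

```python
from itertools import product

EPOCHS = (10, 20, 30)        # epoch values named in the abstract
BATCH_SIZES = (32, 64, 128)  # batch sizes named in the abstract
SEEDS = (0, 1, 2)            # assumed; the abstract only says "multiple" runs


def build_model(vocab_size=10_000, embed_dim=16, hidden_units=16):
    """Four-layer binary classifier; every size here is an illustrative assumption."""
    import tensorflow as tf  # imported lazily so the grid helpers run without TensorFlow

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embed_dim),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Optimizer and loss are assumptions; the abstract does not state them.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


def experiment_grid(epochs=EPOCHS, batch_sizes=BATCH_SIZES, seeds=SEEDS):
    """Return one config dict per (epochs, batch_size, seed) combination."""
    return [
        {"epochs": e, "batch_size": b, "seed": s}
        for e, b, s in product(epochs, batch_sizes, seeds)
    ]


def mean_accuracy_by_setting(results):
    """Average accuracy over seeds for each (epochs, batch_size) pair.

    `results` is a list of (config, accuracy) tuples.
    """
    grouped = {}
    for config, accuracy in results:
        key = (config["epochs"], config["batch_size"])
        grouped.setdefault(key, []).append(accuracy)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}
```

In a full run, each config would seed the random number generators, call `build_model()`, and pass `epochs` and `batch_size` to `model.fit`. Averaging the resulting accuracies over seeds gives one figure per setting, which makes trends across epochs and batch sizes easier to compare.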
Paper Citation
in Harvard Style
Jiang X. (2024). Exploring the Impact of Key Factors on the Accuracy of a Keras Machine Learning Model for Text Classification. In Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management - Volume 1: MLSCM; ISBN 978-989-758-738-2, SciTePress, pages 368-371. DOI: 10.5220/0013331400004558
in Bibtex Style
@conference{mlscm24,
author={Xiwen Jiang},
title={Exploring the Impact of Key Factors on the Accuracy of a Keras Machine Learning Model for Text Classification},
booktitle={Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management - Volume 1: MLSCM},
year={2024},
pages={368-371},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013331400004558},
isbn={978-989-758-738-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management - Volume 1: MLSCM
TI - Exploring the Impact of Key Factors on the Accuracy of a Keras Machine Learning Model for Text Classification
SN - 978-989-758-738-2
AU - Jiang X.
PY - 2024
SP - 368
EP - 371
DO - 10.5220/0013331400004558
PB - SciTePress