learning by Vision Transformers (ViT) for dental
diagnosis with improved feature extraction
performance in the detection of cavities (Xue et al.,
2022). Sharma and Iyer introduced an attention-based
deep learning approach to increase the interpretability
of cavity detection results, demonstrating that
AI-based models can substantially reduce false
positives and false negatives in radiographic
examinations (Jiang, 2023).
Hybrid machine learning techniques have also been
explored for the prediction of dental disease. Raj and
Mehta examined the effect of combining CNNs with
traditional classifiers such as Random Forest and
XGBoost, showing the strength of ensemble models in
increasing diagnostic consistency (Sulochana &
Sumathi, 2024). Patel et al. compared different ML
models, namely SVM, Decision Trees, and Naïve
Bayes, to determine the best approach for automatic
detection of cavities (Shariff et al., 2024).
Their findings indicated that deep learning-based
approaches, when combined with conventional
classifiers, perform better at identifying early-stage
cavities. Although these advances are promising,
problems such as limited annotated data, inconsistent
image quality, and poor generalization remain
obstacles to AI-based dental diagnosis. Adaptive
models, hyperparameter optimization, and improved
feature selection techniques need to be integrated to
further improve cavity detection and prediction
accuracy. This work therefore proposes a mixed ML
strategy combining CNNs, U-Net, Vision
Transformers, and ensemble methods to establish a
robust and clinically applicable diagnosis system.
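The text does not specify how the ensemble combines its member models. As a minimal sketch of one common option, soft voting, the snippet below averages per-class probabilities from several models. The function names, the three probability vectors, and the choice of soft voting itself are illustrative assumptions, not the authors' method; only the class labels come from Table 1.

```python
# Illustrative soft-voting ensemble (assumed technique, not taken from the paper).
CLASSES = ["Healthy Teeth", "Early Cavities", "Deep Cavities"]  # labels from Table 1


def soft_vote(prob_vectors, weights=None):
    """Return the weighted average of several class-probability vectors."""
    if weights is None:
        weights = [1.0] * len(prob_vectors)
    total_weight = sum(weights)
    n_classes = len(prob_vectors[0])
    return [
        sum(w * p[c] for w, p in zip(weights, prob_vectors)) / total_weight
        for c in range(n_classes)
    ]


def predict_label(prob_vectors, weights=None):
    """Pick the class with the highest averaged probability."""
    averaged = soft_vote(prob_vectors, weights)
    best = max(range(len(averaged)), key=lambda c: averaged[c])
    return CLASSES[best]


# Hypothetical outputs of three member models for one radiograph.
model_a = [0.6, 0.3, 0.1]
model_b = [0.2, 0.5, 0.3]
model_c = [0.1, 0.7, 0.2]
label = predict_label([model_a, model_b, model_c])
```

With these made-up probabilities, two of the three models favor the second class, so the averaged vote selects it even though the first model disagrees.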
3 PROPOSED METHODOLOGY
The proposed method uses deep learning models to
analyze dental radiographic images for automatic
cavity detection and prediction. Developed for use in
a clinical support system, the model scans dental
X-rays to accurately highlight areas of cavity damage.
Using U-Net for image segmentation and Grad-CAM
for visual explanation, the system highlights
potential cavities and predicts their growth based on
historical patient data. Once early signs of
degradation are detected, the system provides
real-time diagnostic feedback to help dentists make
correct treatment decisions. Future enhancements
include cloud-based connectivity and real-time
AI-driven examination for improved clinical
productivity.
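To make the Grad-CAM highlighting step concrete, the following is a minimal sketch of how Grad-CAM combines feature maps into a heatmap. It assumes the activations of a convolutional layer and the gradients of the class score with respect to them have already been obtained from a trained network. The function name and the nested-list representation are illustrative choices, not the authors' implementation.

```python
def grad_cam_heatmap(activations, gradients):
    """Combine K feature maps into a Grad-CAM heatmap.

    activations, gradients: lists of K maps, each an H x W nested list.
    Each map is weighted by the spatial mean of its gradient, the weighted
    maps are summed, negative values are clipped (ReLU), and the result is
    scaled to [0, 1].
    """
    height = len(activations[0])
    width = len(activations[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for act_k, grad_k in zip(activations, gradients):
        # Channel weight: global average of the gradient map.
        alpha = sum(sum(row) for row in grad_k) / (height * width)
        for i in range(height):
            for j in range(width):
                heat[i][j] += alpha * act_k[i][j]
    # ReLU keeps only regions that support the target class.
    heat = [[max(0.0, v) for v in row] for row in heat]
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat
```

In practice the resulting low-resolution heatmap is upsampled to the radiograph size and overlaid on the image so that the clinician can see which regions drove the prediction.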
3.1 Data Collection
The dental radiographic images used in this study
were gathered from clinical sources, including dental
clinics, hospitals, and public databases (Xing et al.,
2024). The dataset contains images covering a range
of dental pathologies, cavity grades, and resolutions
to support rigorous analysis. The dental conditions
represented in the images range from early-stage
cavities to deep cavities, enamel decay, and other
abnormalities, providing a balanced set for training
and testing (Shamim et al., 2020).
All patient information was rigorously anonymized
prior to processing to maintain ethical standards,
ensure compliance with privacy policies, and prevent
any possible identification of patients (Welikala et
al., 2020). Table 1 shows the distribution of dental
radiographic images by condition and resolution. The
dataset was curated to include high-quality
radiographs and to exclude low-resolution or blurry
ones in order to improve model performance.
Moreover, images of patients from various ethnicities
and age groups were used to enhance the
generalization ability of the model (Welikala et al.,
2020).
Table 1: Distribution of Dental Radiographic Images by
Condition and Resolution.
Class            No. of Images   Image Resolution
Healthy Teeth    1,200           1024×1024
Early Cavities   950             1024×1024
Deep Cavities    850             1024×1024
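As a quick check of the class balance in Table 1, the sketch below computes the total image count, each class's share, and the per-class sizes of a stratified train/test split. The 80/20 split ratio and the function name are assumptions for illustration; the text does not state the split that was used.

```python
# Image counts per class, taken from Table 1.
counts = {"Healthy Teeth": 1200, "Early Cavities": 950, "Deep Cavities": 850}


def stratified_split(class_counts, test_fraction=0.2):
    """Return {label: (n_train, n_test)}, keeping class proportions equal
    in both subsets. The default 20% test fraction is an assumed value."""
    split = {}
    for label, n in class_counts.items():
        n_test = round(n * test_fraction)
        split[label] = (n - n_test, n_test)
    return split


total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
split = stratified_split(counts)
```

The three classes sum to 3,000 images, with healthy teeth making up 40% of the set, so the dataset is only mildly imbalanced toward the healthy class.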
3.2 Image Preprocessing
To improve the visibility and usability of dental
radiographs for AI processing, several preprocessing
operations were performed. Noise reduction was
carried out using Gaussian and median filtering to
remove artifacts and enhance image quality (Xing et
al., 2024).
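The median filtering part of this noise reduction step can be sketched as follows. This is a plain-Python illustration on a small grayscale image stored as a list of rows. The 3×3 window and the edge-replication border handling are assumptions, since the text does not give the filter parameters.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D grayscale image.

    Pixels outside the image are taken from the nearest edge pixel
    (edge replication), so the output has the same shape as the input.
    """
    height, width = len(image), len(image[0])
    radius = size // 2
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            output[y][x] = median(window)
    return output


# A flat patch with one salt-noise pixel in the center.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
cleaned = median_filter(noisy)
```

The isolated bright pixel is replaced by the value of its neighbors, which is why median filtering suits impulse noise. A Gaussian filter uses the same sliding window but takes a weighted average instead, which smooths fine-grained noise at the cost of some edge sharpness.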
Figure 1 shows the dataset preprocessing and
augmentation pipeline for U-Net 3+ training. To
enhance contrast, techniques such as histogram
equalization and CLAHE (Contrast Limited Adaptive
Histogram Equalization) were applied to improve the
discrimination between affected cavity regions and
healthy teeth (Shamim et al., 2020).
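The contrast step can be illustrated with a minimal sketch of global histogram equalization on an 8-bit grayscale image stored as a list of rows. It shows only the basic intensity remapping. CLAHE additionally works on local tiles and clips each tile's histogram before remapping, which is not implemented here. The function name is an illustrative choice.

```python
def equalize_histogram(image, levels=256):
    """Globally equalize an integer grayscale image with values in [0, levels)."""
    total = sum(len(row) for row in image)
    histogram = [0] * levels
    for row in image:
        for value in row:
            histogram[value] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]
    # Map each intensity so the occupied range spans [0, levels - 1].
    lookup = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]


# A low-contrast patch whose values sit in a narrow band.
patch = [[100, 100], [110, 120]]
stretched = equalize_histogram(patch)
```

After equalization the darkest value maps to 0 and the brightest to 255, so small intensity differences between carious and healthy regions are spread over the full range. Libraries such as OpenCV provide tuned versions of both operations (`cv2.equalizeHist` and `cv2.createCLAHE`).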
Segmentation was also performed using a U-Net
model capable of detecting