average, or sum pooling are applied to a portion of the
feature map, less significant aspects are removed
while the most crucial ones are retained. This
reduction in dimensionality also helps to reduce the
number of parameters and processing load in the
subsequent layers of the network.
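As a minimal sketch of the pooling operation described above (the 4x4 feature map is illustrative, not taken from the paper), 2x2 max pooling halves each spatial dimension while keeping the strongest activation in every window:

```python
import numpy as np

def max_pool_2x2(fmap):
    """Downsample a feature map by taking the max over non-overlapping 2x2 windows."""
    h, w = fmap.shape
    # Split into 2x2 blocks, then reduce each block to its maximum value.
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 3, 2],
                 [2, 0, 1, 4]], dtype=float)

pooled = max_pool_2x2(fmap)  # -> [[4, 5], [2, 4]]
```

Swapping `.max(...)` for `.mean(...)` or `.sum(...)` yields average or sum pooling over the same windows.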
A CNN's fully connected layer is essential in its final stages. After the convolutional and pooling layers have processed the input, the fully connected layers aggregate and synthesize the extracted features in order to carry out tasks such as classification or regression. Complex decision-making is possible in these layers because every neuron is connected to all of the neurons in the preceding layer. In the output layer of classification tasks, activation functions such as softmax are frequently employed because they transform the network's raw output scores into probabilities that sum to one, signifying the likelihood of each class. Backpropagation, a crucial technique for training neural networks, is used to adjust the weights and biases of these layers based on the difference between the expected and actual outputs.
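As a minimal sketch of the softmax transformation described above (the logit values are illustrative, not outputs of the paper's model):

```python
import numpy as np

def softmax(scores):
    """Convert raw output scores (logits) into probabilities that sum to one."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

logits = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores for 3 classes
probs = softmax(logits)
# probs sums to 1; the first class gets the highest probability
# because it has the largest logit.
```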
In summary, the convolution operation in CNNs
captures local patterns within the input, ReLU
introduces non-linearity for complex function
approximation, pooling reduces the spatial
dimensions and computational requirements, and
fully connected layers combine the extracted features for high-level tasks such as classification. Together, these
operations form the backbone of CNNs, enabling
them to effectively process and learn from image data.
2.2 LeNet Architecture
One of the first CNNs to be widely used was LeNet,
which opened the door for other studies on CNNs and
multilayer perceptrons (LeCun, 1998). Yann LeCun's
groundbreaking creation, LeNet-5, is the outcome of several fruitful iterations since 1988. LeNet was created mainly for character-recognition applications, such as reading handwritten digits and postal (ZIP) codes. Since then, the Modified National Institute of Standards and Technology (MNIST) dataset has been established and used as a standard benchmark against which newly proposed neural network designs have their accuracy evaluated.
3 EXPERIMENT AND RESULTS
3.1 Dataset
According to Ayush, while it has a different syntax
from English, American Sign Language (ASL) is a
complete natural language with many of the same
linguistic properties as spoken language (Ayush,
2019). Hand and facial motions are used in ASL
communication. It is the primary language of many
deaf and hard of hearing North Americans, in addition
to being used by many hearing people as well. The dataset's format closely resembles that
of traditional MNIST. Each training and test case carries a label (0-25) in a one-to-one mapping with the letters A through Z (labels 9=J and 25=Z never occur, because those letters require gestural motion). The training data (27,455 examples) and test data (7,172 cases) amount to roughly half the size of standard MNIST but otherwise follow the same format. The header row designates the columns pixel1, pixel2, ..., pixel784, with each case representing a 28x28 pixel image with grayscale values between 0 and 255. The raw
gesture image data represents several people
repeating gestures in different conditions. The sign-language MNIST data originates from a substantial expansion of a small batch of color photographs (1,704) that were not cropped around the hand region of interest.
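As a sketch of the format described above, a single parsed CSV row (synthetic values here, not real dataset entries) can be unpacked into its label and 28x28 grayscale image:

```python
import numpy as np

# Hypothetical row layout: label, pixel1, ..., pixel784 (values 0-255)
rng = np.random.default_rng(0)
row = rng.integers(0, 256, size=785)       # stand-in for one parsed CSV row
label, pixels = int(row[0]), row[1:]

# Recover the 28x28 grayscale image from the flat pixel columns.
image = pixels.reshape(28, 28).astype(np.uint8)
```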
This work augments the data by resizing, converting to grayscale, cropping to the hand region, and then generating more than 50 variants of each image to enlarge the dataset.
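A minimal NumPy-only sketch of such a preprocessing pipeline (the crop coordinates, image sizes, and nearest-neighbour resize are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an HxWx3 color image to grayscale with standard luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def crop(img, top, left, size):
    """Crop a square region (e.g. around the detected hand) from a 2D image."""
    return img[top:top + size, left:left + size]

def resize_nearest(img, out=28):
    """Nearest-neighbour resize to out x out (a stand-in for proper interpolation)."""
    h, w = img.shape
    rows = np.arange(out) * h // out
    cols = np.arange(out) * w // out
    return img[rows][:, cols]

rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(64, 64, 3)).astype(float)  # synthetic color photo
sample = resize_nearest(crop(to_grayscale(photo), top=8, left=8, size=48))
# sample is now a 28x28 grayscale image, matching the dataset's format
```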
The idea of brightness adjustment is to change the
amount of light in a picture. This is achieved by
altering the pixel values, which range from 0 (black)
to 255 (white). By applying a luminance factor to
these values, the image can be made brighter or
darker. Increasing the factor lightens the image, while
decreasing it darkens it. This process is crucial for
enhancing image recognition in machine learning
models, as it can significantly impact the performance
of algorithms like CNNs.
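A minimal sketch of the brightness adjustment described above (the factor values and sample pixels are illustrative):

```python
import numpy as np

def adjust_brightness(img, factor):
    """Scale pixel intensities by a brightness factor, clipping to the 0-255 range."""
    return np.clip(img.astype(float) * factor, 0, 255).astype(np.uint8)

img = np.array([[100, 200],
                [0, 255]], dtype=np.uint8)

brighter = adjust_brightness(img, 1.5)  # factor > 1 lightens; 200 clips to 255
darker = adjust_brightness(img, 0.5)    # factor < 1 darkens
```

Clipping is essential: without it, scaled values above 255 would wrap around when cast back to 8-bit integers, corrupting the image.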
3.2 Performance Comparison
In this work, as shown in Table 1, the author obtains the following results by modifying the brightness of the dataset.
From these results it can be observed that the model's recognition accuracy improves as the brightness increases toward a moderate level, whereas at markedly higher and lower brightness levels recognition degrades considerably. Since the method value