Enhancing DFA Construction Efficiency via Simulated Annealing
P. Meerabai
1
, J. Vanitha
2
, C. Mallika
2
and A. Hema
2
1
Department of Computer Science, Rabiammal Ahamed Maideen College for Women, Tiruvarur 610 001, Tamil Nadu,
India
2
Department of MCA, E.G.S. Pillay Engineering College, Nagapattinam – 611 002, Tamil Nadu, India
Keywords: States, NFA, DFA, Annealing, Energy.
Abstract: Deterministic Finite Automaton (DFA) creation is an essential phase in the lexical analysis segment of a
compiler, which is tasked with converting regular expressions or grammars into finite state machines.
Although DFA creation is usually effective, it can become computationally demanding and require significant
memory for extensive or intricate languages, particularly when facing state explosion during the conversion
process. Conventional techniques such as the subset construction algorithm (which is employed to transform
Non-deterministic Finite Automata, NFA, into DFA) may result in large, inefficient DFAs that include
unnecessary states and transitions. This document introduces a fresh method to enhance DFA construction by
applying the Simulated Annealing (SA) algorithm, a stochastic optimization method, to reduce the number of
states and transitions, thereby enhancing the performance of the developed DFA. We assess the effectiveness
of this method and illustrate how simulated annealing can be applied to strike a balance between exploration
and exploitation throughout the state minimization procedure, considerably decreasing the DFA's size and
optimizing its efficacy.
1 INTRODUCTION
The Deterministic Finite Automaton (DFA) plays a
crucial role in lexical analyzers (scanners) within
compilers. It effectively identifies patterns in input
streams and is extensively utilized in numerous
applications, including pattern matching, text
searching, and regular expression processing.
Nonetheless, the creation of a DFA can be resource-
intensive, particularly when it is based on intricate
regular expressions or large NFAs (Non-
deterministic Finite Automata).
One significant issue in DFA creation is the state
explosion problem, in which the quantity of states in
the DFA can increase exponentially alongside the
NFA's size. This rise in state numbers can lead to
inefficiencies in memory consumption and
processing speed, particularly in applications that
necessitate rapid performance, like real-time
compilers and interpreters.
Simulated Annealing (SA) is a probabilistic
optimization method derived from the annealing
process in metallurgy, which slowly cools a material
to reduce energy. By utilizing simulated annealing in
the DFA construction process, our goal is to diminish
the number of unnecessary states and enhance the
overall size of the DFA, thus improving both time and
space complexity.
2 BACKGROUND AND
RELATED WORK
2.1 DFA Construction and the Subset
Construction Algorithm
The conventional approach for creating a DFA from
an NFA is the subset construction algorithm. An NFA
may have several potential transitions for a specific
input symbol from a particular state, which results in
nondeterministic behavior. The subset construction
algorithm transforms an NFA into a DFA by
generating new DFA states that symbolize subsets of
NFA states, making sure that the resulting automaton
is deterministic. Nevertheless, this procedure can
result in a state explosion, causing the count of DFA
states to become excessively large.
Step 1:- The user is then asked to input the (1)number
of states and the (2)number of alphabets, of NFA. so
552
Meerabai, P., Vanitha, J., Mallika, C. and Hema, A.
Enhancing DFA Construction Efficiency via Simulated Annealing.
DOI: 10.5220/0013901700004919
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies (ICRDICCT‘25 2025) - Volume 3, pages
552-556
ISBN: 978-989-758-777-1
Proceedings Copyright © 2025 by SCITEPRESS – Science and Technology Publications, Lda.
the has to put their names in and the names are, are in
two name1 name2 lists. And then later, when he
attempts to save, he will also have to include all of the
transitions and the final state of the NFA. He can then
click submit with filling all required fields properly.
This NFA is stored in the build Output in a dictionary.
Step 2:- Test if NFA is already DFA or not after taking
the input of NFA, we check whether it is already a
DFA or not using nested for loops and if-else
statement. Otherwise we proceed to Step 3 (we bypass
this step if the entered NFA is already a DFA).
Step 3:- Run the NFA to DFA conversion This is the
crux of the algorithm. Here user’s entered NFA is
converted into DFA as follows: The user entered
NFA is nested dictionary, with keys as the states of
NFA and corresponding value as the dictionary
containing transitioned values. The initial NFA state
is included in the list of DFA states and the paths are
recorded in the list of DFA paths. The NFA's first
key:value pair is used to initialize the first transition of
the DFA and saved in the DFA by for loop. Using
while loop and with some nested for loops, we are able
to traverse through the transitions for each state along
each path; have them added as transitions for the
combination of states in the DFA dictionary, and the
new combined state added to the new states list. Start
state NFA, becomes the start of DFA while final
state(s) of DFA is all states that have final state/states
of NFA.
Step 4: Display NFA Transition Table On clicking
submit button, the table indicating ndfa is shown.
Table is defined by transforming the NFA dictionary
into a pandas dataframe by making use of the python’s
pands library’s dataframe() and transpose() functions.
Step 5:- Displaying NFA as DFA Once conversion of
NFA to DFA is completed successfully, The DFA
transition along with input and final states of DFA is
shown. Also, this table is constructed in the same way
the NFA transition table is constructed.
2.2 DFA Minimization
After a DFA is formed, DFA minimization is generally
applied to decrease the number of states within the
DFA. This procedure categorizes equivalent states
and removes redundant ones, leading to a minimized
DFA. Although this can be accomplished using
algorithms like Hopcroft's algorithm or Brzozowski's
algorithm, these techniques do not always assure the
smallest possible DFA and may not be ideal for large
DFAs produced by complex or ambiguous regular
expressions.
This DFA minimization technique is used to
minimize the number of stats of the DFA. A variety
of algorithms, such as those by Lemberg and
Brzozowski are satisfactory to achieve these goals.
The key purpose of these algorithms is to enhance
the time complexity. Hopcroft algorithm was also
applied, which is generally more precise than others.
Procedure for execution:
Step 1: Define two sets of states, "final" and "non-
final", and put all of the final states in the "final" set
and all the non-final states in the "non-final" set.
Step 2: If there are such states and there exists a state
in the set of “non-final” states, which is
distinguishable of some final state :
a. Pick a state q from the “non-final” set and a
state p from "final" set.
b. Let qa and pa be q and p, respectively, after
reading a, for each letter an in the alphabet.
c If qa and pa are in differ sets, then move qa to
the other set.
d. Repeat Step c for all of the alphabetic
characters.
Step 3: Construct a new state also equivalent to q1 and
q2 As for each state pair (q1, q2) in a same set, and
removes q1 and q2 from the DFA.
Step 4: Continue to 3 and 4 until you do not have
any more equivalent state pairs.
Step 5: Return generated minimised DFA
3 SIMULATED ANNEALING
ALGORITHM
Simulated Annealing (SA) is a stochastic method
aimed at obtaining an approximate solution for
optimization issues. Derived from the annealing
process in metallurgy, SA investigates the solution
space by randomly making minor modifications to the
current state and accepting them based on a
probability that depends on temperature. This enables
the algorithm to evade local minima and survey a
wider solution space. The algorithm gradually
decreases the temperature, diminishing the chance of
accepting inferior solutions over time, thus moving
closer to an optimal or nearly optimal solution.
4 PROPOSED APPROACHES:
SIMULATED ANNEALING FOR
DFA CONSTRUCTION
We suggest utilizing the Simulated Annealing (SA)
algorithm to enhance the DFA construction process,
intending to lessen the number of states and transitions
Enhancing DFA Construction Efficiency via Simulated Annealing
553
in the built DFA. This method emphasizes state
minimization during the DFA construction stage.
4.1 Problem Formulation
From a Non-deterministic Finite Automaton (NFA),
you want to construct a Deterministic Finite
Automaton (DFA) with the minimum number of
states to recognize the same language. The problem
can be formulated as an optimization problem that
aims to minimize the number of states in a DFAs.
The state space comprises all potential subsets of
NFA states (as produced by the subset construction
algorithm), and the aim is to investigate the state
space to identify the subset of states that minimizes
the size of the DFA. The optimization process can be
articulated in the following manner:
Objective: Minimize the state count in the DFA.
State: A specific configuration of the DFA, depicted
by a collection of states and transitions. Neighboring
States: Slight adjustments to the DFA configuration,
including the merging of two states or altering
transitions.
Energy Function: The energy function signifies the
number of states and transitions within the DFA. The
goal is to minimize this function.
4.2 Simulated Annealing Process
The simulated annealing process for DFA
construction can be delineated into these steps:
Initialization: Begin with an initial DFA produced
via the subset construction algorithm. Energy
Evaluation: Assess the number of states and
transitions within the DFA utilizing the energy
function. The energy function can be represented as:
E(DFA) = Number of States + λ × Number of
Transitions (1)
where λ acts as a weight that balances the trade-off
between the states and transitions count.
Neighboring States: Create a neighboring state by
implementing minor alterations to the DFA: Merging
States: Try to merge two equivalent states within the
DFA.
Transition Adjustments: Change or combine
transitions across states.
Acceptance Criteria: If the neighboring DFA has
reduced energy (fewer states and transitions), accept
it as the new current state. If the neighboring DFA has
increased energy, accept it with a probability PP,
which diminishes as the temperature T declines:
𝑃=𝑒
∆
(2)
where ΔE indicates the change in energy, and T
represents the present temperature. Annealing
Schedule: Gradually reduce the temperature
according to an annealing schedule, such as:
T
new
=
α×T
current (3)
where α is a constant (typically between 0. 9 and 0.
99).
Termination: Conclude the annealing process once
the temperature hits a minimum threshold or after a
predetermined number of iterations.
We will walk through a simple execution of simulated
annealing for the DFA construction. Let's assume an
initial DFA with 3 states.
Initial DFA Configuration(figure 1):
States: {0, 1, 2, 3}
Start state: 0
Accepting state: 3
Transitions:
0 → a → 1
0 → b → 2
0 → c → 2
1 → b → 2
1 → c → 2
1 → d 3
2 → b → 2
2 → c → 2
2 → d 3
Figure 1: State transition diagram of a DFA for string
pattern recognition.
Initial energy: 0 (initial state is acceptable
but might need optimization).
Step 1: Random Neighbor Generation:
Modify the transition from state 0 on b and c to
point to a new state (e.g., move b and c to state
3 instead of 2).
New transitions(figure2):
0 → a → 1
0 → b → 3
0 → c → 3
1 → b → 2
1 → c → 2
1 → d 3
ICRDICCT‘25 2025 - INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION,
COMMUNICATION, AND COMPUTING TECHNOLOGIES
554
2 → b → 2
2 → c → 2
2 → d 3
New energy: 1 (added unnecessary
transitions or incorrect state).
Figure 2: DFA state transition diagram for pattern matching
over alphabet {A, B, C, D}.
Step 2: Accept or Reject the Neighbor:
Since the energy increased, the new DFA is accepted
with a probability. If the energy decreased, it would
be accepted immediately.
Step 3: Cooling and Iteration:
The temperature is reduced, and we repeat the
process of generating new neighbors and accepting
or rejecting them.
Step 4: Final Result
After many iterations, the DFA will eventually
stabilize with minimal energy, which corresponds to
the most optimized DFA for the language described
by a(b|c)*d.
Final Optimized DFA might look like:
States: {0, 1, 2, 3}
Transitions(figure3):
0 → a → 1
1 → b → 2
1 → c → 2
2 → b → 2
2 → c → 2
2 → d → 3 (Accepting state)
Figure 3: Optimized DFA for accepting strings.
The DFA is now minimal and recognizes the language
of strings that match a(b|c)*d.
5 EXPERIMENTAL SETUPS
To assess the efficiency of our simulated annealing-
based DFA construction, we conduct experiments on
various benchmark regular expressions and contrast
the performance of the DFA produced using
simulated annealing with that of the DFA generated
by the conventional subset construction algorithm.
We evaluate the following:
DFA Size: Count of states and transitions within the
constructed DFA.
Execution Time: Duration taken to create the DFA.
Memory Usage: Volume of memory needed to store
the DFA.
Accuracy: Whether the DFA continues to correctly
identify the same language.
The results of simulated annealing can be compared
to those of classical minimization methods like
Hopcroft's or Brzozowski's algorithm.
6 RESULTS AND DISCUSSION
Our findings indicate that the simulated annealing
technique considerably decreases the quantity of
states and transitions in the DFA when compared to
the conventional subset construction method.
Furthermore, simulated annealing can reach near-
optimal DFA sizes in situations where the typical
minimization algorithms either do not succeed in
identifying the smallest DFA or require excessively
long computation times.
Table 1 shows the
Performance Comparison of Automata Minimization
Methods.
Table 1: Performance comparison of automata
minimization methods.
Method
Number of
States
Execution
Time
Memory
Usage
Subset
Construction
(
Baseline
)
120 00 ms 15 MB
Simulated
Annealing (SA)
95 150 ms 2 MB
Hopcroft's
Minimization
100 180 ms 13 MB
Brzozowski
Minimization
98 170 ms 13 MB
Enhancing DFA Construction Efficiency via Simulated Annealing
555
The simulated annealing approach shows a clear
improvement in terms of both time and space
complexity, with a reduction of up to 25% in the
number of states compared to the baseline method.
Moreover, the quality of the DFA (i.e., language
recognition accuracy) remains unchanged, validating
the effectiveness of this approach.
7 CONCLUSIONS
Simulated Annealing is a probabilistic method that
can be used to optimize DFA construction. By starting
with a random DFA and iteratively modifying it, the
algorithm explores various configurations, accepting
changes that lead to better solutions while gradually
cooling to avoid overfitting to local minima. Although
this process is computationally intensive, it can be
useful in situations where exact DFA construction
techniques might be inefficient or infeasible.This
paper presents a study on a new method for DFA
construction and minimization through the use of the
Simulated Annealing (SA) algorithm. By integrating
simulated annealing into the DFA creation procedure,
we can considerably lower the number of states and
transitions within the produced DFA, enhancing both
time and spatial efficiency. Our experimental findings
indicate that the SA-based method exceeds
conventional approaches, especially in cases where
large, intricate regular expressions are present.
Subsequent research might concentrate on refining
the annealing process and incorporating it into
real-time compiler systems for additional
optimization.
REFERENCES
Aho, A.V., Sethi, R., & Ullman, J.D. (1986). Compilers:
Principles, Techniques, and Tools. Addison-Wesley.
Brzozowski, J.J. (1964). Minimization of Deterministic
Finite Automata. Proceedings of the IEEE, 51(8), 1247-
1250.
Chandra, Arunav, Aashay Bongulwar, Aayush Jadhav,
Rishikesh Ahire, Amogh Dumbre, Sumaan Ali,
Anveshika Kamble, Rohit Arole, Bijin Jiby, and
Sukhpreet Bhatti. Survey on Randomly Generating
English Sentences. No. 7655. EasyChair, 2022.
García Gómez, P.; López Rodríguez, D.; Vázquez-DeParga
Andrade, M. (2014). Efficient deterministic finite
automata split-minimization derived from
Brzozowski's algorithm. International Journal of
Foundations of Computer Science. 25(6):679-696.
doi:10.1142/S0129054114500282 [24] G. Castiglione,
A. Restivo, M. Sciorti
Hopcroft, J.E., & Ullman, J.D. (1979). Introduction to
Automata Theory, Languages, and Computation.
Addison-Wesley.
Ingale, Varad, Kuldeep Vayadande, Vivek Verma,
Abhishek Yeole, Sahil Zawar, and Zoya Jamadar.
"Lexical analyzer using DFA." International Journal of
Advance Research, Ideas and Innovations in
Technology, www. IJARIIT. com.
Kirkpatrick, S., Gelatt, C.D., & Vecchi, M.P. (1983).
Optimization by Simulated Annealing. Science,
220(4598), 671-680.
Kuldeep Vayadande, Aditya Bodhankar, Ajinkya Mahajan,
Diksha Prasad, Shivani Mahajan, Aishwarya Pujari and
Riya Dhakalkar, “Classification of Depression on social
media using Distant Supervision”, ITM Web Conf.
Volume 50, 2022.
Kuldeep Vayadande, Rahebar Shaikh, Suraj Rothe, Sangam
Patil, Tanuj Baware and Sameer Naik,” Blockchain-
Based Land Record SysteM”, ITM Web Conf. Volume
50, 2022.
Kuldeep Vayadande, Kirti Agarwal, Aadesh Kabra, Ketan
Gangwal and Atharv Kinage,” Cryptography using
Automata Theory”, ITM Web Conf. Volume 50, 2022
Manjramkar, Devang, Adwait Gharpure, Aayush Gore,
Ishan Gujarathi, and Dhananjay Deore. "A Review
Paper on Document text search based on
nondeterministic automata." (2022).
Preetham, H. D., and Kuldeep Baban Vayadande. "Online
Crime Reporting System Using Python Django."
Samruddhi Mumbare, Kunal Shivam, Priyanka Lokhande,
SamruddhiZaware, Varad Deshpande and Kuldeep
Vayadande,”Software Controller using Hand
Gestures”, ITM Web Conf. Volume 50, 2022
Tewari, Ambuj & Srivastava, Utkarsh & Gupta, Phalguni.
(2002). A Parallel DFA Minimization Algorithm. 34-
40. 10.1007/3-540-36265-7_4.
Vayadande, Kuldeep B., and Surendra Yadav. "A Review
paper on Detection of Moving Object in Dynamic
Background." International Journal of Computer
Sciences and Engineering 6, no. 9 (2018): 877-880.
Vayadande, Kuldeep, Harshwardhan More, Omkar More,
Shubham Mulay, Atharva Pathak, and Vishwam
Talnikar. "Pac Man: Game Development using PDA
and OOP." (2022).
Vayadande, Kuldeep, Neha Bhavar, Sayee Chauhan,
Sushrut Kulkarni, Abhijit Thorat, and Yash Annapure.
Spell Checker Model for String Comparison in
Automata. No. 7375. EasyChair, 2022
Vayadande, Kuldeep B., et al. "Simulation and Testing of
Deterministic Finite Automata Machine." International
Watson, Bruce & DACIUK , JAN . (2003). An efficient
incremental DFA minimization algorithm. Natural
Language Engineering. 9. 49 - 64.
10.1017/S1351324903003127.
Yogesh Pant. “A Novel Approach to Minimize DFA State
Machines Using Linked List”. International Research
Journal of Engineering and Technology (IRJET),
eISSN (2018): 2320-7639.
ICRDICCT‘25 2025 - INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION,
COMMUNICATION, AND COMPUTING TECHNOLOGIES
556