Multi-entry DFA with reduced initial states to speedup parallel recognition

Luca Breveglieri; Stefano Crespi Reghizzi; Angelo Morzenti
2025-01-01

Abstract

Parallel algorithms for regular language recognition have been widely experimented with for various types of finite automata (FA), deterministic (DFA) and nondeterministic (NFA), often derived from regular expressions (RE). Such algorithms cut the input string into finitely many sections, or chunks, recognize each one independently and in parallel by means of identical FAs, and finally join the results to check overall consistency. Each FA, except the first, must speculatively start a run from every state, treating it as a possible initial state, though most runs abort or are rejected when the computations of adjacent chunks are joined. The ensuing overhead may nullify the gain over a serial algorithm. Existing parallel DFA-based recognizers suffer from the excessive number of starting states, while NFA-based ones suffer from the number of nondeterministic transitions. Our algorithm is based on a new type of multi-entry DFA, called reduced interface DFA (RI-DFA), which can reduce speculation by cutting the number of initial states down to the size of an equivalent NFA. The algorithm thus combines the smaller size of an NFA with the speed of deterministic transitions. We have also developed a new initial-state reduction method for RI-DFAs. A quantitative comparison of the number of starting states of RI-DFA vs. DFA on a large public benchmark of complex FAs confirms the effectiveness of our approach. Speed measurements of text recognition on a multi-core computer show that the RI-DFA recognizer matches the DFA-based one on some benchmarks and is much faster on others. The moderate extra time needed to convert an NFA into an RI-DFA does not hinder practical use.
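For illustration only, the following Python sketch shows the generic speculative chunk-parallel recognition scheme that the abstract describes and that the RI-DFA is designed to improve: every chunk after the first is run from all DFA states, and the per-chunk state maps are composed in a join phase. All names here (DFA, run_chunk, recognize_parallel) are hypothetical and not taken from the paper; the RI-DFA construction itself is not reproduced.

```python
# Minimal sketch of speculative chunk-parallel DFA recognition (illustrative only).
from concurrent.futures import ThreadPoolExecutor

class DFA:
    def __init__(self, states, delta, initial, finals):
        self.states = states    # iterable of states
        self.delta = delta      # dict: (state, symbol) -> state
        self.initial = initial  # single initial state
        self.finals = finals    # set of accepting states

def run_chunk(dfa, chunk, start_states):
    """Run the DFA on one chunk from every candidate start state.
    Returns a map start_state -> end_state; blocked runs are dropped."""
    result = {}
    for q0 in start_states:
        q = q0
        for a in chunk:
            q = dfa.delta.get((q, a))
            if q is None:       # run aborts on an undefined transition
                break
        if q is not None:
            result[q0] = q
    return result

def recognize_parallel(dfa, text, n_chunks):
    """Split text into chunks, process them in parallel, then join.
    The first chunk starts only from dfa.initial; the others speculatively
    start from every state (this is the overhead the paper aims to reduce)."""
    if not text:
        return dfa.initial in dfa.finals
    size = max(1, len(text) // n_chunks)
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    starts = [{dfa.initial}] + [set(dfa.states)] * (len(chunks) - 1)
    with ThreadPoolExecutor() as pool:
        maps = list(pool.map(lambda args: run_chunk(dfa, *args),
                             zip(chunks, starts)))
    # Join phase: compose the per-chunk maps left to right.
    q = maps[0].get(dfa.initial)
    for m in maps[1:]:
        if q is None:
            return False
        q = m.get(q)
    return q in dfa.finals

# Tiny usage example: binary strings with an even number of 1s.
even_ones = DFA(states={0, 1},
                delta={(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0},
                initial=0, finals={0})
print(recognize_parallel(even_ones, "0110101101", n_chunks=4))  # True
```

In this naive scheme the number of speculative start states per chunk equals the DFA size; according to the abstract, the paper's contribution is to shrink that set down to the size of an equivalent NFA while keeping deterministic transitions.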
Year: 2025
Published in: Implementation and Application of Automata
ISBN: 978-3-032-02601-9; 978-3-032-02602-6
Keywords: regular language, parallel recognition, finite automata, multi-entry DFA, speculation overhead, multi-core computer architecture
Files in this product:
main-CIAA25-final.pdf: Post-Print (DRAFT or Author's Accepted Manuscript, AAM), Adobe PDF, 257.5 kB, restricted access
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1297506
Citations
  • Scopus: 0