RE.PUBLIC@POLIMI pubblicazioni di ricerca del Politecnico di Milano

The context-free grammars extended with regular expressions (RE), known as ECF or EBNF grammars, are commonly used as they often allow for terser language definitions. Yet for such grammars the notion of syntax tree, and consequently of ambiguity, lacks an agreed definition. The simplified tree structures returned by the existing parsing algorithms do not faithfully represent all the ways a sentence is derivable by means of the REs present in the grammar rules. We contribute a precise definition of the regular parts in the structures of the EBNF syntax trees, which is aligned with the tree representations that have been adopted by the recent algorithms for RE matching. For an EBNF rule, the finest representation shows all the RE operators as nodes in the sub-tree, while the flat representation simply appends the string generated by the RE to the nonterminal node. A consequent notion of representation-dependent ambiguity follows. The above representations are incorporated into a Tomita-style parsing algorithm, i.e., a GLR(1) parser. To construct such a parser, we follow the positional method of the Berry-Sethi parsers for regular languages. Given an EBNF grammar, our parser-generator produces a parser compliant with the choices expressed by the user about the representation of the syntax trees. We also report the parsing performance, over a few representative sets of languages (benchmarks) for programming and data representation, in comparison with existing parsers for EBNF grammars.

2022 Best Paper Award Runner Up - Journal of Computer Languages

Angelo Borsotti;Luca Breveglieri;Stefano Crespi Reghizzi;Angelo Morzenti

2023-01-01

Abstract

The context-free grammars extended with regular expressions (RE), known as ECF or EBNF grammars, are commonly used as they often allow for terser language definitions. Yet for such grammars the notion of syntax tree, and consequently of ambiguity, lacks an agreed definition. The simplified tree structures returned by the existing parsing algorithms do not faithfully represent all the ways a sentence is derivable by means of the REs present in the grammar rules. We contribute a precise definition of the regular parts in the structures of the EBNF syntax trees, which is aligned with the tree representations that have been adopted by the recent algorithms for RE matching. For an EBNF rule, the finest representation shows all the RE operators as nodes in the sub-tree, while the flat representation simply appends the string generated by the RE to the nonterminal node. A consequent notion of representation-dependent ambiguity follows. The above representations are incorporated into a Tomita-style parsing algorithm, i.e., a GLR(1) parser. To construct such a parser, we follow the positional method of the Berry-Sethi parsers for regular languages. Given an EBNF grammar, our parser-generator produces a parser compliant with the choices expressed by the user about the representation of the syntax trees. We also report the parsing performance, over a few representative sets of languages (benchmarks) for programming and data representation, in comparison with existing parsers for EBNF grammars.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2023
			
	Parole chiave
	
				syntax tree in EBNF grammar, parse tree, GLR parsing, bottom-up parsing, Berry-Sethi algorithm, positional method for regular expression, regular expression parsing, ambiguous regular expression
			
	Appare nelle tipologie:
	
				07.12 Altro

File in questo prodotto:

File	Dimensione	Formato
COLA_Best Paper 2022 Runner Up Certificate_Borsotti et al.pdf accesso aperto : Altro materiale allegato Dimensione 281.36 kB Formato Adobe PDF Visualizza/Apri	281.36 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1260889

Citazioni

ND

ND

ND

social impact