We explore the practical feasibility of a system based on genetic programming (GP) for the automatic generation of regular expressions. The user describes the desired task by providing a set of labeled examples in the form of text lines. The system uses these examples to drive the evolutionary search towards a regular expression suitable for the specified task. Using the system should require familiarity with neither GP nor regular expression syntax. In our GP implementation, each individual represents a syntactically correct regular expression. We performed an experimental evaluation on two different extraction tasks applied to real-world datasets and obtained promising results in terms of precision and recall, even in comparison to an earlier state-of-the-art proposal.
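As a rough illustration of how labeled text lines could be used to score a candidate regular expression by precision and recall, consider the minimal Python sketch below. It is not the authors' fitness function, which the abstract does not specify. The function name `precision_recall`, the example lines, and the convention that a wrong extraction counts as both a false positive and a false negative are all assumptions made for illustration.

```python
import re

def precision_recall(pattern, examples):
    """Score a candidate regex on labeled lines.

    `examples` is a list of (line, expected) pairs, where `expected` is the
    substring that should be extracted, or None if nothing should be.
    Assumed convention: a wrong extraction is both a false positive and a
    false negative.
    """
    regex = re.compile(pattern)
    tp = fp = fn = 0
    for line, expected in examples:
        match = regex.search(line)
        extracted = match.group(0) if match else None
        if extracted is not None and extracted == expected:
            tp += 1
        else:
            if extracted is not None:
                fp += 1  # extracted something wrong or unwanted
            if expected is not None:
                fn += 1  # missed the desired extraction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labeled examples: each line paired with its desired extraction.
EXAMPLES = [
    ("call 555-1234 now", "555-1234"),
    ("no number here", None),
    ("fax: 555-9876", "555-9876"),
    ("room 12", None),
]

# Candidates such as an evolutionary search might compare.
for candidate in (r"\d+", r"\d+-?\d*", r"\d{3}-\d{4}"):
    print(candidate, precision_recall(candidate, EXAMPLES))
```

In a GP setting, scores like these could serve as (part of) the fitness that ranks individuals, so that candidates extracting exactly the labeled substrings are preferred over those that match too much or too little.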
Automatic Generation of Regular Expressions from Examples with Genetic Programming / Bartoli, Alberto; Davanzo, Giorgio; De Lorenzo, Andrea; Mauri, Marco; Medvet, Eric; Sorio, Enrico. - Print. - (2012), pp. 1477-1478. (Genetic and Evolutionary Computation Conference (GECCO), Philadelphia, US, July 2012) [10.1145/2330784.2331000].
Automatic Generation of Regular Expressions from Examples with Genetic Programming
Bartoli, Alberto; Davanzo, Giorgio; De Lorenzo, Andrea; Mauri, Marco; Medvet, Eric; Sorio, Enrico
2012-01-01


