On the Automatic Construction of Regular Expressions from Examples (GP vs. Humans 1-0)

Bartoli, Alberto; De Lorenzo, Andrea; Medvet, Eric; Tarlao, Fabiano
2016-01-01

Abstract

Regular expressions are used routinely across many application domains. Writing a regular expression for a specific task is usually quite difficult and requires significant technical skill and creativity. We have developed a tool based on Genetic Programming that automatically constructs regular expressions for text extraction from examples of the text to be extracted. We have recently demonstrated that our tool is human-competitive in terms of both the accuracy of the regular expressions and the time required to construct them. We base this claim on a large-scale experiment involving more than 1700 users on 10 text extraction tasks of realistic complexity. The F-measure of the expressions constructed by our tool was almost always higher than the average F-measure of the expressions constructed by each of the three categories of users involved in our experiment (Novice, Intermediate, Experienced). The time required by our tool was almost always shorter than the average time required by each of the three categories of users. The experiment is described in full detail in "Can a machine replace humans? A case study", IEEE Intelligent Systems, 2016.
Files in this item:

2016-GECCO-OnTheAutomaticConstructionOfRegularExpressions.pdf
Access: closed
Description: Main article
Type: Publisher's version
License: Digital Rights Management not defined
Size: 740.73 kB
Format: Adobe PDF

2016_GECCO_HotOffThePress.pdf
Access: open
Description: Main article
Type: Final post-refereeing draft (post-print)
License: Digital Rights Management not defined
Size: 108.54 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2877785