
High-Recall Information Retrieval from Linked Big Data

CUZZOCREA, Alfredo Massimiliano
2015

Abstract

In the current era of big data, high volumes of valuable information are available in collections of documents, the web, social networks, and a wide variety of linked data. To search and retrieve useful information from these linked data, users often enter queries into information retrieval (IR) systems. Among the information retrieved by these systems, some is relevant to the user queries (i.e., of interest to the users), but some is not. Moreover, some relevant information may not be retrieved by the systems. The effectiveness of these IR systems is often measured by metrics such as precision and recall. Most conventional IR systems (e.g., for web searches) aim to achieve high precision (i.e., a high percentage of the retrieved information is relevant) at the price of low recall (i.e., a low percentage of the relevant information is retrieved). However, there are real-life situations (e.g., patent searches) in which high recall is desirable. In this paper, we present two high-recall IR systems. Results of our evaluation show the effectiveness of our systems in providing high-recall IR from linked big data.
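The two metrics named in the abstract can be made concrete with a small sketch. The function name `precision_recall`, the document identifiers, and the two result lists below are hypothetical illustrations, not taken from the paper, and the code does not reproduce the paper's systems. It only computes the standard set-based definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that are retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: iterable of item ids returned by the system
    relevant:  iterable of item ids judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    # Assumed convention: an empty denominator yields 0.0.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: eight documents, four of which are relevant.
relevant = {"d1", "d2", "d3", "d4"}
narrow = ["d1", "d2"]  # short, precision-oriented result list
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]  # recall-oriented

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

In this toy case the short result list reaches full precision but misses half of the relevant documents, while the long list finds all of them at the cost of precision. That is the trade-off the abstract describes for settings such as patent searches.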
ISBN: 9781467365642
ISBN: 9781467365635

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/2871927