Addressing multiple facets of bias and uncertainty in continental scale biodiversity databases
Enrico Tordoni; Daniele Da Re; Giovanni Bacaro
2024-01-01
Abstract
The availability of biodiversity databases is expanding at unprecedented rates. Nevertheless, species occurrence data can be intrinsically biased and contain uncertainties that impact the accuracy and reliability of biodiversity estimates. In this study, we developed a reproducible framework to assess three dimensions of bias (taxonomic, spatial, and temporal) as well as the temporal uncertainty associated with data collections. We used the European vascular plant records from sPlotOpen, an open-access database, as a case study. The metrics proposed for estimating bias are the completeness of species richness for taxonomic bias, the Nearest Neighbor Index for spatial bias, and Pielou’s index for temporal bias. Additionally, we introduced a new method based on a negative exponential curve to model the temporal decay in biodiversity data, aiming to quantify temporal uncertainty. Finally, we assessed the sampling bias in relation to several spatial determinants (i.e., road density, human population count, the Natura 2000 network, and topographic roughness). We found that the facets of bias and the temporal uncertainty varied across Europe, as did the roles played by the spatial determinants in shaping biases. sPlotOpen showed a clustered distribution of the vegetation plots, and an uneven distribution in sampling completeness, year of sampling, and temporal uncertainty. The variance of the facets of bias was significantly explained mainly by the presence of the Natura 2000 network and by topographic roughness. In light of these results, we believe that employing an efficient procedure to examine biases and uncertainties in data collections can enhance data quality and provide more reliable biodiversity estimates.
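Three of the metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the code uses the textbook definitions of Pielou's evenness and of the Clark-Evans nearest neighbor ratio, and a plain negative exponential weight. The function names, the planar coordinates without edge correction, the `decay_rate` and `reference_year` parameters, and the convention of returning 0 when only one year is sampled are all assumptions made for illustration.

```python
import math
from collections import Counter


def pielou_evenness(years):
    """Pielou's J = H / ln(S) computed over sampling years.

    H is the Shannon entropy of the per-year record counts and S the number
    of distinct years. J is 1 when records are spread evenly across years.
    """
    counts = Counter(years)
    n = sum(counts.values())
    s = len(counts)
    if s < 2:
        # J is undefined for a single year; returning 0 is an assumed convention.
        return 0.0
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(s)


def nearest_neighbor_index(points, area):
    """Clark-Evans ratio of observed to expected mean nearest-neighbor distance.

    Assumes planar (projected) coordinates and applies no edge correction.
    Values below 1 suggest clustering, values above 1 suggest dispersion.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n
    # Expected mean distance under complete spatial randomness.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


def temporal_weight(year, reference_year, decay_rate):
    """Negative exponential weight exp(-decay_rate * age) for a record.

    decay_rate and reference_year are hypothetical parameters, not values
    taken from the study.
    """
    age = max(reference_year - year, 0)
    return math.exp(-decay_rate * age)
```

For example, `nearest_neighbor_index` applied to plots packed into one corner of a large study area returns a value well below 1, which matches the clustered pattern the abstract reports for sPlotOpen.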
File | Size | Format
---|---|---
6_BI_18_1_2024_Rocchini+9.30.24.pdf (open access; description: article; type: publisher's version; license: Creative Commons) | 3.46 MB | Adobe PDF