Occurrence Cubes: A new way of aggregating heterogeneous species occurrence data

Damiano Oldoni, Quentin Groom, Tim Adriaens, Jasmijn Hillaert, Amy J. S. Davis, Lien Reyserhove, Diederik Strubbe, Sonia Vanderhoeven, Peter Desmet

Research output: Contribution to journal; A2: Article in a peer-reviewed journal not included in A1; peer-reviewed



The digital era has brought about an impressive increase in the volume of published species occurrence data. Research infrastructures such as the Global Biodiversity Information Facility (GBIF), the digitization of legacy data, and the use of mobile applications have all played a role in this transition. More data unavoidably implies more heterogeneity at multiple levels, a result of the different methods and standards used to collect data. Data standardization and aggregation help to reduce this heterogeneity. Furthermore, intermediate data products that can be used for activities such as mapping, modeling and monitoring improve the repeatability and reproducibility of biodiversity research (Kissling et al. 2017).

Occurrences can be defined as events in a three-dimensional space whose dimensions are taxonomic (what), temporal (when) and spatial (where). They are then aggregated into what we call an occurrence cube (Fig. 1).

The taxonomic dimension is categorical. Research infrastructures like GBIF use a taxonomic backbone, which makes data aggregation at species level or higher rank relatively easy. The temporal dimension is a continuum, and the temporal uncertainty is usually lower than the aggregation span, typically a year. Regarding the spatial dimension, occurrences are typically filtered to remove those whose uncertainty is too large to fit the grid scheme being used, meaning that the spatial uncertainty is largely unused. We developed a method that takes this spatial uncertainty into account while aggregating data. In particular, we state that an occurrence is spatially representable as a closed plane figure such as a circle, hexagon or square, never as its geometric centre (centroid). For GBIF occurrence data, coordinateUncertaintyInMeters is defined as the radius of the smallest circle containing the whole of the location (see Darwin Core standard). So, spatially speaking, we refer to occurrences as circles, even though the method described below is general.

After harvesting the occurrence data and assessing data quality (e.g. removing occurrences without coordinates or with suspicious coordinates), we can assign occurrences to a reference grid such as the European reference grid of the European Environment Agency (EEA) at 1 km scale. In this spatial aggregation we randomly choose a point within the occurrence circle and assign the occurrence to the grid cell that contains that point. We can aggregate further by time (e.g. by year) and taxonomy (e.g. by species), where aggregating means counting how many occurrences fall in each taxonomic-spatial-temporal unit.

The analogy with geometry goes further: the occurrence cube can, like any cube, be projected onto an orthogonal plane by aggregating along one of the three dimensions. In particular, projecting the cube onto the taxonomic and temporal dimensions can be done by adding up the number of occurrences, or by counting the number of occupied cells, thus estimating the area of occupancy.

The occurrence cube paradigm has been developed within the Tracking Invasive Alien Species (TrIAS) project (Vanderhoeven et al. 2017) following Open Science and FAIR principles. We created and published occurrence cubes at the species level for Belgium and Italy (Oldoni et al. 2020b) and occurrence cubes for non-native taxa in Belgium and Europe (Oldoni et al. 2020a).
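
The aggregation and projection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the TrIAS implementation. It assumes that coordinates are already in a projected, metre-based reference system (as the EEA reference grid requires), that every record carries an uncertainty radius, and that grid cells are identified simply by their column and row index. The function names, the record keys (`species`, `year`, `x`, `y`, `uncertainty_m`) and the example records are all hypothetical.

```python
import math
import random
from collections import Counter


def random_point_in_circle(x, y, radius, rng):
    """Draw a point uniformly at random from the disc centred on (x, y)."""
    # Taking the square root makes the density uniform over the disc's area.
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return x + r * math.cos(theta), y + r * math.sin(theta)


def grid_cell(x, y, cell_size=1000.0):
    """Return the (column, row) index of the square cell containing (x, y)."""
    return math.floor(x / cell_size), math.floor(y / cell_size)


def build_cube(occurrences, cell_size=1000.0, seed=None):
    """Count occurrences per (species, year, cell) unit.

    Each occurrence is a dict with the hypothetical keys
    'species', 'year', 'x', 'y' and 'uncertainty_m'.
    """
    rng = random.Random(seed)
    cube = Counter()
    for occ in occurrences:
        px, py = random_point_in_circle(
            occ["x"], occ["y"], occ["uncertainty_m"], rng
        )
        cube[(occ["species"], occ["year"], grid_cell(px, py, cell_size))] += 1
    return cube


def project_counts(cube):
    """Project onto taxon x time by summing occurrence counts over space."""
    totals = Counter()
    for (species, year, _cell), n in cube.items():
        totals[(species, year)] += n
    return totals


def project_occupancy(cube):
    """Project onto taxon x time by counting the occupied cells."""
    cells = {}
    for species, year, cell in cube:
        cells.setdefault((species, year), set()).add(cell)
    return {key: len(value) for key, value in cells.items()}


# Made-up records for illustration only.
occurrences = [
    {"species": "species A", "year": 2019,
     "x": 3900500.0, "y": 3100500.0, "uncertainty_m": 100.0},
    {"species": "species A", "year": 2019,
     "x": 3905500.0, "y": 3100500.0, "uncertainty_m": 2500.0},
]
cube = build_cube(occurrences, seed=42)
counts = project_counts(cube)
occupancy = project_occupancy(cube)
```

Because the point is drawn at random, a record whose circle spans several cells can land in a different cell on each run unless the seed is fixed. A record whose circle lies entirely within one cell is always assigned to that cell.
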
Original language: English
Article number: 59154
Journal: Biodiversity Information Science and Standards
Pages (from-to): 1-4
Number of pages: 4
Publication status: Published - 30 Sep 2020

Thematic List 2020

  • Data & Infrastructure


  • decision-support tools
  • indicators

Geographic list

  • World
  • Europe
  • Belgium


  • automation
  • modelling
  • statistics and modelling
  • supporting techniques

