
Frictionless Data 101: How to document and validate datasets or create community standards with the Data Package standard

Activity: Talk or presentation types: Lecture and oral contribution

Description

Data Package is an open, generic, lightweight, and extensible standard for describing datasets, files and tabular data. First released in 2007 by the Open Knowledge Foundation (OKFN) to address challenges in open government data, it is now used in multiple domains to document datasets, improve data quality, facilitate reproducible workflows, and develop community standards. The standard is supported by comprehensive open-source software tools (including in Python, R and JavaScript) and an active maintenance group. In this talk I will cover:

- The Data Package standard, covering its specifications for describing datasets, files, file formats and tabular data.
- How to document a dataset as a data package, increasing its Findability, Accessibility, Interoperability, and Reusability (FAIR).
- How to use software to read, describe, validate and create data packages, either in a graphical user interface or as part of a reproducible workflow.
- How to leverage the Data Package standard to address domain-specific needs or develop community standards, as has been done for Darwin Core Data Package and Camtrap DP.
- What new features are introduced in Data Package version 2 (released in 2024), the development plans, and how to contribute.
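
To make the idea of documenting a dataset as a data package concrete, the following is a minimal sketch in Python using only the standard library. It builds a descriptor (the content of a `datapackage.json` file) for one tabular resource and runs a few basic structural checks. The dataset name, file path, field names and the `check_descriptor` helper are hypothetical and chosen for illustration only; the checks cover a handful of properties assumed here to be required and are not a substitute for the validation offered by the official software tools.

```python
import json

# Hypothetical dataset: the names, path and fields below are illustrative only.
descriptor = {
    "name": "example-observations",
    "title": "Example observations",
    "licenses": [{"name": "CC0-1.0"}],
    "resources": [
        {
            "name": "observations",
            "path": "observations.csv",
            "format": "csv",
            # Table Schema: describes the columns of the tabular file.
            "schema": {
                "fields": [
                    {"name": "observation_id", "type": "string"},
                    {"name": "event_date", "type": "date"},
                    {"name": "count", "type": "integer"},
                ],
                "primaryKey": ["observation_id"],
            },
        }
    ],
}


def check_descriptor(descriptor):
    """Return a list of problems found by a few basic structural checks.

    Hypothetical helper: it only checks a small set of properties assumed
    to be required and does not implement the full standard.
    """
    problems = []
    resources = descriptor.get("resources")
    if not resources:
        problems.append("descriptor needs at least one resource")
        return problems
    for index, resource in enumerate(resources):
        if "name" not in resource:
            problems.append(f"resource {index} has no name")
        if "path" not in resource and "data" not in resource:
            problems.append(f"resource {index} has neither path nor data")
        for field in resource.get("schema", {}).get("fields", []):
            if "name" not in field:
                problems.append(f"resource {index} has a field without a name")
    return problems


problems = check_descriptor(descriptor)

# Serialize the descriptor; saving this text as datapackage.json next to
# observations.csv would make the folder a data package.
datapackage_json = json.dumps(descriptor, indent=2)
print(datapackage_json)
```

In a real workflow the descriptor would be created and validated with one of the supported software tools, which also check the data files themselves against the declared schema.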
Period: 24-Oct-2025
Event title: Living Data 2025: Joint TDWG, GBIF, GEO BON and OBIS conference
Event type: Conference
Location: Bogota, Colombia
Degree of Recognition: International