(2023-09-20 07:42)

Big on Purpose, small on data

"Small data is data that is 'small' enough for human comprehension. It is data in a volume and format that makes it accessible, informative and actionable."

(Wikipedia)


Smaller data can be more powerful, more impactful, more wise!

Photo by Georges Biard, CC BY-SA 3.0, via Wikimedia Commons


Creating our own Data Packages is a way to express the purpose we assign to the data, connect it to its origins and context, and take a step towards a new kind of engagement: mutually caring for the data. It is a bit like urban gardening, where people help to water and weed the plants.


Photo by Pixabay, CC0

Even the best packaging can't improve the quality of the data itself, but it helps to set expectations and inform the user.


(Simulated screenshot) In addition to publication on official portals (BFS), it would help if the files we download contained built-in metadata. The Properties of spreadsheets and other formats allow for this kind of annotation, but they are somewhat invisible and thus rarely used.


Frictionless Data aims to solve many of the pitfalls of data reuse on the Web with a set of simple standards and cross-platform tools. The Data Package Creator quickly generates descriptive metadata (the packaging) from CSV files, which you can then use with other tools.
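To make the idea of "packaging" a little more concrete, here is a minimal sketch of what generating such metadata from a CSV file could look like. It uses only the Python standard library and is not how the Data Package Creator itself works: the function names `infer_type` and `describe_csv` are made up for this illustration, and the type guessing is deliberately naive. The output follows the general shape of a Data Package descriptor (a name plus a list of resources, each with a path and a schema).

```python
import csv
from pathlib import Path


def infer_type(values):
    """Guess a field type from a column's string values (naive, illustrative)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the stricter type first: every value must parse for the type to apply.
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in non_empty:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def describe_csv(csv_path, package_name):
    """Build a minimal Data Package style descriptor (a dict) for one CSV file."""
    path = Path(csv_path)
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        raise ValueError(f"{path} is empty, so there is no header to describe")
    header, body = rows[0], rows[1:]
    fields = [
        {
            "name": name,
            "type": infer_type([row[i] for row in body if i < len(row)]),
        }
        for i, name in enumerate(header)
    ]
    return {
        "name": package_name,
        "resources": [
            {
                "name": path.stem.lower(),
                "path": path.name,
                "format": "csv",
                "schema": {"fields": fields},
            }
        ],
    }
```

Serialising the result with `json.dumps(descriptor, indent=2)` and saving it as `datapackage.json` next to the CSV gives a starting point that you would then enrich by hand with a title, description, licence and sources, which is where the purpose and context of the data really get expressed.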


With GitHub Actions and Frictionless Repository, the Data Package can be validated automatically. This lets you rely on public data sources, as well as on your own automations. You can also pick up a nice badge:


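As a rough illustration of the kind of check that automated validation performs, the sketch below compares the header of each CSV file with the field names declared in its descriptor. It is a standard-library stand-in, not the Frictionless Repository action, which checks far more than this. The function name `check_package` is hypothetical, and the sketch assumes each resource has a single local `path` relative to the descriptor.

```python
import csv
import json
from pathlib import Path


def check_package(descriptor_path):
    """Return a list of problems found; an empty list means the checks passed."""
    descriptor_path = Path(descriptor_path)
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    problems = []
    resources = descriptor.get("resources", [])
    if not resources:
        problems.append("descriptor lists no resources")
    for resource in resources:
        label = resource.get("name", "<unnamed>")
        if "path" not in resource:
            problems.append(f"{label}: resource has no path")
            continue
        # Assumption: the path is a single local file next to the descriptor.
        data_path = descriptor_path.parent / resource["path"]
        if not data_path.is_file():
            problems.append(f"{label}: missing file {resource['path']}")
            continue
        expected = [
            field["name"]
            for field in resource.get("schema", {}).get("fields", [])
        ]
        with data_path.open(newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle), [])
        if expected and header != expected:
            problems.append(
                f"{label}: header {header} does not match schema {expected}"
            )
    return problems
```

In a scheduled or push-triggered job, a script could call `check_package("datapackage.json")`, print any problems, and exit with a non-zero status when the list is not empty, so that a drifting upstream file is noticed early.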
By the way, GitHub has a lot of CSV files lying around, if you care to have a look. But are they ✔️ Accurate ✔️ Authentic ✔️ Appropriate?