Monitoring
All the actions and goals defined in your data quality strategy need to be actively monitored. Utilising monitoring tools that can raise alerts and communicate through various channels is essential for early detection.
It is also crucial to log your incidents and categorise them by the data quality dimensions they impact. This practice allows you to focus your attention on specific areas and identify potential gaps in your strategy. Better still, maintaining an incident report enables you to reflect on how your work in specific areas reduces the number of incidents over time.
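The incident log described above can be sketched as a simple structure. This is a minimal, hypothetical example: the field names and dimension labels are assumptions for illustration, not a prescribed schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Hypothetical incident record; fields and dimension names are illustrative.
@dataclass
class Incident:
    opened: date
    dataset: str
    dimension: str  # e.g. "timeliness", "completeness", "accuracy"
    summary: str

incident_log = [
    Incident(date(2024, 1, 3), "orders", "timeliness", "Load delayed 8h"),
    Incident(date(2024, 1, 9), "orders", "completeness", "Null customer_id rows"),
    Incident(date(2024, 2, 1), "orders", "timeliness", "Upstream outage"),
]

# Categorising incidents by impacted dimension highlights where the
# strategy has gaps and where your work is paying off over time.
by_dimension = Counter(i.dimension for i in incident_log)
print(by_dimension.most_common())
```

Aggregating by dimension like this makes the periodic framework review concrete: a dimension whose count keeps growing is a gap in the strategy.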
Periodic revisions of the framework
Your team must review the incident log periodically and update your data quality framework accordingly to fill the identified gaps. This ensures your actions and goals reflect reality and are up to date.
Service Level Indicators and Transparency
It is essential to measure the fulfilment of your Service Level Objectives. For every SLO, you should have a Service Level Indicator (SLI) that shows how well the SLO is being met. For instance, in our example you could have an SLI showing the percentage of time over the last X days during which production data was no older than 6 hours (timeliness dimension). This helps users understand how the data behaves and builds trust in its quality.
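The timeliness SLI above can be computed as the share of freshness checks that met the 6-hour objective. This is a minimal sketch; the check inputs and function name are assumptions for illustration.

```python
from datetime import timedelta

# The 6-hour freshness objective from the timeliness SLO.
FRESHNESS_SLO = timedelta(hours=6)

def timeliness_sli(observed_ages: list[timedelta]) -> float:
    """Percentage of freshness checks in which the data met the SLO."""
    if not observed_ages:
        return 100.0  # no observations, nothing violated
    met = sum(age <= FRESHNESS_SLO for age in observed_ages)
    return 100.0 * met / len(observed_ages)

# Illustrative observed data ages, e.g. from hourly checks over a window.
checks = [timedelta(hours=h) for h in (1, 3, 5, 7, 2, 4)]
print(f"Timeliness SLI: {timeliness_sli(checks):.1f}%")  # 5 of 6 checks met the SLO
```

In practice the observations would come from your monitoring tool over the last X days, and the resulting percentage is what you publish to users.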
Transparency in practice is key to increasing user adoption, and Service Level Indicators are what provide that transparency.
For sharing our data quality metrics (SLIs), I really like embracing the data product concept within a data-mesh implementation.
Our data quality strategy has these characteristics:
- It is domain specific, as the objectives come from a business need
- Transparent, as we can and want to share it with users
- Visible, as our data quality framework is easy to interpret
This aligns perfectly with the definition data-mesh gives to data products. I thoroughly recommend a data-mesh approach that encapsulates data and its quality metrics into data products to enhance transparency.
Why data products for sharing our data quality metrics
By definition, a data product in data-mesh is a self-contained, domain-specific unit of data capabilities. Data products encapsulate data, processing logic and data quality checks, promoting decentralised data ownership and seamless integration into the broader data ecosystem. They are designed to serve specific business needs within a specific domain, and they are easily findable and transparent. As integral components of our data quality framework, data products ensure that our strategy aligns precisely with the unique requirements of each domain, providing visibility and transparency for domain-specific data quality.
One of the key advantages of data products in the context of data quality is their ability to hold their own SLIs. By integrating data quality indicators directly into data products and making them visible through a user-friendly catalogue, we empower users to search, request access, and explore data with full knowledge of its quality. This transparency and visibility enhance user confidence and encourage greater adoption.
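A data product carrying its own SLIs can be sketched as a small descriptor that a catalogue surfaces alongside the data. The field names below are assumptions for illustration, not a data-mesh standard.

```python
# Hypothetical data product descriptor; field names and values are
# illustrative, not a prescribed data-mesh schema.
order_events_product = {
    "name": "order_events",
    "domain": "sales",
    "owner": "sales-data-team",
    "slos": {"timeliness": "data no older than 6 hours"},
    "slis": {"timeliness_pct_30d": 99.2},  # illustrative published SLI
}

def meets_slo(product: dict, dimension: str, threshold: float) -> bool:
    """Could a catalogue flag this product as healthy for a dimension?"""
    key = f"{dimension}_pct_30d"  # assumed naming convention for SLI keys
    return product["slis"].get(key, 0.0) >= threshold

print(meets_slo(order_events_product, "timeliness", 99.0))
```

Publishing the SLIs inside the product descriptor is what lets users judge quality before requesting access, rather than discovering issues after the fact.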
This post originally appeared on TechToday.