Data Quality
Data quality measures the reliability of your data. High-quality data is accurate, complete, timely, consistent across systems, valid against agreed formats and standards, and free of duplicates.
In depth
Data quality is fitness for use. The same dataset may be fine for trend‑spotting but unsuitable for board reporting.
Here are six commonly used measures for evaluating data quality, with a short code sketch of a few of them after the list:
- Accuracy: Values reflect reality. For example, a paid invoice shows the right amount and date.
- Completeness: All required fields are present. For example, there are no missing customer IDs or product SKUs.
- Timeliness: You have data when you need it. For example, the sales team gets access to daily sales targets before the morning stand‑up.
- Consistency: Data aligns across systems. For example, revenue values in your warehouse dashboard match revenue values in your finance report.
- Validity: Values conform to rules and standards. For example, dates parse, currencies use standard formats, and email addresses pass format checks.
- Uniqueness: No duplicated data. For example, one order, one row.
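To make these dimensions concrete, here is a minimal sketch of what a few of these checks could look like in Python with pandas. It assumes a hypothetical orders table with order_id, customer_id, amount, currency, and created_at columns; the allowed currencies and the one-day freshness window are placeholders, not recommendations.
```python
# A minimal sketch of dimension checks for a hypothetical orders table.
# Column names, the allowed-currency list, and the one-day freshness
# window are assumptions for illustration, not a prescribed schema.
import pandas as pd

def check_orders(orders: pd.DataFrame) -> dict:
    """Return a pass/fail result per data quality dimension."""
    results = {}

    # Completeness: required fields are present.
    required = ["order_id", "customer_id", "amount"]
    results["completeness"] = bool(orders[required].notna().all().all())

    # Validity: values conform to agreed rules and formats.
    allowed_currencies = {"USD", "EUR", "GBP"}
    results["validity"] = bool(
        orders["currency"].isin(allowed_currencies).all()
        and (orders["amount"] >= 0).all()
    )

    # Uniqueness: one order, one row.
    results["uniqueness"] = not orders["order_id"].duplicated().any()

    # Timeliness: the newest record landed within the expected window.
    latest = pd.to_datetime(orders["created_at"], utc=True).max()
    results["timeliness"] = (pd.Timestamp.now(tz="UTC") - latest) <= pd.Timedelta(days=1)

    return results
```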
Pro tips
Treat your most important fields like a contract. Define the expected type, allowed values, and update cadence. Add tests for grain, time zones, and late backfills. Log rule failures so you can spot drift early.
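One way to express that contract is as a small declarative spec that a pipeline step enforces, logging every rule failure so drift shows up early. The sketch below, using pandas and illustrative column names, covers expected types, nullability, and allowed values; cadence and backfill tests would follow the same pattern.
```python
# A sketch of a field contract enforced in a pipeline step, with every rule
# failure logged so drift is visible early. Column names, dtypes, and the
# allowed-value set are illustrative assumptions.
import logging
import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("data_quality")

CONTRACT = {
    "order_id": {"dtype": "int64", "allow_null": False},
    "currency": {"dtype": "object", "allow_null": False, "allowed": {"USD", "EUR", "GBP"}},
    "created_at": {"dtype": "datetime64[ns, UTC]", "allow_null": False},
}

def enforce_contract(df: pd.DataFrame, contract: dict = CONTRACT) -> bool:
    """Check each contracted field; log failures and return overall pass/fail."""
    ok = True
    for col, rules in contract.items():
        if col not in df.columns:
            log.error("contract failure: column %s is missing", col)
            ok = False
            continue
        if str(df[col].dtype) != rules["dtype"]:
            log.error("contract failure: %s has dtype %s, expected %s",
                      col, df[col].dtype, rules["dtype"])
            ok = False
        if not rules.get("allow_null", True) and df[col].isna().any():
            log.error("contract failure: %s contains nulls", col)
            ok = False
        allowed = rules.get("allowed")
        if allowed and not df[col].dropna().isin(allowed).all():
            log.error("contract failure: %s has values outside %s", col, allowed)
            ok = False
    return ok
```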
Keep an eye on the areas where quality can break: manual entry, schema shifts, late-arriving events, joins at the wrong grain, and quiet pipeline failures. Small gaps stack up into larger problems: a single missing dimension can skew cohort charts, and a stale table can trigger the wrong alert.
Measure data quality directly. Track data freshness per table, null and outlier rates per column, join coverage across keys, and pass rates for validation rules. Review trends weekly, then fix the issues that move business outcomes.
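Here is a rough sketch, assuming pandas and placeholder column and rule names, of how those measurements could be computed for a single table so you can log and trend them over time.
```python
# A sketch of direct measurements for a single table: freshness, per-column
# null rates, and pass rates for validation rules. The timestamp column and
# the example rules are placeholders.
import pandas as pd

def quality_metrics(df: pd.DataFrame, timestamp_col: str, rules: dict) -> dict:
    """Compute numbers you can log on each run and trend over time."""
    latest = pd.to_datetime(df[timestamp_col], utc=True).max()
    freshness_hours = (pd.Timestamp.now(tz="UTC") - latest) / pd.Timedelta(hours=1)

    # Fraction of nulls per column.
    null_rates = df.isna().mean().to_dict()

    # Each rule is a callable returning a boolean Series; pass rate = share of rows passing.
    pass_rates = {name: float(rule(df).mean()) for name, rule in rules.items()}

    return {
        "freshness_hours": float(freshness_hours),
        "null_rates": null_rates,
        "rule_pass_rates": pass_rates,
    }

# Illustrative rules only: non-negative amounts and minimally well-formed emails.
example_rules = {
    "amount_non_negative": lambda df: df["amount"] >= 0,
    "email_has_at_sign": lambda df: df["email"].fillna("").str.contains("@"),
}
```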
Why Data Quality matters
- Trusted decisions: No more meetings where different teams bring different numbers.
- Faster delivery: Clean, up-to-date data makes dashboards and reports faster and more reliable to build.
- Lower risk: Accurate, complete data reduces compliance risk and leads to fewer customer issues.
- Better self‑serve: Business users can make confident, informed decisions based on consistently-defined data.
Data Quality in practice
- Ecommerce orders: A duplicate data load inflates “Gross Sales” by 3 percent. A uniqueness rule on order_id catches this issue before it reaches executive dashboards.
- Marketing attribution: UTM parameters arrive incomplete. A completeness rule flags rows with missing medium and campaign values so the team can correct or drop them.
- Revenue reporting: A warehouse data refresh runs late. A latency monitor warns the finance department before monthly close so they can pause exports.
- Customer success: A join on account_id changes grain after a schema update. A grain test fails, blocking the deployment and preventing mismatched data (see the sketch after this list).
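As a rough illustration of the uniqueness and grain checks above, here is a sketch of a test that could run before deployment. The orders and accounts tables, and the account_id and order_id columns, are assumptions for the example.
```python
# A sketch of uniqueness and grain tests that could run before deployment.
# The orders/accounts tables and the account_id/order_id columns are
# assumptions for the example.
import pandas as pd

def test_join_preserves_order_grain(orders: pd.DataFrame, accounts: pd.DataFrame) -> None:
    joined = orders.merge(accounts, on="account_id", how="left")

    # Grain: the join must not fan out rows (one row per order before and after).
    assert len(joined) == len(orders), (
        f"grain changed: {len(orders)} orders became {len(joined)} rows after join"
    )

    # Uniqueness: duplicates on the business key would inflate totals downstream.
    assert not joined["order_id"].duplicated().any(), "duplicate order_id after join"
```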
Data Quality and PowerMetrics
PowerMetrics helps you manage data quality starting at the metric layer. You define metric calculations once, certify trusted versions, and tag them for discovery. Access control for users, groups, and roles keeps editing rights clear. A metrics‑aware query layer in PowerMetrics Query Language (PMQL) applies time logic, filters, and groups consistently, reducing calculation drift between tools. Error propagation surfaces issues when inputs are null or out of bounds. Goals and notifications alert you when a metric fails to meet set targets or its value steps outside an expected range.
Related terms
Metric Tree
A metric tree is a visual or conceptual model that maps how key business metrics relate to each other. It links a top‑level outcome, like revenue or retention, to the contributing drivers that explain changes underneath. You get a clear, shared view of cause and effect across teams.
Data Warehouse
A data warehouse is a centralized repository that stores and organizes structured data from multiple sources. It’s optimized for reporting and analysis, enabling businesses to get a unified view of their historical and current data.
Data Lineage
Data lineage maps the journey of your data from origin to destination. It visually shows where data comes from, how it’s transformed, and where it’s used.
Data Governance
Data governance is the system of people, policies, and tools that keeps data accurate, secure, and usable across your company. Think of it like hiring a skilled librarian for a massive library. Every book is cataloged, protected, and easy to find, so readers trust what they pick up and can act quickly. With solid governance, your team works from the same definitions, follows clear rules for access and use, and treats data as a business asset.
Data Catalog
A data catalog is an organized inventory of a company’s data assets. This centralized, access-controlled library typically lists datasets, tables, and fields alongside owners, definitions, and lineage so people can search, understand, and use data with confidence.