Understanding how HEDDA.IO helps detect the issues you don’t see at row level
When we talk about data quality, we often think of rules like:
- “Email must be valid.”
- “The contract start date must come before the end date.”
- “If the product is active, it must have a category.”
These are row-level rules — the kind HEDDA.IO calls Business Rules and manages inside Rulebooks. They answer the question:
Is this value right?
But sometimes bad data hides not in individual records but in the shape of the data as a whole. That’s where Dataset Rules come into play.
What Are Dataset Rules?
Dataset Rules in HEDDA.IO validate aggregate characteristics of your data — not single values, but statistical and structural patterns.
They help answer questions like:
- Does this field have too many duplicates?
- Are there unexpected outliers in this numeric column?
- Is the percentage of null or negative values unusually high?
- Has the average value shifted compared to last month?
Dataset Rules look at the distribution, shape, and completeness of your data and compare them either to expected thresholds or to the results of previous runs.
They turn “the dataset feels off” into “we detected a 23% drop in distinct product IDs.”
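To make that kind of statement concrete, here is a minimal Python sketch of the arithmetic behind it. It is not HEDDA.IO's implementation or API; the function name `distinct_change_pct` and the sample product IDs are made up purely for illustration.

```python
def distinct_change_pct(previous, current):
    """Percentage change in the number of distinct values between two runs."""
    prev_n = len(set(previous))
    curr_n = len(set(current))
    if prev_n == 0:
        raise ValueError("previous run has no values to compare against")
    return (curr_n - prev_n) * 100.0 / prev_n


# Hypothetical data: 100 distinct product IDs last run, 77 in this run.
previous_ids = [f"P{i}" for i in range(100)]
current_ids = [f"P{i}" for i in range(77)]

change = distinct_change_pct(previous_ids, current_ids)
print(f"Distinct product IDs changed by {change:.0f}%")
```

A negative result means values disappeared between runs; whether a given drop is alarming depends on the threshold you consider normal for that feed.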
Examples of Dataset Rules in HEDDA.IO
For each column (field), you can configure checks like:
Data Type | Example Dataset Rules |
Numeric | Average, Min/Max, Median, % of zero or negative values, Range Width |
String | Number of distinct values, % of empty/blank |
Boolean | Expected value distribution (e.g., 90% false, 10% true) |
Date | Minimum and maximum date, gap detection, timeline completeness |
All types | % of valid values based on business rules, null rates, uniqueness of records |
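As a rough illustration of what the numeric checks in this table measure, the following Python sketch computes several of them for a single column using only the standard library. The function `profile_numeric`, the metric names, and the sample values are assumptions made for this example; they do not reflect how HEDDA.IO names or calculates its checks.

```python
from statistics import mean, median


def profile_numeric(values):
    """Compute simple aggregate metrics for one numeric column.

    `None` stands in for a missing (null) value.
    """
    if not values:
        raise ValueError("cannot profile an empty column")
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column contains only nulls")
    lowest, highest = min(present), max(present)
    return {
        "null_rate": (len(values) - len(present)) / len(values),
        "average": mean(present),
        "min": lowest,
        "max": highest,
        "median": median(present),
        "range_width": highest - lowest,
        # Share of non-null values that are zero or negative.
        "pct_zero_or_negative": sum(1 for v in present if v <= 0) / len(present),
    }


# Hypothetical invoice amounts with one null, one zero, and one negative value.
amounts = [120.0, 80.0, None, 0.0, -15.0, 200.0]
profile = profile_numeric(amounts)
for name, value in profile.items():
    print(f"{name}: {value}")
```

Each metric is a single number per column and per run, which is what makes it possible to compare it against a threshold or against an earlier run.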
How Are They Different from Business Rules?
Aspect | Business Rules | Dataset Rules |
Scope | One row at a time | Whole dataset or single column |
Format | If–then logic, field dependencies | Aggregation, statistics, distribution checks |
Application | Enforced via Rulebooks | Defined globally or per dataset |
Common Use Case | Field correctness, conditional logic | Detecting drift, anomalies, dataset shifts |
Level of detection | Fine-grained, value-level | Structural, behavioral, trend-level |
Business Rules are like reviewing every item in a shipment.
Dataset Rules are like checking whether the box is full, properly packed, or has changed weight since yesterday.
When to Use Dataset Rules?
Dataset Rules are particularly valuable when:
- You need to detect sudden changes in data behavior.
- You manage large, repetitive data feeds where not every row can be manually reviewed.
- You want to compare data between runs, sources, or time periods.
- You need early warning indicators of upstream system changes.
- You’re auditing third-party data deliveries for completeness and consistency.
Examples:
- A product table suddenly drops from 12,000 to 6,500 records.
- A field that used to have 98% valid values now has 83%.
- The average invoice amount jumps by 300% overnight.
These patterns may not violate any Business Rule — but they indicate something is wrong.
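The sketch below shows this situation in plain Python: every record passes a row-level check, yet the dataset-level average has quadrupled. The rule `row_is_valid`, the record layout, and the invoice figures are hypothetical and only illustrate the idea; they are not HEDDA.IO rule definitions.

```python
def row_is_valid(invoice):
    """Hypothetical row-level rule: amount is positive and a customer is set."""
    return invoice["amount"] > 0 and bool(invoice["customer"])


def average_amount(invoices):
    """Dataset-level metric: mean invoice amount for one run."""
    return sum(inv["amount"] for inv in invoices) / len(invoices)


# Made-up data for two consecutive daily loads.
yesterday = [
    {"customer": "A", "amount": 90.0},
    {"customer": "B", "amount": 100.0},
    {"customer": "C", "amount": 110.0},
]
today = [
    {"customer": "A", "amount": 360.0},
    {"customer": "B", "amount": 400.0},
    {"customer": "C", "amount": 440.0},
]

all_rows_valid = all(row_is_valid(inv) for inv in today)
baseline = average_amount(yesterday)
jump_pct = (average_amount(today) - baseline) / baseline * 100.0

print(f"All rows pass the row-level rule: {all_rows_valid}")
print(f"Average invoice amount changed by {jump_pct:.0f}%")
```

A likely cause in practice would be something like a currency or unit change upstream, which no single-row rule can see.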
Why HEDDA.IO Excels Here
HEDDA.IO offers a unique combination:
- Business Rules for correctness
- Rulebooks for logic modularity and reuse
- Dataset Rules for trend and shape validation
This means you can:
- Detect both what is wrong and how it changed.
- Validate individual records and monitor the dataset’s structure.
- Set absolute expectations (e.g., “min value ≥ 0”) or relative ones (“no more than 5% change from previous run”).
- Combine all rule types in one validation execution.
- Trigger alerts or workflows if thresholds are crossed.
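To show how absolute and relative expectations differ, here is a small, generic Python sketch that evaluates both kinds against the metrics of a current and a previous run. The `Expectation` class, the `evaluate` function, and the metric names are invented for this example; HEDDA.IO's own configuration and alerting will look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expectation:
    metric: str
    min_value: Optional[float] = None       # absolute lower bound
    max_rel_change: Optional[float] = None  # allowed change vs. previous run (0.05 = 5%)


def evaluate(expectations, current, previous):
    """Return a list of human-readable violations (empty if all checks pass)."""
    violations = []
    for exp in expectations:
        value = current[exp.metric]
        if exp.min_value is not None and value < exp.min_value:
            violations.append(f"{exp.metric}: {value} is below minimum {exp.min_value}")
        if exp.max_rel_change is not None:
            baseline = previous.get(exp.metric)
            # Skip the relative check when there is no baseline or it is zero.
            if baseline:
                change = abs(value - baseline) / abs(baseline)
                if change > exp.max_rel_change:
                    violations.append(
                        f"{exp.metric}: changed by {change:.1%}, "
                        f"allowed {exp.max_rel_change:.1%}"
                    )
    return violations


# Hypothetical metrics from two runs of the same dataset.
previous = {"min_amount": 0.0, "row_count": 12000}
current = {"min_amount": -5.0, "row_count": 6500}

expectations = [
    Expectation("min_amount", min_value=0.0),
    Expectation("row_count", max_rel_change=0.05),
]

violations = evaluate(expectations, current, previous)
for message in violations:
    print("ALERT:", message)
```

The absolute check needs only the current run, while the relative check needs a stored baseline, which is why run tracking matters for this kind of rule.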
Whether you’re checking a daily sales load, monitoring lab results, or reviewing product exports from a supplier — HEDDA.IO ensures you’re not only looking at the details, but also at the bigger picture.
Summary
While most data platforms help you check individual values, few give you the tools to watch how your data behaves as a whole.
Dataset Rules close that gap.
And when combined with HEDDA.IO’s Rulebooks, run tracking, and exception workflows, they help transform your data quality practice from reactive firefighting into proactive control.
Want to see how Dataset Rules can help you catch what’s not obvious — before it becomes expensive? Let’s talk.
