From Row Checks to Patterns: Why Dataset Rules Matter in Data Quality

Understanding how HEDDA.IO helps detect the issues you don’t see at row level 

When we talk about data quality, we often think of rules like: 

  • “Email must be valid.” 
  • “The contract start date must come before the end date.” 
  • “If the product is active, it must have a category.”

These are row-level rules — the kind HEDDA.IO calls Business Rules and manages inside Rulebooks. They answer the question: 

Is this value right? 

But sometimes bad data hides not in individual records but in the shape of the data as a whole. That’s where Dataset Rules come into play. 

What Are Dataset Rules? 

Dataset Rules in HEDDA.IO validate aggregate characteristics of your data — not single values, but statistical and structural patterns. 

They help answer questions like: 

  • Does this field have too many duplicates? 
  • Are there unexpected outliers in this numeric column? 
  • Is the percentage of null or negative values unusually high? 
  • Has the average value shifted compared to last month?

     

Dataset Rules look at the distribution, shape, and completeness of your data and compare them either to expected thresholds or to previous runs. 

They turn “the dataset feels off” into “we detected a 23% drop in distinct product IDs.” 
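As a rough illustration of how such a statement can be derived, here is a minimal Python sketch that counts distinct values in two consecutive loads and reports the relative change. It is plain Python, not HEDDA.IO's API; the function names and the sample product IDs are assumptions made up for this example.

```python
def distinct_count(values):
    """Count distinct non-null values in a column."""
    return len({v for v in values if v is not None})


def relative_change_pct(previous, current):
    """Percentage change from previous to current (negative means a drop)."""
    if previous == 0:
        raise ValueError("previous value is zero; relative change is undefined")
    return (current - previous) / previous * 100.0


# Hypothetical product IDs from two consecutive loads.
previous_run = [f"P{i}" for i in range(100)]  # 100 distinct IDs
current_run = [f"P{i}" for i in range(77)]    # 77 distinct IDs

change = relative_change_pct(distinct_count(previous_run), distinct_count(current_run))
message = f"distinct product IDs changed by {change:.0f}%"
print(message)
```

The point of the sketch is that the check needs two things a row-level rule never sees: an aggregate over the whole column and a baseline from an earlier run.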

Examples of Dataset Rules in HEDDA.IO 

For each column (field), you can configure checks like: 

| Data Type | Example Dataset Rules |
|-----------|-----------------------|
| Numeric   | Average, Min/Max, Median, % of zero or negative values, Range Width |
| String    | Number of distinct values, % of empty/blank |
| Boolean   | Expected value distribution (e.g., 90% false, 10% true) |
| Date      | Minimum and maximum date, gap detection, timeline completeness |
| All types | % of valid values based on business rules, null rates, uniqueness of records |
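To give a feel for what a few of these checks compute, the following Python sketch profiles a numeric column and looks for missing days in a date column. It uses only the standard library and is an assumption about how such metrics could be calculated, not a description of how HEDDA.IO implements them; `numeric_profile`, `missing_days`, and the sample data are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean, median


def numeric_profile(values):
    """Aggregate metrics for a numeric column; None counts as null."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no non-null values")
    return {
        "null_pct": 100.0 * (len(values) - len(present)) / len(values),
        "min": min(present),
        "max": max(present),
        "range_width": max(present) - min(present),
        "mean": mean(present),
        "median": median(present),
        "zero_or_negative_pct": 100.0 * sum(v <= 0 for v in present) / len(present),
    }


def missing_days(dates):
    """Days between the earliest and latest date that never occur."""
    seen = set(dates)
    first, last = min(seen), max(seen)
    span = (last - first).days
    all_days = (first + timedelta(days=n) for n in range(span + 1))
    return [day for day in all_days if day not in seen]


# Hypothetical invoice amounts with one null and two suspicious values.
amounts = [120.0, 80.0, 0.0, None, -20.0, 100.0]
profile = numeric_profile(amounts)

# Hypothetical load dates with one day missing.
loads = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)]
gaps = missing_days(loads)
```

Each metric in `profile` can then be compared against a threshold or against the same metric from an earlier run.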

How Are They Different from Business Rules? 

| Aspect             | Business Rules                       | Dataset Rules                                |
|--------------------|--------------------------------------|----------------------------------------------|
| Scope              | One row at a time                    | Whole dataset or single column               |
| Format             | If–then logic, field dependencies    | Aggregation, statistics, distribution checks |
| Application        | Enforced via Rulebooks               | Defined globally or per dataset              |
| Common Use Case    | Field correctness, conditional logic | Detecting drift, anomalies, dataset shifts   |
| Level of detection | Fine-grained, value-level            | Structural, behavioral, trend-level          |

Business Rules are like reviewing every item in a shipment. 
Dataset Rules are like checking whether the box is full, properly packed, or has changed weight since yesterday. 
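To make the contrast concrete, here is a small Python sketch in which every row passes a row-level check while the dataset as a whole is clearly off. The row check mirrors the "active product must have a category" example from earlier; the duplicate check, the function names, and the sample rows are assumptions for illustration and do not represent HEDDA.IO's rule syntax.

```python
def row_is_valid(row):
    """Row-level check: an active product must have a category."""
    return not row["active"] or bool(row["category"])


def duplicate_pct(rows, key):
    """Dataset-level check: share of rows whose key value already appeared."""
    seen = set()
    duplicates = 0
    for row in rows:
        if row[key] in seen:
            duplicates += 1
        seen.add(row[key])
    return 100.0 * duplicates / len(rows)


# Hypothetical feed: the same product was delivered three times.
rows = [
    {"product_id": "A1", "active": True, "category": "Tools"},
    {"product_id": "A1", "active": True, "category": "Tools"},
    {"product_id": "A1", "active": True, "category": "Tools"},
    {"product_id": "B2", "active": False, "category": ""},
]

all_rows_valid = all(row_is_valid(r) for r in rows)  # every row passes on its own
dup_pct = duplicate_pct(rows, "product_id")          # but half the rows are repeats
```

No single row is wrong here, so only a check that looks across rows can report the problem.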

When to Use Dataset Rules? 

Dataset Rules are particularly valuable when: 

  • You need to detect sudden changes in data behavior. 
  • You manage large, repetitive data feeds where not every row can be manually reviewed. 
  • You want to compare data between runs, sources, or time periods. 
  • You need early warning indicators of upstream system changes.
  • You’re auditing third-party data deliveries for completeness and consistency.
     

Examples: 

  • A product table suddenly drops from 12,000 to 6,500 records. 
  • A field that used to have 98% valid values now has 83%. 
  • The average invoice amount jumps by 300% overnight.
     

These patterns may not violate any Business Rule — but they indicate something is wrong. 

Why HEDDA.IO Excels Here 

HEDDA.IO offers a unique combination: 

  1. Business Rules for correctness 
  2. Rulebooks for logic modularity and reuse
  3. Dataset Rules for trend and shape validation
     

This means you can: 

  • Detect both what is wrong and how it changed. 
  • Validate individual records and monitor the dataset’s structure. 
  • Set absolute expectations (e.g., “min value ≥ 0”) or relative ones (“no more than 5% change from previous run”). 
  • Combine all rule types in one validation execution. 
  • Trigger alerts or workflows if thresholds are crossed.
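The difference between absolute and relative expectations can be sketched in a few lines of Python. Everything below is hypothetical: the `Expectation` class, the metric names, and the `print` call standing in for an alert are assumptions for illustration, not HEDDA.IO's configuration format or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expectation:
    """Hypothetical rule definition; not a HEDDA.IO data structure."""
    metric: str
    min_value: Optional[float] = None       # absolute lower bound
    max_change_pct: Optional[float] = None  # allowed change vs. previous run


def evaluate(expectations, current, previous):
    """Return one message per violated expectation."""
    violations = []
    for exp in expectations:
        value = current[exp.metric]
        if exp.min_value is not None and value < exp.min_value:
            violations.append(f"{exp.metric}: {value} is below minimum {exp.min_value}")
        if exp.max_change_pct is not None:
            baseline = previous[exp.metric]
            if baseline == 0:
                # Relative change is undefined against a zero baseline.
                violations.append(f"{exp.metric}: previous value is 0, cannot compare")
                continue
            change = abs(value - baseline) / abs(baseline) * 100.0
            if change > exp.max_change_pct:
                violations.append(
                    f"{exp.metric}: changed {change:.1f}% (limit {exp.max_change_pct}%)"
                )
    return violations


expectations = [
    Expectation("min_amount", min_value=0),       # absolute: min value >= 0
    Expectation("row_count", max_change_pct=5),   # relative: at most 5% change
]
previous = {"min_amount": 1.5, "row_count": 12000}
current = {"min_amount": -4.0, "row_count": 6500}

violations = evaluate(expectations, current, previous)
for line in violations:
    print("ALERT:", line)  # stand-in for an alert or workflow trigger
```

An absolute expectation only needs the current run, while a relative one needs a stored baseline, which is why keeping metrics from previous runs matters.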
     

Whether you’re checking a daily sales load, monitoring lab results, or reviewing product exports from a supplier — HEDDA.IO ensures you’re not only looking at the details, but also at the bigger picture. 

Summary 

While most data platforms help you check individual values, few give you the tools to watch how your data behaves as a whole. 

Dataset Rules close that gap. 

And when combined with HEDDA.IO’s rulebooks, run tracking, and exception workflows, they help transform your data quality practice from reactive firefighting into proactive control. 

Want to see how Dataset Rules can help you catch what’s not obvious — before it becomes expensive? Let’s talk. 
