Good data quality within a company rests, among other things, on uniform, clearly defined rules and well-maintained master or reference data that are applied to company data for validation and cleansing. However, the only way for a company to know how good its data quality actually turns out in the end is through comprehensive statistics and information across the entire life cycle of its data, in keeping with the much-quoted IT saying: “You can’t control what you can’t measure.”
To let you make clear statements about the quality of your data, HEDDA.IO offers comprehensive statistics at various levels of granularity. Information at the project or knowledge base level is available just as quickly as data on individual executions. In addition to valuable KPIs (such as the current Data Quality Score), detailed information on errors in executions is displayed down to the row level.
When the HEDDA.IO runner is used within notebooks, the various statistics can also be displayed directly through a widget in the notebook, so that data engineers and data scientists always have comprehensive statistics at hand in their processes and development work.
Integrated data profiling also provides additional statistics that can be displayed directly in HEDDA.IO or notebooks. This allows data engineers and data scientists to gain insights into their data at earlier stages of development.
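To give a feel for the kind of statistics data profiling typically produces, here is a minimal, generic sketch in plain Python. It is not HEDDA.IO's API or its implementation: the function names `profile_column` and `profile_rows`, the chosen metrics, and the sample rows are all assumptions made purely for illustration.

```python
from collections import Counter
from typing import Any


def profile_column(values: list[Any]) -> dict[str, Any]:
    """Return basic profiling statistics for one column (hypothetical helper)."""
    # Treat None and empty strings as missing values (an assumption of this sketch).
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    total = len(values)
    return {
        "rows": total,
        "nulls": total - len(non_null),
        "completeness": len(non_null) / total if total else 0.0,
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


def profile_rows(rows: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Profile every column that appears in a list of row dictionaries."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Illustrative sample data, not taken from any real project.
rows = [
    {"country": "DE", "city": "Berlin"},
    {"country": "DE", "city": ""},
    {"country": None, "city": "Paris"},
    {"country": "FR", "city": "Paris"},
]

report = profile_rows(rows)
for column, stats in report.items():
    print(column, stats)
```

Even simple figures like completeness and the number of distinct values per column can point to missing or inconsistent entries before any validation rules have been defined.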