HEDDA.IO

Notebook Integration.

In recent years, notebooks have become the go-to development environment in data engineering and data science. They are web-based applications that let developers combine live coding, visualization, and documentation in a single environment. In strongly data-driven work in particular, having everything in one place helps developers stay focused on the essentials.

Versatile application possibilities

Notebooks cover numerous practical areas of application such as:

data cleansing and data transformation
numerical simulation
statistical modelling
data visualization
machine learning
and much more.

In addition, depending on the environment, notebooks support various popular languages such as Python, Scala, SQL, and .NET languages like C#. This makes them usable for a wide range of purposes by data engineers and data scientists.

Development and execution in different environments

Since notebooks themselves only provide the environment for development, the code written in them can be executed in a variety of other environments. For example, the development tool Visual Studio Code can execute code directly on one’s own computer. Spark-based systems such as Databricks or Azure Synapse Analytics, on the other hand, let the user massively parallelize the written applications on a so-called Spark cluster and thus gain access to hundreds of cores and terabytes of memory.

Two different runners

HEDDA.IO currently provides two different HEDDA.IO runtimes, so-called runners, which can be used for the environments .NET Interactive and pyspark. The big advantage is that complex, predefined Data Quality business rules can be applied to the data with just a few lines of code, allowing it to be quickly profiled, validated, and cleansed.

The greatest strength of HEDDA.IO? Unlike many other Data Quality environments, the rules are executed directly on the respective systems and not in the HEDDA.IO environment. This means: The data is not brought to the application for validation and transformation, but rather the rules are brought to the data!
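The idea of executing rules on the system that already holds the data can be sketched in a few lines of plain Python. This is not the HEDDA.IO runner API, which is not shown on this page; the `Rule` class, the `validate` function, and the two sample rules are hypothetical names chosen for illustration. The sketch only shows that rule definitions are small and portable, so they can be evaluated in whatever process the rows live in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical rule type for illustration only; not the HEDDA.IO API.
@dataclass(frozen=True)
class Rule:
    name: str
    column: str
    check: Callable[[object], bool]


def validate(rows: List[Dict[str, object]], rules: List[Rule]) -> List[Dict[str, object]]:
    """Evaluate every rule against every row, in the process that holds the rows.

    Returns a copy of each row with a 'failed_rules' list added.
    """
    results = []
    for row in rows:
        failed = [r.name for r in rules if not r.check(row.get(r.column))]
        results.append({**row, "failed_rules": failed})
    return results


# The rule definitions are the part that travels to the data.
rules = [
    Rule("country_not_empty", "country", lambda v: isinstance(v, str) and v.strip() != ""),
    Rule("age_in_range", "age", lambda v: isinstance(v, int) and 0 <= v <= 120),
]

# Sample rows standing in for data that already lives in the local environment.
rows = [
    {"id": 1, "country": "Germany", "age": 34},
    {"id": 2, "country": "", "age": 34},
    {"id": 3, "country": "France", "age": 150},
]

validated = validate(rows, rules)
for row in validated:
    print(row["id"], row["failed_rules"])
```

In a notebook, the same pattern would apply to a data frame instead of a list of dictionaries; the rows never leave the environment in which they were loaded.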

With this innovative approach, HEDDA.IO users not only benefit from the greatest possible flexibility, but can work efficiently and effectively in a familiar environment where they are most productive.

HEDDA.IO runners at a glance

pyspark Runner

With our pyspark Runner, even the most complex business rules with large predefined master data, synonyms, and phonetic comparisons can be run on millions of records in minutes. This is done directly on a Spark cluster within Databricks or Azure Synapse Analytics. Don’t lose time waiting for the results!
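To illustrate what a rule based on master data, synonyms, and phonetic comparison does to a single value, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not the pyspark Runner: the master data, the synonym table, and the functions `phonetic_key` and `standardize` are hypothetical, and the phonetic key is a simplified Soundex-style code rather than whatever algorithm HEDDA.IO uses. A Spark-based runner would apply logic of this kind in parallel across a cluster.

```python
from typing import Optional

# Hypothetical master data and synonyms, for illustration only.
MASTER = ["Munich", "Berlin", "Hamburg"]
SYNONYMS = {"münchen": "Munich", "muenchen": "Munich"}

# Consonant groups of a simplified Soundex-style code.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit


def phonetic_key(word: str) -> str:
    """Return a 4-character key: first letter plus up to three consonant digits.

    Simplified: letters without a code (vowels, h, w, y, and others) are
    skipped and reset the duplicate check.
    """
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            key += digit
        prev = digit
    return (key + "000")[:4]


# Lookup tables built once from the master data.
_EXACT = {m.lower(): m for m in MASTER}
_PHONETIC = {phonetic_key(m): m for m in MASTER}


def standardize(value: str) -> Optional[str]:
    """Map a raw value to a master value: exact match, then synonym, then phonetic."""
    cleaned = value.strip().lower()
    if cleaned in _EXACT:
        return _EXACT[cleaned]
    if cleaned in SYNONYMS:
        return SYNONYMS[cleaned]
    return _PHONETIC.get(phonetic_key(cleaned))


for raw in ["berlin", "München", "Munick", "Hamborg", "Paris"]:
    print(raw, "->", standardize(raw))
```

The lookup order (exact, synonym, phonetic) is a design choice of this sketch: cheaper and more reliable matches are tried first, and values with no match come back as `None` so they can be flagged rather than silently changed.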

.NET Runner

Our .NET Runner lets data engineers cleanse data within .NET Interactive Notebooks in Visual Studio Code and then use it directly with ML.NET (Microsoft’s .NET-based machine learning framework). Again, the rules are brought to the data so that developers can work in a familiar development environment. With Azure Synapse Analytics, the corresponding .NET applications can also be run with .NET for Apache Spark on a Synapse cluster.

HEDDA.IO thus gives data engineers and data scientists a new way to improve their own workflow and significantly increase the quality of their output.

WE CREATE

CLEAN DATA EVERY DAY.
