Thursday, June 1, 2023

GoodData integrates with dbt


Welcome to our new article! 👋 We'll show how to integrate dbt with GoodData quickly and efficiently using a series of Python scripts. In the previous post, How To Build A Modern Data Pipeline, we provided a guide on how to build a solid data pipeline that solves typical problems that analytics engineers face. This new article, on the other hand, describes a more in-depth integration with dbt because, as we wrote in the article GoodData and dbt Metrics, we think that dbt metrics are good for simple use cases, but for advanced analytics you need a more solid tool like GoodData.

Even though our solution is tightly coupled with GoodData, we want to provide a general guide on how to integrate with dbt! Let's start 🚀.

First things first: why would you want to integrate with dbt? Before you start writing your own code, it's a good approach to research existing dbt plugins first. It's a known fact that dbt has a very strong community with a lot of data professionals. If your use case is not very exotic or proprietary to your solution, I would bet that a similar plugin already exists.

One example is worth a thousand words. A few months ago, we were creating our first prototype with dbt and ran into a problem with referential integrity constraints. We had basically two options:

  1. Write custom code to solve the problem.
  2. Find a plugin that would solve the problem.

Fortunately, we found a plugin, dbt Constraints Package, and then the solution was quite simple:

dbt constraints package

Lesson learned: search for an existing solution first, before writing any code. If you still want to integrate with dbt, let's move to the next section.

Implementation: How To Integrate With dbt?

In the following sections, we cover the most important aspects of integration with dbt. If you want to explore the whole implementation, check out the repository.

Setup

Before we start writing custom code, we need to do some setup. The first important step is to create a profile file:

profile file

It's basically a configuration file with the database connection details. The interesting thing here is the split between dev and prod. If you explore the repository, you'll find that there is a CI/CD pipeline (described in How To Build A Modern Data Pipeline). The dev and prod environments ensure that every stage in the pipeline is executed with the right database.
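To make the dev/prod split concrete, here is a minimal Python sketch of how a pipeline stage could pick the right connection settings. It is an illustration only: the `PROFILE` dictionary, the `PIPELINE_TARGET` environment variable, and the `resolve_output` helper are hypothetical names, not taken from the repository, and the dictionary merely mimics the target/outputs layout of a dbt profile.

```python
import os

# Hypothetical settings that mimic the target/outputs layout of a dbt profile.
PROFILE = {
    "target": "dev",  # default target when nothing else is specified
    "outputs": {
        "dev": {"type": "postgres", "host": "localhost", "dbname": "analytics_dev"},
        "prod": {"type": "postgres", "host": "db.example.com", "dbname": "analytics_prod"},
    },
}


def resolve_output(profile, env=None):
    """Return (target_name, connection_settings) for the active environment."""
    env = os.environ if env is None else env
    # PIPELINE_TARGET is an assumed variable name that a CI/CD stage could set.
    target = env.get("PIPELINE_TARGET", profile["target"])
    if target not in profile["outputs"]:
        raise ValueError(f"Unknown target: {target}")
    return target, profile["outputs"][target]


if __name__ == "__main__":
    name, settings = resolve_output(PROFILE)
    print(f"Running against {name}: {settings['dbname']}")
```

A CI/CD stage that deploys to production would only need to set the variable; the rest of the code stays the same.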

The next step is to create a standard Python package. It allows us to run the proprietary code within the dbt environment.

setup file

The whole dbt-gooddata package is in GitLab. Within the package, we can then run commands like:

example of command
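For readers who have not built a command-line package before, the following sketch shows one common way such commands can be wired up with `argparse`. Only the program name `dbt-gooddata` comes from the article; the subcommand names and the `--workspace-id` option are assumptions made for illustration and may differ from the real package.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per deployment step."""
    parser = argparse.ArgumentParser(prog="dbt-gooddata")
    subparsers = parser.add_subparsers(dest="command", required=True)
    # Subcommand names below are hypothetical, chosen only for this sketch.
    for name, help_text in (
        ("deploy_models", "Create the data model from dbt models"),
        ("deploy_metrics", "Convert dbt metrics to GoodData metrics"),
    ):
        sub = subparsers.add_parser(name, help=help_text)
        sub.add_argument("--workspace-id", required=True)
    return parser


def main(argv=None):
    """Parse arguments and describe what would be executed."""
    args = build_parser().parse_args(argv)
    return f"{args.command} -> workspace {args.workspace_id}"


if __name__ == "__main__":
    print(main(["deploy_models", "--workspace-id", "demo"]))
```

In a packaged project, `main` would typically be registered as a console entry point in the setup file, so that the command is available after installation.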

Transformation

Transformation was crucial for our use case. The output of dbt is a set of materialized tables in the so-called output stage schema. The output stage schema is the point where GoodData connects, but in order to start creating analytics (metrics, reports, dashboards), we need to do a few things first, like connect to the data source (output stage schema) or, what is the most interesting part, convert dbt metrics to GoodData metrics.

Let's start with the basics. In GoodData, we have a concept called the Physical Data Model (PDM) that describes the tables of your database and represents how the actual data is organized and stored in the database. Based on the PDM, we also create a Logical Data Model (LDM), which is an abstract view of your data in GoodData. The LDM is a set of logical objects and their relationships that represent the data objects and their relationships in your database through the PDM.

To put it in simpler terms that are common in our industry: the PDM is tightly coupled with the database, and the LDM is tightly coupled with analytics (GoodData). Almost everything you do in GoodData (metrics, reports) is based on the LDM. Why do we use the LDM concept? Imagine you change something in your database, for example, the name of a column. If GoodData did not have the additional LDM layer, you would need to change the column name in every place (every metric, every report, etc.). With the LDM, you only change one property of the LDM, and the changes are automatically propagated throughout your analytics. There are other benefits too, but we won't cover them here; you can read about them in the documentation.
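The benefit of this extra layer can be shown with a toy Python sketch. This is not how GoodData stores its models; the dictionaries and the `to_sql` helper are hypothetical and exist only to show that metrics refer to logical identifiers, so a column rename touches a single mapping.

```python
# Logical objects mapped to physical columns (a toy stand-in for the LDM/PDM link).
ldm = {"fact.order_amount": {"table": "orders", "column": "amount"}}

# Metrics refer only to logical identifiers, never to physical column names.
metrics = {
    "revenue": ("SUM", "fact.order_amount"),
    "avg_order": ("AVG", "fact.order_amount"),
}


def to_sql(metric_id, ldm, metrics):
    """Resolve a metric through the logical layer into a SQL string."""
    aggregation, logical_id = metrics[metric_id]
    source = ldm[logical_id]
    return f'SELECT {aggregation}({source["column"]}) FROM {source["table"]}'


if __name__ == "__main__":
    print(to_sql("revenue", ldm, metrics))
    # The column is renamed in the database: only one mapping changes.
    ldm["fact.order_amount"]["column"] = "total_amount"
    print(to_sql("revenue", ldm, metrics))
    print(to_sql("avg_order", ldm, metrics))
```

Both metrics pick up the renamed column without being edited, which is the propagation effect described above.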

We covered a little theory, so let's check the more interesting part. How do we create the PDM, LDM, metrics, etc. from dbt-generated output stage schemas? First of all, a schema description is the ultimate source of truth for us:

models

You can see that we use standard dbt things like date_type, but we also introduced metadata that helps us with converting things from dbt to GoodData. For the metadata, we created data classes that guide us in the application code:

example of data class
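As a rough idea of what such a data class can look like, here is a minimal sketch. The `ColumnMeta` class, the `gooddata` metadata key, and the `ldm_type` field are assumptions made for this example; the actual classes in the repository may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnMeta:
    """Typed view of one column entry from a dbt schema description."""

    name: str
    description: str
    ldm_type: Optional[str]  # assumed values: "attribute", "fact", "date"

    @classmethod
    def from_dbt(cls, column: dict) -> "ColumnMeta":
        # The "gooddata" key under "meta" is a hypothetical convention.
        custom = column.get("meta", {}).get("gooddata", {})
        return cls(
            name=column["name"],
            description=column.get("description", ""),
            ldm_type=custom.get("ldm_type"),
        )


if __name__ == "__main__":
    raw = {"name": "order_date", "meta": {"gooddata": {"ldm_type": "date"}}}
    print(ColumnMeta.from_dbt(raw))
```

Parsing the raw dictionaries once into typed objects means the rest of the code can rely on attribute access instead of repeated key lookups.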

The data classes can be used in methods where we create LDM objects (for example, date datasets):

example of method
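The sketch below illustrates the general idea of such a method. Only the name `make_date_datasets` appears in the article; the `Column` class, the function signature, and the shape of the returned dictionaries are hypothetical and do not reproduce the GoodData API payload.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Column:
    """Minimal column description; ldm_type is an assumed metadata field."""

    name: str
    ldm_type: str


def make_date_datasets(table_name: str, columns: List[Column]) -> List[Dict]:
    """Create one date dataset description per column marked as a date."""
    datasets = []
    for column in columns:
        if column.ldm_type != "date":
            continue  # facts and attributes are handled elsewhere
        datasets.append(
            {
                "id": f"{table_name}_{column.name}",
                "title": column.name.replace("_", " ").title(),
                "granularities": ["DAY", "MONTH", "YEAR"],
            }
        )
    return datasets


if __name__ == "__main__":
    cols = [Column("order_date", "date"), Column("amount", "fact")]
    print(make_date_datasets("orders", cols))
```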

You can see that we work with metadata, which helps us convert things correctly. We use the result from the method make_date_datasets, together with other results, to create an LDM in GoodData through its API, or more precisely with the help of the GoodData Python SDK:

example of method

For those who also want to explore how we convert dbt metrics to GoodData metrics, you can check the whole implementation.

Big Picture

We understand that the previous chapter might be overwhelming. Before the demonstration, let's use one image to show how it all works.

architectural diagram

Demonstration: Generate Analytics From dbt

For the demonstration, we skip the extract part and start with transformation, which means that we need to run dbt:

example of command
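If you prefer to trigger this step from Python, for instance inside a pipeline script, a minimal sketch could look like the following. It assumes that the `dbt` executable is installed and on the PATH; the helper names and the default values are hypothetical. The `--profiles-dir` and `--target` options are standard dbt CLI flags.

```python
import subprocess


def dbt_run_command(target, profiles_dir="."):
    """Build the argument list for a dbt run against the given target."""
    return ["dbt", "run", "--profiles-dir", profiles_dir, "--target", target]


def run_dbt(target, profiles_dir="."):
    """Execute dbt and raise CalledProcessError if it exits with an error."""
    return subprocess.run(dbt_run_command(target, profiles_dir), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without dbt.
    print(" ".join(dbt_run_command("dev")))
```

Keeping the command construction separate from its execution makes it easy to log or test the command without touching the database.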

The result is an output stage schema with the following structure:

structure of database

Now, we need to get this output to GoodData to start analyzing data. Normally, you would need to do a few manual steps either in the UI or using the API / GoodData Python SDK. Thanks to the integration described in the implementation section, only one command needs to be run:

example of command

Here are the logs from the successful run:

successful result

The final result is a successfully created Logical Data Model (LDM) in GoodData:

logical data model

The last step is to deploy dbt metrics to GoodData metrics. The command is similar to the previous one:

example of command

Here are the logs from the successful run:

successful result

Now, we can check how the dbt metric was converted to a GoodData metric:

comparison of metrics
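To give a feel for what such a conversion involves, here is a simplified sketch that turns a dbt metric definition into a MAQL expression. It is not the implementation from the repository: it handles only simple aggregations, ignores filters and time dimensions, and assumes that the fact identifier in GoodData equals the dbt expression, which may not hold in a real workspace.

```python
# Mapping of a few dbt calculation methods to MAQL aggregation functions.
AGGREGATIONS = {"sum": "SUM", "average": "AVG", "min": "MIN", "max": "MAX"}


def to_maql(dbt_metric: dict) -> str:
    """Convert a simple dbt metric definition into a MAQL expression."""
    method = dbt_metric["calculation_method"]
    if method not in AGGREGATIONS:
        raise ValueError(f"Unsupported calculation method: {method}")
    # Assumption: the fact id in the LDM is the same as the dbt expression.
    fact_id = dbt_metric["expression"]
    return f"SELECT {AGGREGATIONS[method]}({{fact/{fact_id}}})"


if __name__ == "__main__":
    metric = {
        "name": "revenue",
        "calculation_method": "sum",
        "expression": "order_amount",
    }
    print(to_maql(metric))
```

Rejecting unsupported calculation methods explicitly is a deliberate choice here: a loud failure is easier to debug than a silently wrong metric.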

The most important thing is that you can now use the generated dbt metrics and build more complex metrics in GoodData. You can then build reports and dashboards and, once you are happy with the result, you can store the whole declarative analytics using one command and version it in git:

example of command

For those of you who like automation, you can take inspiration from our article where we describe how to automate data analytics using CI/CD.

What Next?

The article describes our approach to integration with dbt. It's our very first prototype, and in order to productize it, we would need to finalize a few things and then publish the integration as a stand-alone plugin. We hope that this article can serve as an inspiration for your company if you decide to integrate with dbt. If you take another approach, we would love to hear about it! Thanks for reading!

If you want to try it on your own, you can register for the GoodData trial and play with it yourself.
