Technical Deep Dive: dbt

An open source tool for agile analytics.

Another tool under the belt

Open-source tools are a key part of my analytics workflow. I built my first dashboards in R-Studio and Python before moving to Tableau and Looker. dbt is another open-source analytics tool that’s revolutionizing the workflow for data analysis and I just had to try it out.

dbt stands for Data Build Tool and it’s name is exactly what it does. It allows analysts to easily take on tasks that previously have been reserved for data engineers. Now, an analyst (or analytics engineer) can stand up data views and enable analytics quickly without affecting the integrity of core data sources.

This open-source transformation tool has fundamentally changed how analysts work with data by bringing software engineering best practices to the analytics workflow. Data wrangling is a key part of analytics. I’ve used DDL, LookML, and even CSV uploads to create tables and views to enable analytics. dbt has the ability to make that process a lot more enjoyable.

Then and now

Traditionally, data transformation was a complex web of SQL scripts, stored procedures, and manual processes. Analysts would write SQL queries in various tools, maintain complicated documentation, and hope nothing would break when changes were made. Version control? That meant keeping multiple copies of files with dates in their names. Testing? Often an afterthought.

Enter dbt, which brings structure, reliability, and engineering principles to the analytics workflow. In the comparison below, dbt moves the data process to the right, where data is loaded then transformed via dbt. It reduces the initial data engineering load and allows for more flexibility downstream (Source: AWS).

Things I learned about dbt

What makes dbt different is its ability to treat analytics code as a true software engineering project.

  1. Version Control: Every transformation is tracked, versioned, and documented. No more wondering which version of a query is current or who made what changes.

  2. Modular Development: Transformations are broken down into small, reusable models. Need to calculate customer lifetime value? Create it once, reference it everywhere.

  3. Testing and Documentation: Built-in testing ensures data quality and consistent business logic. Automated documentation keeps everyone on the same page.

  4. Dependencies Management: dbt automatically handles the order of operations, showing you exactly how your data flows from raw sources to final outputs.

My favorite feature of dbt is its ability to automatically generate a map of the data flow, or a Directed Acyclic Graph (DAG). You can see below how staging tables are transformed to become other data views used for analysis (Source: dbt blog).

What it means for your day

For analysts, dbt can help with productivity and responding to higher impact ad-hoc requests. Things like speed and quality can balance better when collaborative efforts are documented easily and checked routinely for integrity. For leaders, it means getting dashboards and reports faster with more accurate data. It also means that requests that might have been impossible in the past aren’t anymore. With a clear view of how tables and views are built in dbt, it’s much easier to ramp up quickly and get output. For fast growing and nimble businesses, this can make a huge difference in impact.

The rise of dbt signals a broader shift in how teams can handle data transformation:

  1. Higher Standards: Basic SQL skills aren't enough anymore. It’s better to expect reliable, tested, and documented transformations.

  2. Better Collaboration: Analytics teams can work together more effectively, sharing and reusing each other's work.

  3. Faster Innovation: With solid foundations in place, analysts can spend more time on advanced analytics and less time debugging data issues.

As a fractional analyst, analytics engineering is a crucial component of my work. Scrappy teams need an end-to-end analytics support system that includes a streamlined data pipeline and workflow. dbt is one tool that will surely help me provide that support!