Data Build Tool (DBT) : A niche SQL based Data Transformation tool for the Modern Data Warehouse
The data engineering landscape today is like a Kaleidoscope , with multitude of tools vying for their space. The Big 3 cloud players AWS, Azure and GCP have been pitching their specific products for each of the data needs in Database, Datalake, Datawarehouse ,Dashboard & Data Science space. Niche players like Databricks, Snowflake , Apache Airflow have carved out a separate space for them and are taking the big guys head on in their respective product stacks.
Over the last few weeks, i had the opportunity to work on Data Build Tool (DBT) which has made a mark for itself in the Datawarehousing space. Unlike the other ETL tools such as Azure data factory, AWS Glue etc, DBT doesn't perform the Extract or Load in the ETL. It only does the ‘T’ : Transform in the ETL.
So an obvious question arises why would anyone want to invest on a separate tool just to perform the ‘T’ while there are sophisticated tools which does all the ‘E’,’T’ & ‘L’. The answer to this lies in the extreme agility in needs of Analytics today. Rapidly changing business rules require an equally agile system and an analytics team which can quickly put the changes into effect and start showing the results.
Traditionally the ETL engineers used to put a datawarehouse in production and any major changes to it were minimal and large organizations maintained tons of ETL pipelines running on a regular basis. Even a minor change in the ‘T’ required a change request with ETL engineer taking at least a week’s time for making the change and managing it’s life cycle before deploying it into production.
Smaller and Agile organizations today cannot afford the highly priced ETL tools and more so cannot afford the time consuming process of Analysts reaching out to ETL engineers for minor changes and waiting for days and weeks.
This is where DBT makes the difference. Once the RAW data gets loaded in the staging schema of a Datawarehouse, DBT takes over the remaining path till the loading of data to the final Data warehouse tables. The magic lies in the fact that the only skill required to operate in DBT is ‘SQL’. With in built devops and DBT available both in the cloud and command line options, Business Analysts with robust business domain experience and working knowledge of SQL can build the datawarehouse ground up. They do not need a separate dedicated ETL team for this. All further changes can be carried out by the same team of Analysts.
A landscape architecture depicting where the DBT tool fits in the current datawareouse landscape is shown below,
With a growing clientele, popularity of DBT is on the rise and is making impact to the spectrum of work that Business Analysts can now do with their SQL skills. In the next articles, i will further discuss on the Data warehousing use cases and share how DBT accomplishes it. Stay tuned ..!