Anyone who has worked with dbt knows that one of its most powerful features is packages. Packages allow you to reuse code across multiple projects in an organization. They can be installed from other projects in your organization, or you can use the dbt hub to install packages from the open-source community and Fishtown Analytics.

I am going to illustrate the power of packages by focusing on one macro in a package called codegen from dbt. This package has some "accelerators" in it to help new project implementations generate base code. The macro I want to highlight is called generate_source, which helps generate sources.yml files. Sources are any objects that will be used in a dbt project but do not contain code that will be maintained in the dbt project. These source definitions go in a YAML file called sources.yml.

After installing the package, you just need to invoke the macro, which I did via a run operation, passing as parameters the schema I want to generate my sources from and whether or not I want the list of columns for each source.

I can hear you asking yourself right now: what if I don't want to import all the objects in a schema? What if that schema contains a mix of objects that will be sources and models? Sorry, by default the macro does not allow you to pick specific sources; it assumes you have schemas that contain only sources. So that is a problem with this macro... or at least it would be if it were a traditional ETL tool. Since these macros are just code, you can edit them!

I recently ran into this issue while implementing dbt on a semi-mature enterprise data warehouse in Snowflake. This environment has a multi-schema architecture with different data layers, while also segregating the data based on its functional contents, such as common dimensions, revenue, people, etc. All of this is to say that our transformations are built using sources that can be in different schemas and are intermixed with tables and views that will become dbt models. Because of this, the default behavior of the macro is not very valuable, as I would have to do a lot of editing to remove "sources" that will actually become models. Fortunately, there is a custom control table in this data warehouse that stores a list of all the views that are used as sources, so I just started tracing the code to see how I could modify the macro.
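To make the control-table idea concrete, here is a minimal sketch in Python of the filtering step. It is not the codegen macro or my modified version of it (those are Jinja and SQL); it only illustrates the logic of keeping the objects that a control list names as sources and writing them out in the shape of a sources.yml file. The function names, the object names, the "revenue" schema, and the choice to lowercase identifiers are all assumptions made up for the example.

```python
from typing import Iterable, List


def filter_sources(relations: Iterable[str], control_list: Iterable[str]) -> List[str]:
    """Keep only the relations named in the control list.

    Matching is case-insensitive and the result is lowercased and sorted
    (both are assumptions for this sketch, not behavior taken from codegen).
    """
    allowed = {name.lower() for name in control_list}
    return sorted(r.lower() for r in relations if r.lower() in allowed)


def render_sources_yaml(schema_name: str, tables: Iterable[str]) -> str:
    """Render a minimal sources file for a single schema as plain text."""
    lines = [
        "version: 2",
        "",
        "sources:",
        f"  - name: {schema_name}",
        "    tables:",
    ]
    lines += [f"      - name: {table}" for table in tables]
    return "\n".join(lines)


# Hypothetical contents of one schema: a mix of future models and true sources.
schema_objects = ["DIM_CUSTOMER", "V_SRC_ORDERS", "V_SRC_INVOICES", "FCT_REVENUE"]

# Hypothetical rows from the control table that lists the views used as sources.
control_table_rows = ["v_src_orders", "v_src_invoices"]

kept = filter_sources(schema_objects, control_table_rows)
sources_yaml = render_sources_yaml("revenue", kept)
print(sources_yaml)
```

Only the two views named in the control list end up in the rendered file; the objects that will become dbt models are left out, which is the manual editing I wanted to avoid.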