Exercise#
Load data:
Jaffle Shop data : jaffle_shop data are already loaded in our postgres database:
orders
,customers
andpayments
, so there is no action needed.Google Analytics data : traffic data are located in
ga_traffic.csv
file in seeds folder.Add this code snippet into
dbt_project.yml
file at the end. This will createdbt_seeds
schema inside your database once you rundbt seed
command.seeds: +schema: seeds
Run
dbt seed --select ga_traffic
command in dbt docker container which will load these .csv files into our database. You can double check by looking intodbt_seeds
schema in DB Adminer.
Create source files: create a source file
_src_[sourcename].yml
for each data source: jaffle_shop and google_analytics in their folders.Configure source files: edit both
_src_[sourcename].yml
files according to the example.Add tests: Define
unique
test forid
column in each jaffle_shop table anddate
column in ga_traffic table.Test the solution: Test the uniqueness by running
dbt test
command in dbt container.
Solution#
_src_jaffle_shop.yml
version: 2
sources:
- name: jaffle_shop
schema: raw
tables:
- name: orders
columns:
- name: id
tests:
- unique
- name: customers
columns: id
- name:
tests:
- unique
- name: payments
columns:
- name: id
tests:
- unique
_src_google_analytics.yml
version: 2
sources:
- name: google_analytics
schema: dbt_seeds
tables:
- name: ga_traffic
columns:
- name: date
tests:
- unique