Data sources - _src_[sourcename].yml
Inside each source subfolder (jaffle_shop and google_analytics) in the models/staging folder, we need to create a _src_[sourcename].yml YAML file that references the tables holding the raw data. Sources make it possible to name and describe the data loaded into your warehouse. We will set up these files in the exercise on the next page.
📝 example of _src_[sourcename].yml
version: 2

sources:
  - name: test
    schema: raw  # schema where the raw tables sit
    description: "[Optional]"  # describe the source (quoted so YAML reads it as text, not a list)
    tables:
      - name: table_name
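Descriptions can be attached to individual tables as well as to the source. A minimal sketch, assuming the jaffle_shop source and an orders table in the raw schema; the description strings are illustrative placeholders, not text taken from the project:

```yaml
version: 2

sources:
  - name: jaffle_shop
    schema: raw
    description: Raw data loaded from the jaffle_shop application  # assumed wording
    tables:
      - name: orders
        description: One record per order  # assumed wording
```

Descriptions are optional, so they can be added gradually as the project grows.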
Defining tests
Additionally, in _src_[sourcename].yml we can add tests that, for example, ensure a column contains no duplicates (unique) or no null values (not_null). Once these tests are defined, you can run them with dbt test from the command line to check that the data satisfies them.
version: 2

sources:
  - name: test
    schema: raw  # schema where the raw tables sit
    description: "[Optional]"  # describe the source (quoted so YAML reads it as text, not a list)
    tables:
      - name: table_name
        columns:
          - name: id
            tests:
              - unique
              - not_null
📝 example for _src_jaffle_shop.yml
version: 2

sources:
  - name: jaffle_shop
    schema: raw
    tables:
      - name: orders
        columns:
          - name: id
            tests:
              - unique
      - name: customers
        columns:
          - name: id
            tests:
              - unique
      - name: payments
        columns:
          - name: id
            tests:
              - unique
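The google_analytics subfolder needs the same kind of file. A minimal sketch of what it might look like, assuming the data also sits in the raw schema and using a hypothetical events table with an id column (the actual table and column names are not given here and must be replaced with the real ones):

```yaml
# models/staging/google_analytics/_src_google_analytics.yml
version: 2

sources:
  - name: google_analytics
    schema: raw  # assumed: same raw schema as jaffle_shop
    tables:
      - name: events  # hypothetical table name
        columns:
          - name: id  # hypothetical key column
            tests:
              - unique
              - not_null
```

Both tests are listed here only to illustrate combining them on one column; keep whichever checks match the actual data.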