In this article we will continue our dbt journey. With dbt-core installed we can ask dbt to set up a new project for us.
Simply go to the root folder where your project folder shall be created and run dbt init projectname. dbt will then ask you for the database adapter and the connection settings as well as for the schema dbt is to build its objects in.
After these questions and answers dbt tells you that you may run dbt debug now to check if everything is fine, BUT: you need to switch first to the subdirectory you created otherwise there will be a meaningless error message:
Encountered an error:
Internal Error
Profile should not be None if loading profile completed
If run in the correct subdirectory dbt debug should not produce any errors. If it does, run the error message through the search engine of your choice.
C:\Users\UWe\Desktop\fx_project>dbt debug
16:52:20 Running with dbt=1.10.13
16:52:20 dbt version: 1.10.13
16:52:20 python version: 3.12.6
16:52:20 python path: C:\Python312\python.exe
16:52:20 os info: Windows-11-10.0.26200-SP0
16:52:20 Using profiles dir at C:\Users\UWe\.dbt
16:52:20 Using profiles.yml file at C:\Users\UWe\.dbt\profiles.yml
16:52:20 Using dbt_project.yml file at C:\Users\UWe\Desktop\fx_project\dbt_project.yml
16:52:20 adapter type: postgres
16:52:20 adapter version: 1.9.1
16:52:20 Configuration:
16:52:20 profiles.yml file [OK found and valid]
16:52:20 dbt_project.yml file [OK found and valid]
16:52:20 Required dependencies:
16:52:20 - git [OK found]
16:52:20 Connection:
16:52:20 host: localhost
16:52:20 port: 5432
16:52:20 user: postgres
16:52:20 database: FX
16:52:20 schema: dbt
16:52:20 connect_timeout: 10
16:52:20 role: None
16:52:20 search_path: None
16:52:20 keepalives_idle: 0
16:52:20 sslmode: None
16:52:20 sslcert: None
16:52:20 sslkey: None
16:52:20 sslrootcert: None
16:52:20 application_name: dbt
16:52:20 retries: 1
16:52:20 Registered adapter: postgres=1.9.1
16:52:20 Connection test: [OK connection ok]
16:52:20 All checks passed!
In the project folder you find a set of subfolders, remove the models/example subfolder, as it may cause error messages.
Next we will create some basic models, using YAML files.
In models, create sources.yml with the following content:
version: 2
sources:
- name: raw
schema: raw
tables:
- name: ecb_fx_rates |
version: 2
sources:
- name: raw
schema: raw
tables:
- name: ecb_fx_rates
In models, create a subfolder staging and add stg_fx_rates.sql to this staging subfolder with the following content:
{{ config(materialized='view') }}
SELECT
fx_date,
currency,
rate,
created_at
FROM {{ SOURCE('raw', 'ecb_fx_rates') }} |
{{ config(materialized='view') }}
select
fx_date,
currency,
rate,
created_at
from {{ source('raw', 'ecb_fx_rates') }}
This will simply select all the data we have in our raw DB schema.
Next comes is the interesting part. Each row gets a valid_from and valid_to date using a window function.
In models, create a subfolder marts and add fx_rates_with_validity.sql to this marts subfolder with the following content:
{{ config(materialized='table') }}
WITH base AS (
SELECT
fx_date,
currency,
rate
FROM {{ REF('stg_fx_rates') }}
),
with_validity AS (
SELECT
currency,
fx_date AS valid_from,
-- valid_to is the day before the next rate, or null if it's the latest
lead(fx_date) OVER (
partition BY currency
ORDER BY fx_date
) - INTERVAL '1 day' AS valid_to,
rate,
-- flag the currently active rate
CASE
WHEN lead(fx_date) OVER (
partition BY currency ORDER BY fx_date
) IS NULL THEN TRUE
ELSE FALSE
END AS is_current
FROM base
)
SELECT * FROM with_validity
ORDER BY currency, valid_from |
{{ config(materialized='table') }}
with base as (
select
fx_date,
currency,
rate
from {{ ref('stg_fx_rates') }}
),
with_validity as (
select
currency,
fx_date as valid_from,
-- valid_to is the day before the next rate, or null if it's the latest
lead(fx_date) over (
partition by currency
order by fx_date
) - interval '1 day' as valid_to,
rate,
-- flag the currently active rate
case
when lead(fx_date) over (
partition by currency order by fx_date
) is null then true
else false
end as is_current
from base
)
select * from with_validity
order by currency, valid_from
This will give uzs data as the following
| currency |
valid_from |
valid_to |
rate_vs_eur |
is_current |
| USD |
2025-01-02 |
2025-01-02 |
1.0312 |
false |
| USD |
2025-01-03 |
2025-01-05 |
1.0345 |
false |
| USD |
2025-01-04 |
null |
1.0821 |
true |
It also makes sense now to add some data checks.
In models, create schema.yml with the following content:
version: 2
models:
- name: stg_fx_rates
columns:
- name: rate_date
tests: [not_null]
- name: currency
tests: [not_null]
- name: rate_vs_eur
tests: [not_null]
- name: fx_rates_with_validity
columns:
- name: currency
tests: [not_null]
- name: valid_from
tests: [not_null]
- name: is_current
tests: [not_null] |
version: 2
models:
- name: stg_fx_rates
columns:
- name: rate_date
tests: [not_null]
- name: currency
tests: [not_null]
- name: rate_vs_eur
tests: [not_null]
- name: fx_rates_with_validity
columns:
- name: currency
tests: [not_null]
- name: valid_from
tests: [not_null]
- name: is_current
tests: [not_null]
We can now run the whole setup:
dbt run
dbt test
dbt docs generate
dbt docs serve
Going forward
Since ECB publishes the latest rate every data at around 16:00 CET the Python job to fetch them should be run each day (apparently after 16:00 CET). Once the raw database is updated you can update the dbt models:
dbt run --select stg_fx_rates fx_rates_with_validity
dbt test --select stg_fx_rates fx_rates_with_validity
In the final part of the series we will add another data market for the latest rates.