Skip to main content

Fill null values for metrics

Understanding and implementing strategies to fill null values in metrics is key for accurate analytics. This guide explains fill_nulls_with and join_to_timespine to ensure data completeness, helping end users make more informed decisions and enhancing your dbt workflows.

About null values

You can use fill_nulls_with to replace null values in metrics with a value like zero (or your chosen integer). This ensures every data row shows a numeric value.

This guide explains how to ensure there are no null values in your metrics:

  • Use fill_nulls_with for simple, cumulative, and conversion metrics
  • Use join_to_timespine and fill_nulls_with together for derived and ratio metrics to avoid null values appearing.

Fill null values for simple metrics

For example, if you'd like to handle days with site visits but no leads, you can use fill_nulls_with to set the value for leads to zero on days when there are no conversions.

Let's say you have three metrics:

  • website_visits and leads
  • and a derived metric called leads_to_website_visit that calculates the ratio of leads to site visits.

On the days when there are no conversions, you can set the value for leads to zero by adding the fill_nulls_with parameter to the measure input on the leads metric:

models/metrics/website_vists.yml
metrics:
- name: website_visits
type: simple
type_params:
measure:
name: bookings
- name: leads
type: simple
type_params:
measure:
name: bookings
fill_nulls_with: 0 # This fills null values with zero
- name: leads_to_website_visit
type: derived
type_params:
expr: leads/website_visits
metrics:
- name: leads
- name: website_visits

The website_visits and leads metrics have the following data:

metric_timewebsite_visits
2024-01-0150
2024-01-0237
2024-01-0379
metric_timeleads
2024-01-015
2024-01-038
  • Note that there is no data for 2024-01-02 in the leads metric.

Although there are no days without visits, there are days without leads. After applying fill_nulls_with: 0 to the leads metric, querying these metrics together shows zero for leads on days with no conversions:

metric_timewebsite_visitsleads
2024-01-01505
2024-01-02370
2024-01-03798

Use join_to_timespine for derived and ratio metrics

To ensure you have a complete set of data for every and daily coverage for metrics calculated from other metrics, you can use join_to_timespine to fill null values for derived and ratio metrics. These metrics are built from other metrics (other calculations), not direct measures (raw data), requiring MetricFlow to have an extra subquery layer to render the metric. The subquery nesting is as follows:

  • For derived and ratio metrics, there are three levels of subquery nesting — derived or ratio metric → input metrics → input measures.
  • For simple and cumulative metrics, there are only two levels of subquery nesting — simple or cumulative metric → input measure.

Because coalesce isn't applied to the third, subquery layer for derived or ratio metrics, this means you could still have nulls in the final result set.

  • Note you can use join_to_timespine with metrics that take measure inputs as well if you want to include a row for every date, even if there is no data.

Fill null values for derived and ratio metrics

To fill null values for derived and ratio metrics, you can link them with a time spine to ensure daily data coverage. As mentioned in the previous section, this is because derived and ratio metrics take metrics as inputs instead of measures.

For example, the following structure leaves nulls in the final results (leads_to_website_visit column) because COALESCE isn't applied at the third outer rendering layer for the final metric calculation in derived metrics:

metric_timebookingsleadsleads_to_website_visit
2024-01-01505.1
2024-01-02370null
2024-01-03798.1

To display a zero value for leads_to_website_visit for 2024-01-02, you would join the leads metric to a time spine model to ensure a value for each day. This can be done by adding join_to_timespine to the measure parameter in the leads metric configuration:

models/metrics/leads.yml
- name: leads
type: simple
type_params:
measure:
name: bookings
fill_nulls_with: 0
join_to_timespine: true

Once you do this, if you query the leads metric after the timespine join, there will be a record for each day and any null values will get filled with zero.

metric_timeleadsleads_to_website_visit
2024-01-015.1
2024-01-0200
2024-01-038.1

Now, if you combine the metrics in a derived metric, there will be a zero value for leads_to_website_visit on 2024-01-02 and the final result set will not have any null values.

FAQs

 How to handle null values in derived metrics defined on top of multiple tables
0
Loading