Idiomatic Materialize SQL chart

View as Markdown

Materialize follows the SQL standard (SQL-92) implementation and strives for compatibility with the PostgreSQL dialect. However, for some use cases, Materialize provides its own idiomatic query patterns that can provide better performance.

General

Query Patterns

Idiomatic Materialize SQL Pattern
ANY() Equi-join condition

If no duplicates exist in the unnested field: Use a Common Table Expression (CTE) to UNNEST() the array of values and perform the equi-join on the unnested values.

-- array_field contains no duplicates.--

WITH my_expanded_values AS
(SELECT UNNEST(array_field) AS fieldZ FROM tableB)
SELECT a.fieldA, ...
FROM tableA a
JOIN my_expanded_values t ON a.fieldZ = t.fieldZ
;

Duplicates may exist in the unnested field: Use a Common Table Expression (CTE) to DISTINCT UNNEST() the array of values and perform the equi-join on the unnested values.

-- array_field may contain duplicates.--

WITH my_expanded_values AS
(SELECT DISTINCT UNNEST(array_field) AS fieldZ FROM tableB)
SELECT a.fieldA, ...
FROM tableA a
JOIN my_expanded_values t ON a.fieldZ = t.fieldZ
;
NOT IN (<subquery>) predicate

Rewrite to NOT EXISTS with a correlated subquery.

SELECT t1.*
FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.a = t1.a)
;

Filter out NULLs on both sides of the NOT IN.

SELECT t1.*
FROM t1
WHERE t1.a IS NOT NULL
  AND t1.a NOT IN (SELECT t2.a FROM t2 WHERE t2.a IS NOT NULL)
;
mz_now() with date/time operators Rewrite the query expression; specifically, move the operation to the other side of the comparison.
mz_now() with disjunctions (OR) in materialized/indexed view definitions and SUBSCRIBE statements

Rewrite as UNION ALL or UNION, deduplicating as necessary:

  • In some cases, you may need to modify the conditions to deduplicate results when using UNION ALL. For example, you might add the negation of one input’s condition to the other as a conjunction.

  • In some cases, using UNION instead of UNION ALL may suffice if the inputs do not contain other duplicates that need to be retained.

Examples

Example
ANY() Equi-join condition

If no duplicates in the unnested field

-- sales_items.items contains no duplicates. --

WITH individual_sales_items AS
(SELECT unnest(items) as item, week_of FROM sales_items)
SELECT s.week_of, o.order_id, o.item, o.quantity
FROM orders o
JOIN individual_sales_items s ON o.item = s.item
WHERE date_trunc('week', o.order_date) = s.week_of
ORDER BY s.week_of, o.order_id, o.item, o.quantity
;

To omit duplicates that may exist in the unnested field

-- sales_items.items may contains duplicates --

WITH individual_sales_items AS
(SELECT DISTINCT unnest(items) as item, week_of FROM sales_items)
SELECT s.week_of, o.order_id, o.item, o.quantity
FROM orders o
JOIN individual_sales_items s ON o.item = s.item
WHERE date_trunc('week', o.order_date) = s.week_of
ORDER BY s.week_of, o.order_id, o.item, o.quantity
;
NOT IN (<subquery>) predicate

Because the subquery uses UNNEST() on a column of the outer-correlated row, factor the UNNEST() into an uncorrelated Common Table Expression (CTE) first.

Rewrite to NOT EXISTS with a CTE for the UNNEST()

WITH this_weeks_sales AS (
  SELECT unnest(items) AS sale_item
  FROM sales_items
  WHERE week_of = date_trunc('week', current_timestamp)
)
SELECT i.item, i.price
FROM items i
WHERE NOT EXISTS (
  SELECT 1 FROM this_weeks_sales s WHERE s.sale_item = i.item
)
ORDER BY i.item
;

Filter out NULLs with a CTE for the UNNEST()

WITH this_weeks_sales AS (
  SELECT unnest(items) AS sale_item
  FROM sales_items
  WHERE week_of = date_trunc('week', current_timestamp)
)
SELECT i.item, i.price
FROM items i
WHERE i.item IS NOT NULL
  AND i.item NOT IN (
    SELECT sale_item FROM this_weeks_sales WHERE sale_item IS NOT NULL
  )
ORDER BY i.item
;
mz_now() with date/time operators
SELECT * from orders
WHERE mz_now() > order_date + INTERVAL '5min'
;
mz_now() with disjunctions (OR) in materialized/indexed view definitions and SUBSCRIBE statements

Rewrite as UNION ALL with possible duplicates

CREATE MATERIALIZED VIEW forecast_completed_orders_duplicates_possible AS
SELECT item, quantity, status from orders
WHERE status = 'Shipped'
UNION ALL
SELECT item, quantity, status from orders
WHERE order_date + interval '30' minutes >= mz_now()
;

Rewrite as UNION ALL that avoids duplicates across queries

CREATE MATERIALIZED VIEW forecast_completed_orders_deduplicated_union_all AS
SELECT item, quantity, status from orders
WHERE status = 'Shipped'
UNION ALL
SELECT item, quantity, status from orders
WHERE order_date + interval '30' minutes >= mz_now()
AND status != 'Shipped' -- Deduplicate by excluding those with status 'Shipped'
;

Rewrite as UNION to deduplicate any and all duplicated results

CREATE MATERIALIZED VIEW forecast_completed_orders_deduplicated_results AS
SELECT item, quantity, status from orders
WHERE status = 'Shipped'
UNION
SELECT item, quantity, status from orders
WHERE order_date + interval '30' minutes >= mz_now()
;

Window Functions

Materialize and window functions

For indexed views and materialized views that contain window functions (including aggregate functions used with an OVER clause), when an input record in a partition is added/removed/changed, Materialize recomputes the results from scratch for that partition (instead of using incremental computation).

The PARTITION BY clause of your window function determines your partitions. If PARTITION BY is omitted, all records belong to a single partition (i.e., any record change results in a recomputation from scratch over the whole input).

To avoid performance issues that may arise as the number of records grows, consider rewriting your indexed views and materialized views to use idiomatic Materialize SQL instead of window functions. If your view definitions cannot be rewritten without the window functions and the performance of window functions is insufficient for your use case, please contact our team.

Query Patterns

Idiomatic Materialize SQL Pattern
Top-K over partition
(K >= 1)

Use a subquery to SELECT DISTINCT on the grouping key (e.g., fieldA), and perform a LATERAL join (by the grouping key fieldA) with another subquery that specifies the ordering (e.g., fieldZ [ASC|DESC]) and the limit K.

SELECT fieldA, fieldB, ...
FROM (SELECT DISTINCT fieldA FROM tableA) grp,
     LATERAL (SELECT fieldB, ... , fieldZ FROM tableA
        WHERE fieldA = grp.fieldA
        ORDER BY fieldZ ... LIMIT K)   -- K is a number >= 1
ORDER BY fieldA, fieldZ ... ;
Top-K over partition
(K = 1)
SELECT DISTINCT ON(fieldA) fieldA, fieldB, ...
FROM tableA
ORDER BY fieldA, fieldZ ... ;
First value over partition
order by …

Use a subquery that uses the MIN() or MAX() aggregate function.

SELECT tableA.fieldA, tableA.fieldB, minmax.Z
 FROM tableA,
 (SELECT fieldA,
    MIN(fieldZ),
    MAX(fieldZ)
 FROM tableA
 GROUP BY fieldA) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
Last value over partition
order by …
range between unbounded preceding
and unbounded following

Use a subquery that uses the MIN() or MAX() aggregate function.

SELECT tableA.fieldA, tableA.fieldB, minmax.Z
 FROM tableA,
 (SELECT fieldA,
    MAX(fieldZ),
    MIN(fieldZ)
 FROM tableA
 GROUP BY fieldA) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
Lag over (order by) whose ordering can be represented by some equality condition.

To exclude the first row since it has no previous row

-- Excludes the first row in the results --
SELECT t1.fieldA, t2.fieldB as previous_row_value
FROM tableA t1, tableA t2
WHERE t1.fieldA = t2.fieldA + ... -- or some other operand
ORDER BY fieldA;

To include the first row

-- Includes the first row in the results --
SELECT t1.fieldA, t2.fieldB as previous_row_value
FROM tableA t1
LEFT JOIN tableA t2
ON t1.fieldA = t2.fieldA + ... -- or some other operand
ORDER BY fieldA;
Lead over (order by) whose ordering can be represented by some equality condition.

To exclude the last row since it has no next row

-- Excludes the last row in the results --
SELECT t1.fieldA, t2.fieldB as next_row_value
FROM tableA t1, tableA t2
WHERE t1.fieldA = t2.fieldA - ... -- or some other operand
ORDER BY fieldA;

To include the last row

-- Includes the last row in the results --
SELECT t1.fieldA, t2.fieldB as next_row_value
FROM tableA t1
LEFT JOIN tableA t2
ON t1.fieldA = t2.fieldA - ... -- or some other operand
ORDER BY fieldA;

Examples

Example
Top-K over partition
(K >= 1)
SELECT order_id, item, subtotal
FROM (SELECT DISTINCT order_id FROM orders_view) grp,
     LATERAL (SELECT item, subtotal FROM orders_view
        WHERE order_id = grp.order_id
        ORDER BY subtotal DESC LIMIT 3)
ORDER BY order_id, subtotal DESC;
Top-K over partition
(K = 1)
SELECT DISTINCT ON(order_id) order_id, item, subtotal
FROM orders_view
ORDER BY order_id, subtotal DESC;
First value over partition
order by …
SELECT o.order_id, minmax.lowest_price, minmax.highest_price, o.item, o.price,
  o.price - minmax.lowest_price AS diff_lowest_price,
  o.price - minmax.highest_price AS diff_highest_price
FROM orders_view o,
      (SELECT order_id,
         MIN(price) AS lowest_price,
         MAX(price) AS highest_price
      FROM orders_view
      GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
Last value over partition
order by …
range between unbounded preceding
and unbounded following
SELECT o.order_id, minmax.lowest_price, minmax.highest_price, o.item, o.price,
  o.price - minmax.lowest_price AS diff_lowest_price,
  o.price - minmax.highest_price AS diff_highest_price
FROM orders_view o,
      (SELECT order_id,
         MIN(price) AS lowest_price,
         MAX(price) AS highest_price
      FROM orders_view
      GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
Lag over (order by) whose ordering can be represented by some equality condition.

To exclude the first row since it has no previous row

SELECT o1.order_date, o1.daily_total,
    o2.daily_total as previous_daily_total
FROM orders_daily_totals o1, orders_daily_totals o2
WHERE o1.order_date = o2.order_date + INTERVAL '1' DAY
ORDER BY order_date;

To include the first row

SELECT o1.order_date, o1.daily_total,
    o2.daily_total as previous_daily_total
FROM orders_daily_totals o1
LEFT JOIN orders_daily_totals o2
ON o1.order_date = o2.order_date + INTERVAL '1' DAY
ORDER BY order_date;
Lead over (order by) whose ordering can be represented by some equality condition.

To exclude the last row since it has no next row

SELECT o1.order_date, o1.daily_total,
    o2.daily_total as next_daily_total
FROM orders_daily_totals o1, orders_daily_totals o2
WHERE o1.order_date = o2.order_date - INTERVAL '1' DAY
ORDER BY order_date;

To include the last row

SELECT o1.order_date, o1.daily_total,
    o2.daily_total as next_daily_total
FROM orders_daily_totals o1
LEFT JOIN orders_daily_totals o2
ON o1.order_date = o2.order_date - INTERVAL '1' DAY
ORDER BY order_date;

See also

Back to top ↑