First value in group

View as Markdown

Overview

The “first value in each group” query pattern returns the first value, according to some ordering, in each group.

Materialize and window functions

For indexed views and materialized views that contain window functions (including aggregate functions used with an OVER clause), when an input record in a partition is added/removed/changed, Materialize recomputes the results from scratch for that partition (instead of using incremental computation).

The PARTITION BY clause of your window function determines your partitions. If PARTITION BY is omitted, all records belong to a single partition (i.e., any record change results in a recomputation from scratch over the whole input).

To avoid performance issues that may arise as the number of records grows, consider rewriting your indexed views and materialized views to use idiomatic Materialize SQL instead of window functions. If your view definitions cannot be rewritten without the window functions and the performance of window functions is insufficient for your use case, please contact our team.

Idiomatic Materialize SQL

Idiomatic Materialize SQL: To find the first value in each group, use MIN() or MAX() aggregate function in a subquery.

Materialize SQL

Use a subquery that uses the MIN() or MAX() aggregate function.

SELECT tableA.fieldA, tableA.fieldB, minmax.Z
 FROM tableA,
 (SELECT fieldA,
    MIN(fieldZ),
    MAX(fieldZ)
 FROM tableA
 GROUP BY fieldA) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
Anti-pattern

Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...) window function for first value within groups queries.

-- Anti-pattern. Avoid. --
SELECT fieldA, fieldB,
 FIRST_VALUE(fieldZ) OVER (PARTITION BY fieldA ORDER BY ...),
 FIRST_VALUE(fieldZ) OVER (PARTITION BY fieldA ORDER BY ... DESC)
FROM tableA
ORDER BY fieldA, ...;

Query hints

To further improve the memory usage of the idiomatic Materialize SQL, you can specify a AGGREGATE INPUT GROUP SIZE query hint in the idiomatic Materialize SQL.

SELECT tableA.fieldA, tableA.fieldB, minmax.Z
 FROM tableA,
 (SELECT fieldA,
    MIN(fieldZ),
    MAX(fieldZ)
 FROM tableA
 GROUP BY fieldA
 OPTIONS (AGGREGATE INPUT GROUP SIZE = ...)
 ) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;

For more information on setting AGGREGATE INPUT GROUP SIZE, see Optimization.

Examples

NOTE: The example data can be found in the Appendix.

Use MIN() to find the first value

Using idiomatic Materialize SQL, the following example finds the lowest item price in each order and calculates the difference between the price of each item in the order and the lowest price. The example uses a subquery that groups by the order_id and selects MIN(price) to find the lowest price (i.e., first value if ordered by ascending price values).

Materialize SQL
SELECT o.order_id, minmax.lowest_price, o.item, o.price,
  o.price - minmax.lowest_price AS diff_lowest_price
FROM orders_view o,
      (SELECT order_id,
         MIN(price) AS lowest_price
      FROM orders_view
      GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
Anti-pattern

Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...) window function for first value within groups queries.

-- Anti-pattern --
SELECT order_id,
  FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price) AS lowest_price,
  item,
  price,
  price - FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price) AS diff_lowest_price
FROM orders_view
ORDER BY order_id, item;

Use MAX() to find the first value

Using idiomatic Materialize SQL, the following example finds the highest item price in each order and calculates the difference between the price of each item in the order and the highest price. The example uses a subquery that groups by the order_id and selects MAX(price) to find the highest price (i.e., first value if ordered by descending price values).

Materialize SQL
SELECT o.order_id, minmax.highest_price, o.item, o.price,
  o.price - minmax.highest_price AS diff_highest_price
FROM orders_view o,
      (SELECT order_id,
         MAX(price) AS highest_price
      FROM orders_view
      GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
Anti-pattern

Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...) window function for first value within groups queries.

-- Anti-pattern --
SELECT order_id,
  FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price DESC) AS highest_price,
  item,
  price,
  price - FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price DESC) AS diff_highest_price
FROM orders_view
ORDER BY order_id, item;

Use MIN() and MAX() to find the first values

Using idiomatic Materialize SQL, the following example finds the lowest and the highest item price in each order and calculates the difference between each item in the order and these prices. The example uses a subquery that groups by the order_id and selects MIN(price) as the lowest price (i.e., first value if ordered by price values) and MAX(price) as the highest price (i.e., first value if ordered by descending price values).

Materialize SQL
SELECT o.order_id, minmax.lowest_price, minmax.highest_price, o.item, o.price,
  o.price - minmax.lowest_price AS diff_lowest_price,
  o.price - minmax.highest_price AS diff_highest_price
FROM orders_view o,
      (SELECT order_id,
         MIN(price) AS lowest_price,
         MAX(price) AS highest_price
      FROM orders_view
      GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
Anti-pattern

Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...) window function for first value within groups queries.

-- Anti-pattern --
SELECT order_id,
  FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price) AS lowest_price,
  FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price DESC) AS highest_price,
  item,
  price,
  price - FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price) AS diff_lowest_price,
  price - FIRST_VALUE(price)
    OVER (PARTITION BY order_id ORDER BY price DESC) AS diff_highest_price
FROM orders_view
ORDER BY order_id, item;

See also

Back to top ↑