# Explain plan operators Materialize offers several output formats for [`EXPLAIN PLAN`](/sql/explain-plan/) and debugging. LIR plans as rendered in [`mz_introspection.mz_lir_mapping`](../../reference/system-catalog/mz_introspection/#mz_lir_mapping) are deliberately succinct, while the plans in other formats give more detail. The decorrelated and optimized plans from `EXPLAIN DECORRELATED PLAN FOR ...`, `EXPLAIN LOCALLY OPTIMIZED PLAN FOR ...`, and `EXPLAIN OPTIMIZED PLAN FOR ...` are in a mid-level representation that is closer to LIR than SQL. The raw plans from `EXPLAIN RAW PLAN FOR ...` are closer to SQL (and therefore less indicative of how the query will actually run). **In fully optimized physical (LIR) plans (Default):** The following table lists the operators that are available in the LIR plan. - For those operators that require memory to maintain intermediate state, **Uses memory** is marked with **Yes**. - For those operators that expand the data size (either rows or columns), **Can increase data size** is marked with **Yes**.| Operator | Description | Example | | --- | --- | --- | | **Constant** | Always produces the same collection of rows. **Can increase data size:** No **Uses memory:** No | →Constant (2 rows) | | **Stream, Arranged, Index Lookup, Read** |

Produces rows from either an existing relation (source/view/materialized view/table) or from a previous CTE in the same plan. A parent Fused Map/Filter/Project operator can combine with this operator.

There are four types of Get.

Stream indicates that the results are not arranged in memory and will be streamed directly.
Arranged indicates that the results are arranged in memory.
Index Lookup indicates the results will be looked up in an existing [arrangement]((/get-started/arrangements/#arrangements).
Read indicates that the results are unarranged, and will be processed as they arrive.

**Can increase data size:** No **Uses memory:** No | Arranged materialize.public.t | | **Map/Filter/Project** |

Computes new columns (maps), filters columns, and projects away columns. Works row-by-row. Maps and filters will be printed, but projects will not.

These may be marked as Fused Map/Filter/Project, which means they will combine with the operator beneath them to run more efficiently.

**Can increase data size:** Each row may have more data, from the Map. Each row may also have less data, from the Project. There may be fewer rows, from the Filter. **Uses memory:** No |

→Map/Filter/Project   Filter: (#0{a} < 7)   Map: (#0{a} + #1{b})

| | **Table Function** |

Appends the result of some (one-to-many) table function to each row in the input.

A parent Fused Table Function unnest_list operator will fuse with its child GroupAggregate operator. Fusing these operator is part of how we efficiently compile window functions from SQL to dataflows.

A parent Fused Map/Filter/Project can combine with this operator.

**Can increase data size:** Depends on the table function used. **Uses memory:** No |

→Table Function generate_series(#0{a}, #1{b}, 1)   Input key: (#0{a})

| | **Differential Join, Delta Join** |

Both join operators indicate the join ordering selected.

Returns combinations of rows from each input whenever some equality predicates are true.

Joins will indicate the join order of their children, starting from 0. For example, Differential Join %1 » %0 will join its second child into its first.

The two joins differ in performance characteristics.

**Can increase data size:** Depends on the join order and facts about the joined collections. **Uses memory:** ✅ Uses memory for 3-way or more differential joins. |

→Differential Join %1 » %0   Join stage %0: Lookup key #0{a} in %0

| | **GroupAggregate** |

Groups the input rows by some scalar expressions, reduces each group using some aggregate functions, and produces rows containing the group key and aggregate outputs.

There are five types of GroupAggregate, ordered by increasing complexity:

Distinct GroupAggregate corresponds to the SQL DISTINCT operator.
Accumulable GroupAggregate (e.g., SUM, COUNT) corresponds to several easy to implement aggregations that can be executed simultaneously and efficiently.
Hierarchical GroupAggregate (e.g., MIN, MAX) corresponds to an aggregation requiring a tower of arrangements. These can be either monotonic (more efficient) or bucketed. These may benefit from a hint; see mz_introspection.mz_expected_group_size_advice. These may either be bucketed or monotonic (more efficient). These may consolidate their output, which will increase memory usage.
Collated Multi-GroupAggregate corresponds to an arbitrary mix of reductions of different types, which will be performed separately and then joined together.
Non-incremental GroupAggregate (e.g., window functions, list_agg) corresponds to a single non-incremental aggregation. These are the most computationally intensive reductions.

A parent Fused Map/Filter/Project can combine with this operator.

**Can increase data size:** No **Uses memory:** ✅ Distinct and Accumulable aggregates use a moderate amount of memory (proportional to twice the output size). MIN and MAX aggregates can use significantly more memory. This can be improved by including group size hints in the query, see mz_introspection.mz_expected_group_size_advice. Non-incremental aggregates use memory proportional to the input + output size. Collated aggregates use memory that is the sum of their constituents, plus some memory for the join at the end. |

→Accumulable GroupAggregate   Simple aggregates: count(*)   Post-process Map/Filter/Project     Filter: (#0 > 1)

| | **TopK** |

Groups the input rows, sorts them according to some ordering, and returns at most K rows at some offset from the top of the list, where K is some (possibly computed) limit.

There are three types of TopK. Two are special cased for monotonic inputs (i.e., inputs which never retract data).

Monotonic Top1.
Monotonic TopK, which may give an expression indicating the limit.
Non-monotonic TopK, a generic TopK plan.

Each version of the TopK operator may include grouping, ordering, and limit directives.

**Can increase data size:** No **Uses memory:** ✅ Monotonic Top1 and Monotonic TopK use a moderate amount of memory. Non-monotonic TopK uses significantly more memory as the operator can significantly overestimate the group sizes. Consult mz_introspection.mz_expected_group_size_advice. |

→Consolidating Monotonic TopK   Order By #1 asc nulls_last, #0 desc nulls_first   Limit 5

| | **Negate Diffs** | Negates the row counts of the input. This is usually used in combination with union to remove rows from the other union input. **Can increase data size:** No **Uses memory:** No | →Negate Diffs | | **Threshold Diffs** | Removes any rows with negative counts. **Can increase data size:** No **Uses memory:** ✅ Uses memory proportional to the input and output size, twice. | →Threshold Diffs | | **Union** | Combines its inputs into a unified output, emitting one row for each row on any input. (Corresponds to UNION ALL rather than UNION/UNION DISTINCT.) **Can increase data size:** No **Uses memory:** ✅ A Consolidating Union will make moderate use of memory, particularly at hydration time. A Union that is not Consolidating will not consume memory. | →Consolidating Union | | **Arrange** | Indicates a point that will become an arrangement in the dataflow engine, i.e., it will consume memory to cache results. **Can increase data size:** No **Uses memory:** ✅ Uses memory proportional to the input size. Note that in the LIR / physical plan, Arrange/ArrangeBy almost always means that an arrangement will actually be created. (This is in contrast to the “optimized” plan, where an ArrangeBy being present in the plan often does not mean that an arrangement will actually be created.) |

→Arrange     Keys: 1 arrangement available, plus raw stream       Arrangement 0: #0

| | **Unarranged Raw Stream** | Indicates a point where data will be streamed (even if it is somehow already arranged). **Can increase data size:** No **Uses memory:** No | →Unarranged Raw Stream | | **With ... Return ...** | Introduces CTEs, i.e., makes it possible for sub-plans to be consumed multiple times by downstream operators. **Can increase data size:** No **Uses memory:** No | See Reading plans | **Notes:** - **Can increase data size:** Specifies whether the operator can increase the data size (can be the number of rows or the number of columns). - **Uses memory:** Specifies whether the operator use memory to maintain state for its inputs. **In decorrelated and optimized plans:** The following table lists the operators that are available in the optimized plan. - For those operators that require memory to maintain intermediate state, **Uses memory** is marked with **Yes**. - For those operators that expand the data size (either rows or columns), **Can increase data size** is marked with **Yes**.| Operator | Description | Example | | --- | --- | --- | | **Constant** | Always produces the same collection of rows. **Can increase data size:** No **Uses memory:** No |

Constant - ((1, 2) x 2) - (3, 4)

| | **Get** | Produces rows from either an existing relation (source/view/materialized view/table) or from a previous CTE in the same plan. **Can increase data size:** No **Uses memory:** No | Get materialize.public.ordered | | **Project** | Produces a subset of the columns in the input rows. See also column numbering. **Can increase data size:** No **Uses memory:** No | Project (#2, #3) | | **Map** | Appends the results of some scalar expressions to each row in the input. **Can increase data size:** Each row has more data (i.e., longer rows but same number of rows). **Uses memory:** No | Map (((#1 * 10000000dec) / #2) * 1000dec) | | **FlatMap** | Appends the result of some (one-to-many) table function to each row in the input. **Can increase data size:** Depends on the table function used. **Uses memory:** No | FlatMap jsonb_foreach(#3) | | **Filter** | Removes rows of the input for which some scalar predicates return false. **Can increase data size:** No **Uses memory:** No | Filter (#20 < #21) | | **Join** | Returns combinations of rows from each input whenever some equality predicates are true. **Can increase data size:** Depends on the join order and facts about the joined collections. **Uses memory:** ✅ The Join operator itself uses memory only for type=differential with more than 2 inputs. However, Join operators need arrangements on their inputs (shown by the ArrangeBy operator). These arrangements use memory proportional to the input sizes. If an input has an appropriate index, then the arrangement of the index will be reused. | Join on=(#1 = #2) type=delta | | **CrossJoin** | An alias for a Join with an empty predicate (emits all combinations). Note that not all cross joins are marked as CrossJoin: In a join with more than 2 inputs, it can happen that there is a cross join between some of the inputs. You can recognize this case by ArrangeBy operators having empty keys, i.e., ArrangeBy keys=[[]]. **Can increase data size:** Cartesian product of the inputs (\|N\| x \|M\|). **Uses memory:** ✅ Uses memory for 3-way or more differential joins. | CrossJoin type=differential | | **Reduce** | Groups the input rows by some scalar expressions, reduces each group using some aggregate functions, and produces rows containing the group key and aggregate outputs. **Can increase data size:** No **Uses memory:** ✅ SUM, COUNT, and most other aggregations use a moderate amount of memory (proportional either to twice the output size or to input size + output size). MIN and MAX aggregates can use significantly more memory. This can be improved by including group size hints in the query, see mz_introspection.mz_expected_group_size_advice. | Reduce group_by=[#0] aggregates=[max((#0 * #1))] | | **Distinct** | Alias for a Reduce with an empty aggregate list. **Can increase data size:** No **Uses memory:** ✅ Uses memory proportional to twice the output size. | Distinct | | **TopK** | Groups the input rows by some scalar expressions, sorts each group using the group key, removes the top offset rows in each group, and returns the next limit rows. **Can increase data size:** No **Uses memory:** ✅ Can use significant amount as the operator can significantly overestimate the group sizes. Consult mz_introspection.mz_expected_group_size_advice. | TopK order_by=[#1 asc nulls_last, #0 desc nulls_first] limit=5 | | **Negate** | Negates the row counts of the input. This is usually used in combination with union to remove rows from the other union input. **Can increase data size:** No **Uses memory:** No | Negate | | **Threshold** | Removes any rows with negative counts. **Can increase data size:** No **Uses memory:** ✅ Uses memory proportional to the input and output size, twice. | Threshold | | **Union** | Sums the counts of each row of all inputs. (Corresponds to UNION ALL rather than UNION/UNION DISTINCT.) **Can increase data size:** No **Uses memory:** ✅ Moderate use of memory. Some union operators force consolidation, which results in a memory spike, largely at hydration time. | Union | | **ArrangeBy** | Indicates a point that will become an arrangement in the dataflow engine (each keys element will be a different arrangement). Note that if an appropriate index already exists on the input or the output of the previous operator is already arranged with a key that is also requested here, then this operator will just pass on that existing arrangement instead of creating a new one. **Can increase data size:** No **Uses memory:** ✅ Depends. If arrangements need to be created, they use memory proportional to the input size. | ArrangeBy keys=[[#0]] | | **With ... Return ...** | Introduces CTEs, i.e., makes it possible for sub-plans to be consumed multiple times by downstream operators. **Can increase data size:** No **Uses memory:** No | See Reading plans | **Notes:** - **Can increase data size:** Specifies whether the operator can increase the data size (can be the number of rows or the number of columns). - **Uses memory:** Specifies whether the operator use memory to maintain state for its inputs. **In raw plans:** The following table lists the operators that are available in the raw plan. - For those operators that require memory to maintain intermediate state, **Uses memory** is marked with **Yes**. - For those operators that expand the data size (either rows or columns), **Can increase data size** is marked with **Yes**.| Operator | Description | Example | | --- | --- | --- | | **Constant** | Always produces the same collection of rows. **Can increase data size:** No **Uses memory:** No |

Constant - ((1, 2) x 2) - (3, 4)

| | **Get** | Produces rows from either an existing relation (source/view/materialized view/table) or from a previous CTE in the same plan. **Can increase data size:** No **Uses memory:** No | Get materialize.public.ordered | | **Project** | Produces a subset of the columns in the input rows. See also column numbering. **Can increase data size:** No **Uses memory:** No | Project (#2, #3) | | **Map** | Appends the results of some scalar expressions to each row in the input. **Can increase data size:** Each row has more data (i.e., longer rows but same number of rows). **Uses memory:** No | Map (((#1 * 10000000dec) / #2) * 1000dec) | | **CallTable** | Appends the result of some (one-to-many) table function to each row in the input. **Can increase data size:** Depends on the table function used. **Uses memory:** No | CallTable generate_series(1, 7, 1) | | **Filter** | Removes rows of the input for which some scalar predicates return false. **Can increase data size:** No **Uses memory:** No | Filter (#20 < #21) | | **~Join** | Performs one of INNER / LEFT / RIGHT / FULL OUTER / CROSS join on the two inputs, using the given predicate. **Can increase data size:** For CrossJoins, Cartesian product of the inputs (\|N\| x \|M\|). Note that, in many cases, a join that shows up as a cross join in the RAW PLAN will actually be turned into an inner join in the OPTIMIZED PLAN, by making use of an equality WHERE condition. For other join types, depends on the join order and facts about the joined collections. **Uses memory:** ✅ Uses memory proportional to the input sizes, unless the inputs have appropriate indexes. Certain joins with more than 2 inputs use additional memory, see details in the optimized plan. | InnerJoin (#0 = #2) | | **Reduce** | Groups the input rows by some scalar expressions, reduces each group using some aggregate functions, and produces rows containing the group key and aggregate outputs. In the case where the group key is empty and the input is empty, returns a single row with the aggregate functions applied to the empty input collection. **Can increase data size:** No **Uses memory:** ✅ SUM, COUNT, and most other aggregations use a moderate amount of memory (proportional either to twice the output size or to input size + output size). MIN and MAX aggregates can use significantly more memory. This can be improved by including group size hints in the query, see mz_introspection.mz_expected_group_size_advice. | Reduce group_by=[#0] aggregates=[max((#0 * #1))] | | **Distinct** | Removes duplicate copies of input rows. **Can increase data size:** No **Uses memory:** ✅ Uses memory proportional to twice the output size. | Distinct | | **TopK** | Groups the input rows by some scalar expressions, sorts each group using the group key, removes the top offset rows in each group, and returns the next limit rows. **Can increase data size:** No **Uses memory:** ✅ Can use significant amount as the operator can significantly overestimate the group sizes. Consult mz_introspection.mz_expected_group_size_advice. | TopK order_by=[#1 asc nulls_last, #0 desc nulls_first] limit=5 | | **Negate** | Negates the row counts of the input. This is usually used in combination with union to remove rows from the other union input. **Can increase data size:** No **Uses memory:** No | Negate | | **Threshold** | Removes any rows with negative counts. **Can increase data size:** No **Uses memory:** ✅ Uses memory proportional to the input and output size, twice. | Threshold | | **Union** | Sums the counts of each row of all inputs. (Corresponds to UNION ALL rather than UNION/UNION DISTINCT.) **Can increase data size:** No **Uses memory:** ✅ Moderate use of memory. Some union operators force consolidation, which results in a memory spike, largely at hydration time. | Union | | **With ... Return ...** | Introduces CTEs, i.e., makes it possible for sub-plans to be consumed multiple times by downstream operators. **Can increase data size:** No **Uses memory:** No | See Reading plans | **Notes:** - **Can increase data size:** Specifies whether the operator can increase the data size (can be the number of rows or the number of columns). - **Uses memory:** Specifies whether the operator use memory to maintain state for its inputs. Operators are sometimes marked as `Fused ...`. This indicates that the operator is fused with its input, i.e., the operator below it. That is, if you see a `Fused X` operator above a `Y` operator: ``` →Fused X →Y ``` Then the `X` and `Y` operators will be combined into a single, more efficient operator. See also: - [`EXPLAIN PLAn`](/sql/explain-plan/)