Isolation levels

View as Markdown

An isolation level determines which effects of concurrent transactions are visible to a transaction during its execution.

Supported isolation levels

Materialize accepts the following isolation levels:

Isolation level Behavior in Materialize
Strict Serializable Default. Provides serializability and linearizability.
Serializable Provides serializability but not linearizability.
Read Uncommitted, Read Committed, Repeatable Read Accepted for compatibility; treated as Serializable.

Serializable

Serializable prevents the following three phenomena1:

Phenomenon Description
P1 (Dirty read) A transaction T1 modifies a row; another transaction T2 reads the row before T1 commits. If T1 rolls back, T2 has read a row that was never committed.
P2 (Non-repeatable read) A transaction T1 reads a row; another transaction T2 modifies or deletes that row and commits. If T1 attempts to reread the row, it may see the modified value or discover that the row no longer exists.
P3 (Phantom) A transaction T1 reads a set of rows that match a specific search condition; another transaction T2 inserts rows that also match the condition and commits. If T1 repeats the read with the same search condition, it gets a different set of rows.

Serializable also guarantees that the result of concurrently executing transactions is equivalent to some serial execution of those transactions. A serial execution is one in which each transaction completes before the next one begins. However, Serializable does not guarantee linearizability; that is, it does not guarantee that the serial order matches the real-time order of the transactions. For example, if transaction T1 completes before transaction T2 begins in real time, the result may be equivalent to a serial execution in which T2 executes before T1.

Non-linearizable orderings are more likely to occur when querying indexes and materialized views with large propagation delays. For example, suppose transaction T1 queries table t and is followed in real time by transaction T2, which queries a computationally expensive materialized view mv defined over t. If the two transactions execute sufficiently close together, mv may not yet reflect the latest updates to t that T1 observed, so T2 may not observe all rows visible to T1.

Logical timestamp selection

When using the serializable isolation level, the logical timestamp may be arbitrarily ahead of or behind the system clock. For example, at a wall clock time of 9pm, Materialize may choose to execute a serializable query as of logical time 8:30pm, perhaps because data for 8:30–9pm has not yet arrived. In this scenario, now() would return 9pm, while mz_now() would return 8:30pm.

Strict Serializable

Strict Serializable provides all the guarantees of Serializable isolation and additionally guarantees linearizability. With linearizability, the serial order matches the real-time order of the transactions. For example, if transaction T1 completes before transaction T2 begins in real time, the result is equivalent to a serial execution in which T1 executes before T2.

More concretely, suppose transaction T1 queries table t and is followed in real time by transaction T2, which queries a computationally expensive materialized view mv defined over t. Under Strict Serializable, T2 is guaranteed to observe all rows visible to T1.

The linearizable guarantee applies only to transactions (including single-statement SQL queries, which are implicitly single-statement transactions), not to data written while ingesting from upstream sources.

  • If a piece of data has been fully ingested from an upstream source, it is not guaranteed to appear in the next read transaction. See real-time recency for more details.

  • However, once that data is included in the results of a read transaction, all subsequent read transactions are guaranteed to see it.

Logical timestamp selection

When using the strict serializable isolation level, Materialize attempts to keep the logical timestamp reasonably close to wall clock time. In most cases, the logical timestamp of a query will be within a few seconds of the wall clock time. For example, when executing a strict serializable query at a wall clock time of 9pm, Materialize will choose a logical timestamp within a few seconds of 9pm, even if data for 8:30–9pm has not yet arrived and the query will need to block until the data for 9pm arrives. In this scenario, both now() and mz_now() would return 9pm.

Real-time recency

PREVIEW This feature is in private preview. It is under active development and may have stability or performance issues. It isn't subject to our backwards compatibility guarantees.
To enable this feature in your Materialize region, contact our team.

Materialize offers a form of “end-to-end linearizability” known as real-time recency. When using real-time recency, all client-issued SELECT statements include at least all data visible to Materialize in any external source (i.e., sources created with CREATE SOURCE that use CONNECTIONs, such as Kafka, MySQL, and PostgreSQL sources) after Materialize receives the query. This is what we mean by linearizable––the results are guaranteed to contain all visible data according to physical time.

For example, real-time recency ensures that if you have just performed an INSERT into a PostgreSQL database that Materialize ingests as a source, all of your real-time recency queries will include the just-written data in their results.

Note that real-time recency only guarantees that the results will contain at least the data visible to Materialize when we receive the query. We cannot guarantee that the results will contain only the data visible when Materialize receives the query. For instance, the rate at which Materialize ingests data might include additional data made visible after the timestamp we determined to be “real-time recent.” Another example is that, due to network latency, the timestamp from the external system that we determine to be “real-time recent” might be later (i.e. include more data) than you expected.

Because Materialize waits until it ingests the data from the external system, real-time recency queries can have additional latency. This latency is introduced by both the time it takes us to ingest and commit data from the source and the time spent communicating with it to determine what data it has made available to us (e.g., querying PostgreSQL for the replication slot’s LSN).

Details

  • Real-time recency is only available with sessions running at the strict serializable isolation level.
  • Enable this feature per session using SET real_time_recency = true.
  • Control the timeout for connecting to the external source to determine the timestamp with the real_time_recency_timeout session variable.
  • Real-time recency queries only guarantee visibility of data from external systems (e.g., sources like Kafka, MySQL, and PostgreSQL). Real-time recency queries do not offer any form of guarantee when querying Materialize-local objects, such as LOAD GENERATOR sources or system tables.
  • Each real-time recency query connects to each external source transitively referenced in the query. The more external sources that are referenced, the greater the likelihood of latency caused by the network or ingestion rates.
  • Materialize doesn’t currently offer a mechanism to provide a “lower bound” on the data we consider to be “real-time recent” in an external source. Real-time recency queries return at least all data visible to Materialize when our client connection communicates with the external system.

Isolation levels and query latency

Strict Serializable provides stronger consistency guarantees but may have slower reads than Serializable.

  • Strict Serializable (the default) may need to wait for recent writes to propagate through materialized views and indexes before serving a read, so that the read reflects the real-time order of transactions.

    • Real-time recency (available only with Strict Serializable) introduces additional latency, since Materialize waits to determine and ingest the latest data available in upstream sources before serving the query.
  • Serializable does not wait for writes to propagate. It reads a consistent (but possibly slightly stale) snapshot, which avoids that latency at the cost of linearizability. However, if a consistent snapshot is not available, the query blocks until one becomes available.

Setting isolation level

💡 Tip: Materialize recommends starting with the default Strict Serializable isolation level.

You can set the isolation level using the session-level configuration parameter TRANSACTION_ISOLATION; for example:

SET TRANSACTION_ISOLATION TO 'STRICT SERIALIZABLE';

You can also set the isolation level for an explicit transaction block as part of the BEGIN statement; for example:

BEGIN ISOLATION LEVEL STRICT SERIALIZABLE;
--- ...
--- ...
--- ...
COMMIT;

Learn more

Check out:


  1. Phenomenon descriptions adapted from ISO/IEC 9075-2:1999 (E), §4.32 “SQL-transactions.” ↩︎

Back to top ↑