CREATE SOURCE: SQL Server

PREVIEW This feature is in private preview. It is under active development and may have stability or performance issues. It isn't subject to our backwards compatibility guarantees.
To enable this feature in your Materialize region, contact our team.
Disambiguation
This page reflects the new syntax which allows Materialize to handle upstream DDL changes, specifically adding or dropping columns, without downtime. For the deprecated syntax, see the old reference page.

Prerequisites

CREATE SOURCE connects Materialize to an external system you want to read data from, and provides details about how to decode and interpret that data.

Materialize supports SQL Server (2016+) as a real-time data source. To connect to a SQL Server database, you first need to tweak its configuration to enable Change Data Capture and SNAPSHOT transaction isolation for the database that you would like to replicate. Then create a connection in Materialize that specifies access and authentication parameters.

Syntax

CREATE SOURCE [IF NOT EXISTS] <src_name>
[IN CLUSTER <cluster_name>]
FROM SQL SERVER CONNECTION <connection_name>
Syntax element Description
<src_name> The name for the source.
IF NOT EXISTS Optional. If specified, do not throw an error if a source with the same name already exists. Instead, issue a notice and skip the source creation.
IN CLUSTER <cluster_name> Optional. The cluster to maintain this source.
CONNECTION <connection_name> The name of the SQL Server connection to use in the source. For details on creating connections, check the CREATE CONNECTION documentation page.

Ingesting data

After a source is created, you can create tables from the source upstream SQL Server database that have Change Data Capture enabled. You can create multiple tables that reference the same table in the source.

See CREATE TABLE FROM SOURCE for details.

Handling table schema changes

The use of the CREATE SOURCE with the new CREATE TABLE FROM SOURCE allows for the handling of certain upstream DDL changes without downtime.

See Guide: Handle upstream schema changes with zero downtime for details.

Supported types

With the new syntax, after a SQL Server source is created, you CREATE TABLE FROM SOURCE to create a corresponding table in Matererialize and start ingesting data.

Materialize natively supports the following SQL Server types:

  • tinyint
  • smallint
  • int
  • bigint
  • real
  • double precision
  • float
  • bit
  • decimal
  • numeric
  • money
  • smallmoney
  • char
  • nchar
  • varchar
  • varchar(max)
  • nvarchar
  • nvarchar(max)
  • sysname
  • binary
  • varbinary
  • json
  • date
  • time
  • smalldatetime
  • datetime
  • datetime2
  • datetimeoffset
  • uniqueidentifier

For more information, including strategies for handling unsupported types, see CREATE TABLE FROM SOURCE.

Monitoring source progress

By default, SQL Server sources expose progress metadata as a subsource that you can use to monitor source ingestion progress. The name of the progress subsource can be specified when creating a source using the EXPOSE PROGRESS AS clause; otherwise, it will be named <src_name>_progress.

The following metadata is available for each source as a progress subsource:

Field Type Details
lsn bytea The upper-bound Log Sequence Number replicated thus far into Materialize.

And can be queried using:

SELECT lsn
FROM <src_name>_progress;

The reported lsn should increase as Materialize consumes new CDC events from the upstream SQL Server database. For more details on monitoring source ingestion progress and debugging related issues, see Troubleshooting.

Example

! Important: Before creating a SQL Server source, you must enable Change Data Capture and SNAPSHOT transaction isolation in the upstream database.

Creating a source

Prerequisite: Creating a connection to SQL Server

First, you must create a connection to your SQL Server database. A connection describes how to connect and authenticate to an external system you want Materialize to read data from.

Once created, a connection is reusable across multiple CREATE SOURCE statements. For more details on creating connections, check the CREATE CONNECTION documentation page.

CREATE SECRET sqlserver_pass AS '<SQL_SERVER_PASSWORD>';

CREATE CONNECTION sqlserver_connection TO SQL SERVER (
    HOST 'instance.foo000.us-west-1.rds.amazonaws.com',
    PORT 1433,
    USER 'materialize',
    PASSWORD SECRET sqlserver_pass,
    DATABASE '<DATABASE_NAME>'
);

If your SQL Server instance is not exposed to the public internet, you can tunnel the connection through and SSH bastion host.

CREATE CONNECTION ssh_connection TO SSH TUNNEL (
    HOST 'bastion-host',
    PORT 22,
    USER 'materialize',
    DATABASE '<DATABASE_NAME>'
);
CREATE CONNECTION sqlserver_connection TO SQL SERVER (
    HOST 'instance.foo000.us-west-1.rds.amazonaws.com',
    SSH TUNNEL ssh_connection,
    DATABASE '<DATABASE_NAME>'
);

For step-by-step instructions on creating SSH tunnel connections and configuring an SSH bastion server to accept connections from Materialize, check this guide.

Creating the source in Materialize

You must enable Change Data Capture, see Enable Change Data Capture SQL Server Instructions.

Once CDC is enabled for all of the tables you wish to create subsources for, you can create a SOURCE in Materialize to begin replicating data!

Create source from the connection we just created

CREATE SOURCE mz_source
    FROM SQL SERVER CONNECTION sqlserver_connection;

After a source is created, you can create a table from the source, referencing specific table(s).

Creates a table in Materialize from the upstream table dbo.items

CREATE TABLE items FROM SOURCE mz_source(REFERENCE dbo.items);
Back to top ↑