Self-Managed Deployments
Overview
Self-Managed Materialize deployments on Kubernetes consist of several layers of components that work together to provide a fully functional database environment. Understanding these components and how they interact is essential for deploying, managing, and troubleshooting your Self-Managed Materialize deployment.
This page provides an overview of the core architectural components in a Self-Managed deployment, from the infrastructure level (Helm chart) down to the application level (clusters and replicas).
Architecture layers
A Self-Managed Materialize deployment is organized into the following layers:
| Layer | Component | Description |
|---|---|---|
| Infrastructure | Helm Chart | Package manager component that bootstraps the Kubernetes deployment |
| Orchestration | Materialize Operator | Kubernetes operator that manages Materialize instances |
| Database | Materialize Instance | The Materialize database instance itself |
| Compute | Clusters and Replicas | Isolated compute resources for workloads |
Helm chart
The Helm chart is the entry point for deploying Materialize in a self-managed Kubernetes environment. It serves as a package manager component that defines and deploys the Materialize Operator.
Working with the Helm chart
You interact with the Helm chart through standard Helm commands. For example:

- To add the Materialize Helm chart repository:

  helm repo add materialize https://materializeinc.github.io/materialize

- To update the repository index:

  helm repo update materialize

- To install the Materialize Helm chart and deploy the Materialize Operator and other resources:

  helm install materialize materialize/materialize-operator

- To upgrade the Materialize Helm chart (and the Materialize Operator and other resources):

  helm upgrade materialize materialize/materialize-operator

- To uninstall the Helm chart (and the Materialize Operator and other resources):

  helm uninstall materialize
What gets installed
When you install the Materialize Helm chart (for example, with `helm install materialize materialize/materialize-operator`), it:
- Deploys the Materialize Operator as a Kubernetes deployment.
- Creates necessary cluster-wide resources (CRDs, RBAC roles, service accounts).
- Configures operator settings and permissions.
Once installed, the Materialize Operator handles the deployment and management of Materialize instances.
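After installation, you can confirm these pieces with `kubectl`. This is a minimal sketch: the `materialize` namespace is an assumption, so substitute whatever namespace you installed the chart into.

```shell
# The operator should be running as a Deployment (the "materialize"
# namespace is an assumption; use the one you installed into):
kubectl get deployments -n materialize

# The chart also registers cluster-wide resources such as the Materialize CRD:
kubectl get crds | grep -i materialize
```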
Materialize Operator
The Materialize Operator (implemented as orchestratord) is a Kubernetes operator that automates the deployment and lifecycle management of Materialize instances. It implements the Kubernetes operator pattern to extend Kubernetes with domain-specific knowledge about Materialize.
Managed resources
The operator watches for Materialize custom resources and creates/manages all the Kubernetes resources required to run a Materialize instance, including:
- Namespaces: Isolated Kubernetes namespaces for each instance
- Services: Network services for connecting to Materialize
- Network Policies: Network isolation and security rules
- Certificates: TLS certificates for secure connections
- ConfigMaps and Secrets: Configuration and sensitive data
- Deployments: These support the `balancerd` and `console` pods used as the ingress layer for Materialize.
- StatefulSets: These support `environmentd` and `clusterd`, which are the database control plane and compute resources, respectively.
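To see these managed resources for a running instance, you can list them with `kubectl`. A sketch, assuming the `materialize-environment` namespace used by the example Materialize custom resource on this page:

```shell
# Resources the operator creates and manages for a Materialize instance
# (the namespace is an assumption; substitute your own):
kubectl get statefulsets,deployments,services -n materialize-environment
kubectl get networkpolicies,configmaps,secrets -n materialize-environment
```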
Configuration
For configuration options for the Materialize Operator, see the Materialize Operator Configuration page.
Materialize Instance
A Materialize instance is the actual database that you connect to and interact with. Each instance is an isolated Materialize deployment with its own data, configuration, and compute resources.
Components
When you create a Materialize instance, the operator deploys three core components as Kubernetes resources:
- `environmentd`: The main database control plane, deployed as a StatefulSet. `environmentd` runs as a Kubernetes pod and is the primary component of a Materialize instance. It houses the control plane and contains:

  - Adapter: The SQL interface that handles client connections, query parsing, and planning
  - Storage Controller: Maintains durable metadata for storage
  - Compute Controller: Orchestrates compute resources and manages system state

  On startup, `environmentd` creates several built-in clusters.

  When you connect to Materialize with a SQL client (e.g., `psql`), you're connecting to `environmentd`.

- `balancerd`: A pgwire and HTTP proxy used to connect to `environmentd`, deployed as a Deployment.

- `console`: The web-based administration interface, deployed as a Deployment.
Instance responsibilities
A Materialize instance manages:
- SQL objects: Sources, views, materialized views, indexes, sinks
- Schemas and databases: Logical organization of objects
- User connections: SQL client connections and authentication
- Catalog metadata: System information about all objects and configuration
- Compute orchestration: Coordination of work across clusters and replicas
Deploying with the operator
To deploy Materialize instances with the operator, create and apply Materialize custom resources (CRs). For a full list of fields available for the Materialize CR, see Materialize CRD Field Descriptions.
apiVersion: materialize.cloud/v1alpha1
kind: Materialize
metadata:
name: 12345678-1234-1234-1234-123456789012
namespace: materialize-environment
spec:
environmentdImageRef: materialize/environmentd:v26.1.1
# ... additional fields omitted for brevity
When you first apply the Materialize custom resource, the operator automatically creates all required Kubernetes resources.
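Applying the resource and watching it come up might look like the following sketch. Here, `materialize.yaml` is a hypothetical file name containing the custom resource above, and the `materializes` resource name is inferred from the `Materialize` kind; adjust both for your environment.

```shell
# Apply the Materialize custom resource (materialize.yaml is a placeholder
# for a file containing the manifest above):
kubectl apply -f materialize.yaml

# Watch the operator create the instance's resources:
kubectl get materializes -n materialize-environment
kubectl get pods -n materialize-environment --watch
```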
Modifying the custom resource
To modify a Materialize instance, update its custom resource with your changes, including a new UUID value for the `requestRollout` field. When you apply the updated custom resource, the operator rolls out the changes. If you do not update the `requestRollout` UUID, the operator watches for updates but does not roll out the changes.
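A rollout can be requested from the command line by patching the custom resource with a fresh UUID. This is a sketch, not the authoritative procedure: the `spec.requestRollout` field path is an assumption inferred from the field name, and the instance name and namespace come from the example custom resource on this page. Check the Materialize CRD Field Descriptions for the authoritative schema.

```shell
# Generate a fresh UUID for the requestRollout field (uuidgen also works
# on most systems):
NEW_UUID=$(python3 -c "import uuid; print(uuid.uuid4())")
echo "$NEW_UUID"

# Then patch it into the custom resource to request a rollout. The
# spec.requestRollout path is an assumption inferred from the field name:
#   kubectl patch materialize 12345678-1234-1234-1234-123456789012 \
#     -n materialize-environment --type merge \
#     -p "{\"spec\": {\"requestRollout\": \"$NEW_UUID\"}}"
```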
Connecting to an instance
Once deployed, you interact with a Materialize instance through the Materialize Console or standard PostgreSQL-compatible tools and drivers:
# Connect with psql
psql "postgres://materialize@<host>:6875/materialize"
Once connected, you can issue SQL commands to create sources, define views, run queries, and manage the database:
-- Create a source
CREATE SOURCE my_source FROM KAFKA ...;
-- Create a materialized view
CREATE MATERIALIZED VIEW my_view AS
SELECT ... FROM my_source ...;
-- Query the view
SELECT * FROM my_view;
Clusters and Replicas
Clusters are isolated pools of compute resources that execute workloads in Materialize. They provide resource isolation and fault tolerance for your data processing pipelines.
For a comprehensive overview of clusters in Materialize, see the Clusters concept page.
Cluster architecture
- Clusters: Logical groupings of compute resources dedicated to specific workloads (sources, sinks, indexes, materialized views, queries)
- Replicas: Physical instantiations of a cluster’s compute resources, deployed as Kubernetes StatefulSets
Each replica contains identical compute resources and processes the same data independently, providing fault tolerance and high availability.
Kubernetes resources
When you create a cluster with one or more replicas in Materialize, the instance coordinates with the operator to create:
- One or more StatefulSet resources (one per replica)
- Pods within each StatefulSet that execute the actual compute workload
- Persistent volumes (if configured) for scratch disk space
For example:
-- Create a cluster with 2 replicas
CREATE CLUSTER my_cluster SIZE = '100cc', REPLICATION FACTOR = 2;
This creates two separate StatefulSets in Kubernetes, each running compute processes.
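You can observe the replica-to-StatefulSet mapping directly in Kubernetes. A sketch, assuming the instance lives in the `materialize-environment` namespace; the exact StatefulSet names are generated by Materialize:

```shell
# Each replica of my_cluster appears as its own StatefulSet (names are
# generated; the namespace is an assumption):
kubectl get statefulsets -n materialize-environment
```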
Managing clusters
You interact with clusters primarily through SQL:
-- Create a cluster
CREATE CLUSTER ingest_cluster SIZE = '50cc', REPLICATION FACTOR = 1;
-- Use the previous cluster for a source
CREATE SOURCE my_source
IN CLUSTER ingest_cluster
FROM KAFKA ...;
-- Create a cluster for materialized views
CREATE CLUSTER compute_cluster SIZE = '100cc', REPLICATION FACTOR = 2;
-- Use the previous cluster for a materialized view
CREATE MATERIALIZED VIEW my_view
IN CLUSTER compute_cluster AS
SELECT ... FROM my_source ...;
-- Resize a cluster
ALTER CLUSTER compute_cluster SET (SIZE = '200cc');
Materialize handles the underlying Kubernetes resource creation and management automatically.
Workflow
The following workflow summarizes how the components work together:

1. Install the Helm chart: This deploys the Materialize Operator to your Kubernetes cluster.

2. Create a Materialize instance: Apply a Materialize custom resource. The operator detects this and creates all necessary Kubernetes resources, including the `environmentd`, `balancerd`, and `console` pods.

3. Connect to the instance: Use the Materialize Console on port 8080 to connect to the `console` service endpoint, or a SQL client on port 6875 to connect to the `balancerd` service endpoint. If authentication is enabled, you must first connect to the Materialize Console and set up users.

4. Create clusters: Issue SQL commands to create clusters. Materialize coordinates with the operator to provision StatefulSets for replicas.

5. Run your workloads: Create sources, materialized views, indexes, and sinks on your clusters.
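Condensed, the workflow above looks roughly like this sketch; `materialize.yaml` and `<host>` are placeholders you supply for your own custom resource file and service endpoint.

```shell
# 1. Install the Helm chart (deploys the Materialize Operator):
helm repo add materialize https://materializeinc.github.io/materialize
helm install materialize materialize/materialize-operator

# 2. Create a Materialize instance (materialize.yaml holds the custom resource):
kubectl apply -f materialize.yaml

# 3. Connect with a SQL client via balancerd on port 6875:
psql "postgres://materialize@<host>:6875/materialize"

# 4-5. Create clusters and run workloads via SQL once connected.
```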
Terraform Modules
To help you get started, Materialize provides Terraform modules.
These modules are intended for evaluation and demonstration purposes and to serve as a template when building your own production deployment. Do not rely on the modules directly for production deployments: future releases of the modules will contain breaking changes. Instead, to use them as a starting point for your own production deployment, either:
- Fork the repo and pin to a specific version; or

- Use the code as a reference when developing your own deployment.
Unified Terraform Modules
Materialize provides unified Terraform modules, which provide concrete examples and an opinionated model for deploying Materialize.
| Module | Description |
|---|---|
| Amazon Web Services (AWS) | An example Terraform module for deploying Materialize on AWS. See Install on AWS for detailed usage instructions. |
| Azure | An example Terraform module for deploying Materialize on Azure. See Install on Azure for detailed usage instructions. |
| Google Cloud Platform (GCP) | An example Terraform module for deploying Materialize on GCP. See Install on GCP for detailed usage instructions. |
Legacy Terraform Modules
| Sample Module | Description |
|---|---|
| terraform-helm-materialize (Legacy) | A sample Terraform module for installing the Materialize Helm chart into a Kubernetes cluster. |
| Materialize on AWS (Legacy) | A sample Terraform module for deploying Materialize on AWS with all required infrastructure components. See Install on AWS (Legacy) for an example usage. |
| Materialize on Azure (Legacy) | A sample Terraform module for deploying Materialize on Azure with all required infrastructure components. See Install on Azure for an example usage. |
| Materialize on GCP (Legacy) | A sample Terraform module for deploying Materialize on Google Cloud Platform (GCP) with all required infrastructure components. See Install on GCP for an example usage. |
Relationship to Materialize concepts
Self-managed deployments implement the same core Materialize concepts as the Cloud offering:
- Clusters: Identical behavior, but backed by Kubernetes StatefulSets
- Sources: Same functionality for ingesting data
- Views: Same query semantics and incremental maintenance
- Indexes: Same in-memory query acceleration
- Sinks: Same data egress capabilities
The Self-Managed deployment model adds the Kubernetes infrastructure layer (Helm chart and operator) but does not change how you interact with Materialize at the SQL level.