Custom materializations¶
Vulcan comes with a variety of model kinds that handle the most common ways to evaluate and materialize your data transformations. But what if you need something different?
Sometimes, your specific use case doesn't quite fit any of the built-in model kinds. Maybe you need custom logic for how data gets inserted, or you want to implement a materialization strategy that's unique to your workflow. That's where custom materializations come in: they let you write your own Python code to control exactly how your models get materialized.
Advanced Feature
Custom materializations are powerful, but they're also advanced. Before diving in, make sure you've exhausted all other options. If an existing model kind can already solve your problem, that points to a gap in our docs that we want to fix; if a built-in kind is almost what you need, we might be able to enhance it for everyone.
What is a materialization?¶
A materialization is the "how" behind your model execution. When Vulcan runs a model, it needs to figure out how to get that data into your database. The materialization is the set of methods that handle executing your transformation logic and managing the resulting data.
Some materializations are straightforward. For example, a FULL model kind completely replaces the table each time it runs, so its materialization is essentially just CREATE OR REPLACE TABLE [name] AS [your query].
Other materializations are more complex. An INCREMENTAL_BY_TIME_RANGE model needs to figure out which time intervals to process, query only that data, and then merge it into the existing table. That requires more logic.
The materialization logic can also vary by SQL engine. PostgreSQL doesn't support CREATE OR REPLACE TABLE, so FULL models on Postgres use DROP then CREATE instead. Vulcan handles all these engine-specific details for built-in model kinds, but with custom materializations, you're in control.
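To make the engine difference concrete, here is a small sketch of how a full refresh could be expressed per engine. The function name, the dialect strings, and the branching are illustrative assumptions, not Vulcan's internal implementation.

```python
from typing import List


def full_refresh_statements(dialect: str, table: str, query: str) -> List[str]:
    """Return the SQL statements for a full refresh of `table` from `query`.

    Illustrative only: the dialect names and branching are assumptions,
    not Vulcan's internal logic.
    """
    if dialect == "postgres":
        # PostgreSQL has no CREATE OR REPLACE TABLE, so drop and recreate.
        return [
            f"DROP TABLE IF EXISTS {table}",
            f"CREATE TABLE {table} AS {query}",
        ]
    # Engines that support it can replace the table in a single statement.
    return [f"CREATE OR REPLACE TABLE {table} AS {query}"]


for statement in full_refresh_statements("postgres", "analytics.orders", "SELECT 1 AS id"):
    print(statement)
```

With a custom materialization, this kind of engine-specific branching becomes your responsibility.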
How custom materializations work¶
Custom materializations are like creating your own model kind. You define them in Python, give them a name, and then reference that name in your model's MODEL block. They can accept configuration arguments that you pass in from your model definition.
Here's what every custom materialization needs:
- Python code: Written as a Python class
- Base class: Must inherit from Vulcan's CustomMaterialization class
- Insert method: At minimum, you need to implement the insert method
- Auto-loading: Vulcan automatically discovers materializations in your materializations/ directory
You can also:
- Override other methods from the MaterializableStrategy or EngineAdapter classes
- Execute arbitrary SQL using the engine adapter
- Perform Python processing with Pandas or other libraries (though for most cases, you'd want that logic in a Python model instead)
Vulcan will automatically load any Python files in your project's materializations/ directory. Or, if you prefer, you can package your materialization as a Python package and install it like any other dependency.
Creating a custom materialization¶
To create a custom materialization, just add a .py file to your project's materializations/ folder. Vulcan will automatically import all Python modules in this folder when your project loads, so your materializations will be ready to use.
Your materialization class needs to inherit from CustomMaterialization and implement at least the insert method. Let's look at some examples to see how this works.
Simple example¶
Here's a complete example that shows custom insert logic with some helpful logging:
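A minimal, self-contained sketch of such a materialization follows. The CustomMaterialization base class and the adapter here are stand-ins so the snippet runs without Vulcan installed; in a real project you would import the base class from Vulcan instead. The adapter method replace_query is an assumption about the engine adapter's interface, so check it against the adapter you actually use.

```python
import logging
import typing as t

logger = logging.getLogger(__name__)


class RecordingAdapter:
    """Stand-in for the engine adapter: records calls instead of running SQL."""

    def __init__(self) -> None:
        self.calls: t.List[t.Tuple[str, str, t.Any]] = []

    def replace_query(self, table_name: str, query_or_df: t.Any) -> None:
        # ASSUMPTION: the real adapter exposes a method that overwrites a table.
        self.calls.append(("replace_query", table_name, query_or_df))


class CustomMaterialization:
    """Stand-in for Vulcan's base class so this snippet runs on its own."""

    NAME: str = ""

    def __init__(self, adapter: t.Any) -> None:
        self.adapter = adapter


class SimpleCustomMaterialization(CustomMaterialization):
    NAME = "simple_custom"  # the name a model uses to select this materialization

    def insert(
        self,
        table_name: str,
        query_or_df: t.Any,
        model: t.Any,
        is_first_insert: bool,
        render_kwargs: t.Dict[str, t.Any],
        **kwargs: t.Any,
    ) -> None:
        # Log which situation we are in before touching the table.
        if is_first_insert:
            logger.info("First insert into %s", table_name)
        else:
            logger.info("Replacing contents of %s", table_name)
        # Overwrite the target table with the result of the model query.
        self.adapter.replace_query(table_name, query_or_df)


adapter = RecordingAdapter()
SimpleCustomMaterialization(adapter).insert(
    "analytics.orders", "SELECT 1 AS id", model=None, is_first_insert=True, render_kwargs={}
)
print(adapter.calls)
```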
Let's break down what's happening here:
| Component | What It Does |
|---|---|
| NAME | The identifier you'll use in your model definition (like simple_custom) |
| table_name | The target table where your data will be inserted |
| query_or_df | Either a SQL query string or a DataFrame (works with Pandas, PySpark, Snowpark) |
| model | The full model definition object, gives you access to all model properties |
| is_first_insert | True if this is the first time inserting data for this model version |
| render_kwargs | Dictionary of arguments used to render the model query |
| self.adapter | The engine adapter, your interface to execute SQL and interact with the database |
Minimal example¶
If you just want a simple full-refresh materialization, here's the minimal version:
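As a sketch, the whole materialization can be a single insert method that delegates to the adapter. The two small classes at the top are stand-ins that keep the snippet runnable outside a Vulcan project, and replace_query is an assumed adapter method rather than a confirmed part of Vulcan's API.

```python
import typing as t


class RecordingAdapter:
    """Stand-in adapter that records calls instead of running SQL."""

    def __init__(self) -> None:
        self.calls: t.List[t.Tuple[str, str, t.Any]] = []

    def replace_query(self, table_name: str, query_or_df: t.Any) -> None:
        # ASSUMPTION: the real adapter offers an overwrite-style method.
        self.calls.append(("replace_query", table_name, query_or_df))


class CustomMaterialization:
    """Stand-in for Vulcan's base class."""

    NAME: str = ""

    def __init__(self, adapter: t.Any) -> None:
        self.adapter = adapter


class MinimalFullRefresh(CustomMaterialization):
    NAME = "minimal_full_refresh"  # hypothetical name

    def insert(self, table_name, query_or_df, model, is_first_insert, render_kwargs, **kwargs):
        # Replace the whole table on every run, regardless of is_first_insert.
        self.adapter.replace_query(table_name, query_or_df)
```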
That's it! This will completely replace the table contents each time the model runs, just like a FULL model kind.
Controlling table creation and deletion¶
You can also customize how tables and views are created and deleted by overriding the create and delete methods:
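The sketch below shows the general shape of such overrides. The parameter lists for create and delete are assumptions (extra arguments are absorbed by **kwargs), the columns attribute on the model is hypothetical, and the base class and adapter are stand-ins, so compare against Vulcan's actual method signatures before relying on it.

```python
import typing as t
from types import SimpleNamespace


class RecordingAdapter:
    """Stand-in adapter that records SQL instead of executing it."""

    def __init__(self) -> None:
        self.statements: t.List[str] = []

    def execute(self, sql: str) -> None:
        self.statements.append(sql)


class CustomMaterialization:
    """Stand-in for Vulcan's base class."""

    NAME: str = ""

    def __init__(self, adapter: t.Any) -> None:
        self.adapter = adapter


class ManagedTableMaterialization(CustomMaterialization):
    NAME = "managed_table"  # hypothetical name

    # insert(...) is omitted for brevity; a real materialization still needs it.

    def create(self, table_name: str, model: t.Any, **kwargs: t.Any) -> None:
        # ASSUMPTION: `model.columns` maps column names to SQL types.
        columns = ", ".join(f"{name} {dtype}" for name, dtype in model.columns.items())
        self.adapter.execute(f"CREATE TABLE IF NOT EXISTS {table_name} ({columns})")

    def delete(self, name: str, **kwargs: t.Any) -> None:
        self.adapter.execute(f"DROP TABLE IF EXISTS {name}")


adapter = RecordingAdapter()
materialization = ManagedTableMaterialization(adapter)
materialization.create("analytics.orders", SimpleNamespace(columns={"id": "INT", "region": "TEXT"}))
materialization.delete("analytics.orders")
print(adapter.statements)
```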
This gives you full control over the lifecycle of your data objects.
Using a custom materialization¶
Once you've created your materialization, using it is straightforward. In your model definition, set the kind to CUSTOM and specify the materialization name (the NAME from your Python class).
Passing properties to the materialization¶
You can pass configuration to your materialization using materialization_properties in your model definition. This is useful when you want to customize behavior per model.
Then access these properties in your materialization code via model.custom_materialization_properties:
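For instance, a materialization could switch between appending and replacing based on a property. In this sketch the write_mode property is hypothetical, the model is a stand-in object exposing only custom_materialization_properties (assumed to behave like a dictionary), and both adapter methods are assumed names.

```python
import typing as t
from types import SimpleNamespace


class RecordingAdapter:
    """Stand-in adapter that records which write method was used."""

    def __init__(self) -> None:
        self.calls: t.List[t.Tuple[str, str]] = []

    def replace_query(self, table_name: str, query_or_df: t.Any) -> None:
        self.calls.append(("replace_query", table_name))

    def insert_append(self, table_name: str, query_or_df: t.Any) -> None:
        self.calls.append(("insert_append", table_name))


class CustomMaterialization:
    """Stand-in for Vulcan's base class."""

    NAME: str = ""

    def __init__(self, adapter: t.Any) -> None:
        self.adapter = adapter


class ConfigurableMaterialization(CustomMaterialization):
    NAME = "configurable_custom"  # hypothetical name

    def insert(self, table_name, query_or_df, model, is_first_insert, render_kwargs, **kwargs):
        props = model.custom_materialization_properties
        # `write_mode` is a hypothetical property; default to a full replace.
        if props.get("write_mode", "replace") == "append":
            self.adapter.insert_append(table_name, query_or_df)
        else:
            self.adapter.replace_query(table_name, query_or_df)


adapter = RecordingAdapter()
model = SimpleNamespace(custom_materialization_properties={"write_mode": "append"})
ConfigurableMaterialization(adapter).insert("analytics.events", "SELECT 1", model, False, {})
print(adapter.calls)
```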
This lets you create flexible materializations that can adapt to different use cases.
Extending CustomKind¶
Warning
This is advanced territory. You're working with Vulcan's internals here, so there's extra complexity involved. If the basic custom materialization approach works for you, stick with that. Only dive into this if you really need the extra control.
Most of the time, the standard custom materialization approach is all you need. But sometimes you want tighter integration with Vulcan's internals. Maybe you need to validate custom properties before any database connections are made, or you want to leverage functionality that depends on specific properties being present.
In those cases, you can create a subclass of CustomKind. When your project loads, Vulcan detects your subclass and uses it in place of the standard CustomKind.
Creating a custom kind¶
Here's how you'd create a custom kind that validates a primary_key property:
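The idea can be sketched with plain dataclasses. Everything here is a stand-in: the real CustomKind likely uses a different validation hook than __post_init__, and ConfigError is a placeholder exception type. What carries over is the pattern of validating primary_key when the kind object is constructed and exposing it as a property.

```python
import typing as t
from dataclasses import dataclass, field


class ConfigError(ValueError):
    """Placeholder for a configuration error type."""


@dataclass
class CustomKind:
    """Stand-in for Vulcan's CustomKind, holding only what this sketch needs."""

    materialization: str
    materialization_properties: t.Dict[str, t.Any] = field(default_factory=dict)


@dataclass
class PrimaryKeyKind(CustomKind):
    def __post_init__(self) -> None:
        # Validate as soon as the kind is built, before any evaluation happens.
        raw = self.materialization_properties.get("primary_key")
        if not raw:
            raise ConfigError("primary_key must be specified")
        # Accept a single column name or a list of column names.
        self._primary_key = [raw] if isinstance(raw, str) else list(raw)

    @property
    def primary_key(self) -> t.List[str]:
        return self._primary_key


kind = PrimaryKeyKind("pk_custom", {"primary_key": ["id", "region"]})
print(kind.primary_key)
```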
Using the custom kind in a model¶
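In your model, keep the kind set to CUSTOM and supply primary_key through materialization_properties, just as you would for any other custom property.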
Linking to your materialization¶
To connect your custom kind to your materialization, specify it as a generic type parameter:
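A sketch of the declaration, together with one way such a parameter can be read back using the standard typing module. The classes are stand-ins, and the declared_kind helper is only an illustration of generic-parameter inspection; it is not Vulcan's loader code.

```python
import typing as t


class CustomKind:
    """Stand-in for Vulcan's CustomKind."""


K = t.TypeVar("K", bound=CustomKind)


class CustomMaterialization(t.Generic[K]):
    """Stand-in for Vulcan's generic base class."""

    NAME: str = ""


class PrimaryKeyKind(CustomKind):
    """Hypothetical custom kind."""


class PrimaryKeyMaterialization(CustomMaterialization[PrimaryKeyKind]):
    NAME = "pk_custom"  # hypothetical name


def declared_kind(materialization_cls: type) -> t.Type[CustomKind]:
    """Return the CustomKind subclass named as a generic parameter, if any."""
    for base in getattr(materialization_cls, "__orig_bases__", ()):
        for arg in t.get_args(base):
            # Skip unbound TypeVars; only concrete CustomKind subclasses count.
            if isinstance(arg, type) and issubclass(arg, CustomKind):
                return arg
    return CustomKind  # fall back to the default kind


print(declared_kind(PrimaryKeyMaterialization).__name__)
```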
When Vulcan loads your materialization, it inspects the type signature for generic parameters that are subclasses of CustomKind. If it finds one, it uses your subclass when building model.kind instead of the default.
Why would you want this? Two main benefits:
- Early validation: Your primary_key validation happens at load time, not evaluation time. Issues get caught before you even create a plan.
- Type safety: model.kind resolves to your custom kind object, so you get access to extra properties without additional validation.
Sharing custom materializations¶
Once you've built a custom materialization, you'll probably want to use it across multiple projects. You have a couple of options.
Copying files¶
The simplest approach is to copy the materialization code into each project's materializations/ directory. It works, but it's not the most maintainable approach: you'll need to manually update each copy when you make changes.
If you go this route, we strongly recommend keeping the materialization code in version control and setting up a reliable way to notify users when updates are available.
Python packaging¶
A more robust approach is to package your materialization as a Python package. This is especially useful if you're using Airflow or other external schedulers where the scheduler cluster doesn't have direct access to your project's materializations/ folder.
Package your materialization using setuptools entrypoints:
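A setup.py along these lines would register one materialization. The entry point group name below is an assumption and must be replaced with the group Vulcan actually scans; the package, module, and class names are hypothetical.

```python
# setup.py (sketch)

# ASSUMPTION: placeholder group name; use the group Vulcan actually reads.
ENTRY_POINT_GROUP = "vulcan.materializations"

ENTRY_POINTS = {
    ENTRY_POINT_GROUP: [
        # "<entry point name> = <module path>:<class name>", all hypothetical.
        "simple_custom = my_package.materializations:SimpleCustomMaterialization",
    ],
}


def build_setup_kwargs() -> dict:
    """Collect the arguments that would be passed to setuptools.setup()."""
    return {
        "name": "my-vulcan-materializations",
        "version": "0.1.0",
        "packages": ["my_package"],
        "entry_points": ENTRY_POINTS,
    }


# In a real setup.py you would finish with:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

The same entry point table can also be declared in pyproject.toml if you prefer declarative packaging metadata.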
Once the package is installed, Vulcan automatically discovers and loads your materialization from the entrypoint list. No manual configuration needed!