Get Started

This guide shows you how to set up a complete Vulcan project on your local machine.

The example project runs locally using a Postgres SQL engine. Vulcan automatically generates all necessary project files and configurations.

To get started, ensure your system meets the prerequisites below, then follow the step-by-step instructions for your operating system.

Prerequisites

Before you begin, make sure you have Docker installed and configured on your system. Follow the instructions below for your operating system.

Mac/Linux

1. Verify Docker Installation

First, check if Docker Desktop (Mac) or Docker Engine (Linux) is installed and running:

docker --version
docker compose version

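If both commands return version numbers, Docker is installed. Make sure Docker Desktop is running (you should see the Docker icon in your menu bar or system tray). If you use Docker Engine on Linux, which has no icon, confirm that the daemon is running by checking that docker info completes without an error.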

2. Install Docker (if needed)

3. Configure Resources

Ensure Docker Desktop has at least 4 GB of RAM allocated. You can adjust this in Docker Desktop under Settings → Resources → Advanced.
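
To see how much memory the Docker daemon actually has, one option is to read the MemTotal field from docker info, which reports bytes. The sketch below converts it to MiB. The function name docker_memory_mib is made up for this example, and the reported value may come out slightly below the figure configured in Docker Desktop.

```shell
# Print the memory available to the Docker daemon, in MiB.
# docker_memory_mib is an illustrative name, not part of Docker or Vulcan.
docker_memory_mib() {
  bytes=$(docker info --format '{{.MemTotal}}') || return 1
  echo $(( bytes / 1024 / 1024 ))
}
```

Calling docker_memory_mib should print a value of roughly 4096 or more when at least 4 GB is allocated.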

Windows

1. Verify Docker Installation

Check if Docker Desktop for Windows is installed and running:

docker --version
docker compose version

If both commands return version numbers, Docker is installed. Make sure Docker Desktop is running (you should see the Docker icon in your system tray).

2. Install Docker (if needed)

If Docker is not installed, download and install Docker Desktop for Windows.

3. Configure Resources

Ensure Docker Desktop has at least 4 GB of RAM allocated. You can adjust this in Docker Desktop under Settings → Resources → Advanced.

Vulcan Setup Locally

Follow these steps to set up Vulcan on your local machine. The setup process will create all necessary infrastructure services and prepare your environment for development.

Download for Mac/Linux

The download includes Docker Compose files, a Makefile, and a comprehensive README.

Step 1: Extract and Navigate

Extract the downloaded zip file and open the vulcan-project folder in VS Code or your preferred IDE:

cd vulcan-project

Step 2: Run Setup

Important: Before running setup, ensure Docker Desktop is running on your machine and that you are logged into RubikLabs.

Execute the setup command:

make up

This command starts the full Vulcan stack in one step:

  • statestore (PostgreSQL): Stores Vulcan's internal state, including model definitions, plan information, and execution history. This database persists your semantic model and plans, and tracks materialization state.

  • minio (Object Storage): Stores query results, artifacts, and other data objects that Vulcan generates. This service provides data retrieval and caching for your workflows.

  • vulcan-transpiler: Transpiler API for converting semantic queries to SQL (available at http://localhost:8100)

  • vulcan-api: REST API server for querying your semantic model (available at http://localhost:8000)

  • vulcan-graphql: GraphQL interface for querying your semantic layer (available at http://localhost:3000)

  • vulcan-mysql (optional): MySQL wire protocol access for BI tool connectivity (available at localhost:3307)

  • MySQL proxy: Proxy for BI tools to connect via MySQL protocol (available at localhost:3306)

Note: The setup process typically takes 1-2 minutes to complete. Apart from vulcan-mysql, which is marked optional above, these services are required for Vulcan's operation.
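
Once make up finishes, a quick way to confirm that the HTTP services are listening is to request each URL and look at the status code. This is a minimal sketch: probe is a hypothetical helper, and it only shows that something answers on the port, since this guide does not document dedicated health endpoints.

```shell
# Report whether anything answers HTTP at the given URL.
# probe is an illustrative helper; curl prints 000 when no HTTP response arrives.
probe() {
  code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$1")
  if [ "$code" = "000" ]; then
    echo "$1 unreachable"
  else
    echo "$1 responded with HTTP $code"
  fi
}
```

For example, probe http://localhost:8000 checks vulcan-api, and the same call works for http://localhost:8100 and http://localhost:3000.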

State Connection Default

By default, you should use Postgres for your state connection. When configuring your config.yaml, set state_connection to use Postgres. This ensures reliable state management and is the recommended approach for most projects.

Verify Services Are Running

Before proceeding, verify that all services are up and running.

Check running containers

Use Docker directly to confirm that all containers are running:

docker ps

You should see containers corresponding to the following services with a status of Up:

  • statestore (PostgreSQL) – State storage service
  • minio – Object storage service
  • vulcan-api – REST API service
  • vulcan-graphql – GraphQL service
  • vulcan-transpiler – Transpiler service

If a container is missing from the list or not in an Up state, it may have stopped or failed to start.
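
The same check can be scripted. The sketch below compares docker ps output against a list of expected names and prints any that are not running. It assumes the container names match the service names used in the docker logs examples in this guide; missing_containers is a made-up helper name.

```shell
# Print each expected container name that is absent from `docker ps`.
# The names passed in are assumptions; adjust them to match your `docker ps` output.
missing_containers() {
  running=$(docker ps --format '{{.Names}}')
  for name in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string
    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
  done
}
```

Running missing_containers statestore minio vulcan-api vulcan-graphql vulcan-transpiler prints nothing when every listed container is up.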

Validate container logs

To inspect logs for any specific service, use:

docker logs <container_name>

For example:

docker logs statestore
docker logs minio
docker logs vulcan-api

Review the logs for errors, crash messages, or failed startup checks.
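
To avoid reading whole logs, one approach is to filter the most recent lines for common failure keywords. The helper name log_errors, the 200-line window, and the keyword list below are arbitrary choices for illustration, not something Vulcan defines.

```shell
# Show recent log lines from a container that mention common failure keywords.
# Usage: log_errors <container_name> [line_count]
log_errors() {
  # docker logs writes to both stdout and stderr, so merge them before filtering
  docker logs --tail "${2:-200}" "$1" 2>&1 | grep -iE 'error|fatal|exception'
}
```

For example, log_errors statestore prints nothing when none of the last 200 lines match.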

Once all containers are running properly and logs look healthy, proceed to the next step.

Step 3: Configure Vulcan CLI Access

Create an alias to access the Vulcan CLI. The alias uses an engine-specific Docker image. Postgres is shown first (recommended for most users). If you're using a different engine, use the matching alias below; the note under each one names its engine:

Automatic Updates

Docker image versions in this section are automatically synchronized with the engine configuration files. When engine image versions are updated, this section is automatically updated as well.

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-postgres:0.228.1.10 vulcan"
Image version from Postgres engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-bigquery:0.228.1.10 vulcan"
Image version from BigQuery engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-databricks:0.228.1.10 vulcan"
Image version from Databricks engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-fabric:0.228.1.6 vulcan"
Image version from Fabric engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-mssql:0.228.1.6 vulcan"
Image version from MSSQL engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-mysql:0.228.1.6 vulcan"
Image version from MySQL engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-redshift:0.228.1.6 vulcan"
Image version from Redshift engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-snowflake:0.228.1.10 vulcan"
Image version from Snowflake engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-spark:0.228.1.6 vulcan"
Image version from Spark engine configuration

alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-trino:0.228.1.6 vulcan"
Image version from Trino engine configuration

Note: This alias is temporary and will be lost when you close your shell session. To make it permanent, add this line to your shell configuration file (~/.bashrc for Bash or ~/.zshrc for Zsh), then restart your terminal or run source ~/.zshrc (or source ~/.bashrc).
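
One way to make the alias permanent without adding it twice is to append it only when the exact line is not already in the file. This is a sketch; persist_alias is a hypothetical helper, and the commented usage line assumes Zsh and the Postgres image.

```shell
# Append a line to a shell startup file unless that exact line is already present.
# persist_alias is an illustrative helper, not part of Vulcan.
persist_alias() {
  rc_file=$1
  alias_line=$2
  grep -qxF -- "$alias_line" "$rc_file" 2>/dev/null || printf '%s\n' "$alias_line" >> "$rc_file"
}

# Example (Zsh, Postgres image):
# persist_alias ~/.zshrc 'alias vulcan="docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-postgres:0.228.1.10 vulcan"'
```

After running it, restart your terminal or source the file as described above.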

Once all services are running, you're ready to create your first project!

Download for Windows

The download includes Docker Compose files, Windows batch scripts, and a comprehensive README.

Step 1: Extract and Navigate

Extract the downloaded zip file and navigate to the vulcan-project directory:

cd vulcan-project

Step 2: Run Setup

Important: Before running setup, ensure Docker Desktop for Windows is running and that you are logged into RubikLabs.

Execute the setup script:

setup.bat

This script creates and starts three essential services:

  • statestore (PostgreSQL): Stores Vulcan's internal state, including model definitions, plan information, and execution history. This database persists your semantic model and plans, and tracks materialization state.

  • minio (Object Storage): Stores query results, artifacts, and other data objects that Vulcan generates. This service provides data retrieval and caching for your workflows.

  • minio-init: Initializes MinIO buckets and policies with the correct configuration. This service runs once to set up the storage infrastructure, so it is not expected to show as Up in docker ps afterward.

Note: These services are essential for Vulcan's operation and must be running before you can use Vulcan. The setup process typically takes 1-2 minutes to complete.

State Connection Default

By default, you should use Postgres for your state connection. When configuring your config.yaml, set state_connection to use Postgres. This ensures reliable state management and is the recommended approach for most projects.

Verify Services Are Running

Before proceeding, verify that all required infrastructure services (engine and storage) are up and running.

Check running containers

Use Docker directly to confirm that all containers are running:

docker ps

You should see containers corresponding to the following services with a status of Up:

  • statestore (PostgreSQL) – State storage service
  • minio – Object storage service
  • warehouse (PostgreSQL) – Data warehouse engine

If a container is missing from the list or not in an Up state, it may have stopped or failed to start.

Validate container logs

To inspect logs for any specific service, use:

docker logs <container_name>

For example:

docker logs statestore
docker logs minio
docker logs warehouse

Review the logs for errors, crash messages, or failed startup checks.

Once all containers are running properly and logs look healthy, proceed to the next step.

Step 3: Configure Vulcan CLI Access

Create a function to access the Vulcan CLI. Open PowerShell and run the following command. Postgres is shown first (recommended for most users). If you're using a different engine, use the matching function below; the note under each one names its engine:

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-postgres:0.228.1.10 vulcan $args
}
Image version from Postgres engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-bigquery:0.228.1.10 vulcan $args
}
Image version from BigQuery engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-databricks:0.228.1.10 vulcan $args
}
Image version from Databricks engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-fabric:0.228.1.6 vulcan $args
}
Image version from Fabric engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-mssql:0.228.1.6 vulcan $args
}
Image version from MSSQL engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-mysql:0.228.1.6 vulcan $args
}
Image version from MySQL engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-redshift:0.228.1.6 vulcan $args
}
Image version from Redshift engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-snowflake:0.228.1.10 vulcan $args
}
Image version from Snowflake engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-spark:0.228.1.6 vulcan $args
}
Image version from Spark engine configuration

function vulcan {
  docker run -it --network=vulcan --rm -v .:/workspace tmdcio/vulcan-trino:0.228.1.6 vulcan $args
}
Image version from Trino engine configuration

Note: This function is temporary and will be lost when you close your PowerShell session. To make it permanent, add it to your PowerShell profile. Run notepad $PROFILE to open your profile file, paste the function, and save.

Step 4: Start API Services

Configure Environment Variables

Before starting the services, open docker\docker-compose.vulcan.yml and replace the following placeholders with your actual values:

Variable | Placeholder | Description
DATAOS_RUN_AS_USER | <your-dataos-username> | Your DataOS user ID
DATAOS_RUN_AS_APIKEY | <your-dataos-api-key> | Your DataOS API key
HEIMDALL_URL | <your-dataos-context> | Your DataOS context URL (e.g., https://my-context.dataos.app/heimdall)

Generate SSL Certificates for MySQL Wire Protocol (Optional)

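If you plan to use the Vulcan MySQL wire protocol service (vulcan-mysql), SSL/TLS certificates are required. Generate them before starting the service. The ^ line continuations below work in Command Prompt; in PowerShell, use a backtick (`) instead or enter the command on one line: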

mkdir docker\ssl
openssl req -x509 -nodes -days 365 -newkey rsa:2048 ^
  -keyout docker\ssl\server.key -out docker\ssl\server.crt ^
  -subj "/CN=vulcan-mysql"

Start the services:

start-vulcan-api.bat

This command starts the following services:

  • vulcan-api: A REST API server for querying your semantic model (available at http://localhost:8000)

  • vulcan-graphql: A GraphQL interface for querying your semantic layer (available at http://localhost:3000)

  • vulcan-mysql (optional): MySQL wire protocol access to Vulcan for BI tool connectivity (available at localhost:3307)

Once these services are running, you're ready to create your first project!

Create Your First Project

Now that your environment is set up, let's create your first Vulcan project. This section walks you through initializing a project, verifying the setup, running your first plan, and querying your data.

Step 1: Initialize Your Project

Initialize a new Vulcan project (learn more about init):

vulcan init

When prompted:

  • Choose DEFAULT as the project type

  • Select Postgres as your SQL engine

This command creates a complete project structure with seven directories:

  • models/ - .sql and .py files for your data models

  • seeds/ - CSV files for static datasets

  • audits/ - Logic that asserts data quality and blocks downstream models if checks fail

  • tests/ - Test files for validating your model logic

  • macros/ - Custom macros for reusable SQL patterns

  • checks/ - Data quality checks

  • semantics/ - Semantic layer definitions (measures, dimensions, etc.)

Configure Your Connection

After initialization, verify your config.yaml has the correct connection values. Replace the connection values (host, port, database, user, password) with values that match your actual database setup. For Docker setups, use the service names (warehouse, statestore) as hostnames. For local or remote databases, use the actual hostname or IP address.

Step 2: Verify Your Setup

Check your project configuration and connection status (learn more about info):

vulcan info

This command displays:

  • Connection status to your database

  • Number of models, macros, and other project components

  • Project configuration details

Important: Verify that the setup is correct before proceeding to run plans. If you see any errors, check the Troubleshooting section below.

Step 3: Create and Apply Your First Plan

Generate a plan for your models (learn more about plan):

vulcan plan

This command performs three key actions:

  1. Validates your models and creates the necessary database objects (tables, views, etc.)
  2. Calculates which data intervals need to be backfilled based on your model's start date and cron schedule
  3. Prompts you to apply the plan

When prompted, enter y to apply the plan and backfill your models with historical data.

Note: The backfill process may take a few minutes depending on the amount of historical data to process.

Step 4: Query Your Models

Execute SQL queries against your models (learn more about fetchdf):

vulcan fetchdf "select * from schema.model_name"

This command executes a SQL query and returns the results as a pandas DataFrame.

Step 5: Query Using Semantic Layer

Use Vulcan's semantic layer to query your data (learn more about transpile):

vulcan transpile --format sql "SELECT MEASURE(measure_name) FROM model"

This command transpiles your semantic query into SQL that can be executed against your data warehouse. The semantic layer provides a business-friendly interface for querying your data models.

Stopping Services

When you're done working with Vulcan, you can stop the services to free up system resources. Use the commands below based on your operating system.

Mac/Linux

Stop All Services

To stop all running services:

make down

Stop and Clean Up (Warning: This deletes all data)

To stop all services and remove volumes (this will delete all data):

make all-clean      # Stop and remove volumes (this will delete all data)

Stop Individual Service Groups

You can also stop specific service groups:

make vulcan-down      # Stop only Vulcan API services
make infra-down       # Stop infrastructure services (statestore, minio)
make transpiler-down  # Stop transpiler services
make proxy-down       # Stop MySQL proxy

Windows

Stop All Services

To stop all running services:

stop-all.bat           # Stop all services

Stop and Clean Up (Warning: This deletes all data)

To stop all services and remove volumes (this will delete all data):

clean.bat              # Stop and remove volumes (this will delete all data)

Stop Individual Services

To stop only the Vulcan API services:

vulcan-down.bat        # Stop only Vulcan API services

Troubleshooting

If you encounter any issues during setup or while using Vulcan, refer to the solutions below.

Common Issues and Solutions

Services Won't Start

If services fail to start, ensure Docker Desktop is running with at least 4GB RAM allocated. You can check and adjust this in Docker Desktop settings:

  • Mac: Docker Desktop → Settings → Resources → Advanced

  • Windows: Docker Desktop → Settings → Resources → Advanced

Invalid Connection Config Error

If you see an error like:

Error: Invalid 'postgres' connection config:
  Field 'host': Input should be a valid string
  Field 'user': Input should be a valid string
  Field 'password': Input should be a valid string
  Field 'port': Input should be a valid integer
  Field 'database': Input should be a valid string

This means your config.yaml file is missing or incomplete. You need to create or update your config.yaml file with proper gateway configuration before running vulcan info or other Vulcan commands.

Solution:

  1. If you haven't initialized your project yet, run vulcan init first. This creates a config.yaml file with the correct structure.

  2. If you already have a project, ensure your config.yaml file includes a gateways section with all required connection fields. Here's a minimal example for Postgres:

gateways:
  default:
    connection:
      type: postgres
      host: warehouse
      port: 5432
      database: warehouse
      user: vulcan
      password: vulcan
    state_connection:
      type: postgres
      host: statestore
      port: 5432
      database: statestore
      user: vulcan
      password: vulcan

default_gateway: default

model_defaults:
  dialect: postgres

Connection Values

Important: Replace the connection values (host, port, database, user, password) with values that match your actual database setup. For Docker setups, use the service names (warehouse, statestore) as hostnames. For local or remote databases, use the actual hostname or IP address.

See the Configuration Overview for detailed information about gateway configuration.

Network Errors

If you encounter network-related errors, ensure the vulcan Docker network exists:

Check if the network exists:

docker network ls | grep vulcan

If it doesn't exist, create it:

docker network create vulcan
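
Because grep vulcan also matches any network whose name merely contains "vulcan", a stricter alternative is to ask Docker about the exact name and create the network only if that lookup fails. A minimal sketch, with ensure_network as an illustrative helper name:

```shell
# Create the named Docker network only when it does not already exist.
# ensure_network is an illustrative helper, not part of Vulcan.
ensure_network() {
  # `docker network inspect` exits non-zero when the network is missing
  docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}
```

Running ensure_network vulcan is safe to repeat, since the create step is skipped once the network exists.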

Port Conflicts

If you see errors about ports already being in use, one of the required ports (5431, 5433, 9000, 9001, or 8000) is likely occupied by another application. The other ports this guide publishes (3000, 3306, 3307, and 8100) can conflict in the same way. You have two options:

  1. Stop the conflicting application using that port
  2. Modify the port mappings in the Docker Compose files (docker/docker-compose.infra.yml and docker/docker-compose.warehouse.yml)
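
To find out which port is occupied before choosing an option, one approach on Mac or Linux is to ask lsof whether anything is listening on each port. This is a sketch under assumptions: ports_in_use is a hypothetical helper, lsof must be installed, and on Linux it may need sudo to see listeners owned by other users.

```shell
# Print each given TCP port that already has a listener on this machine.
# ports_in_use is an illustrative helper, not part of Vulcan.
ports_in_use() {
  for port in "$@"; do
    if lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "$port"
    fi
  done
}
```

For example, ports_in_use 5431 5433 9000 9001 8000 lists only the ports that are taken; running lsof -nP -iTCP:8000 -sTCP:LISTEN directly then shows the owning process.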

Can't Connect to Services

If you're unable to connect to Vulcan services, verify that all required services are running:

docker compose -f docker/docker-compose.infra.yml ps
docker compose -f docker/docker-compose.warehouse.yml ps

All services should show as "Up" or "running". If any service shows as "Exited" or "Stopped", check the logs:

docker compose -f docker/docker-compose.infra.yml logs

Access MinIO Console

You can access the MinIO console to manage your object storage:

  • URL: http://localhost:9001

  • Username: admin

  • Password: password

The MinIO console allows you to browse buckets, upload files, and manage storage policies.

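Permission Denied

If Vulcan fails with a permission-denied error, create a .logs folder in the current project directory manually and make the directory writable:

mkdir -p .logs
chmod -R a+w .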

Next Steps

You've set up Vulcan and created your first project. Here are recommended next steps: