About
What is a data product?
A data product is context, packaged as a first-class asset. It bundles schema, semantics, ownership, lineage, quality, and freshness into one governed unit. You build it once. Dashboards, applications, and AI agents all consume the same definition, the same number, the same trust.
The alternative is what most teams have today: raw tables plus tribal knowledge. That works for an analyst who knows which revenue table is canonical. It breaks the moment an AI agent answers the same question, picks a stale table, and is confidently wrong.
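To make the bundle concrete, here is a minimal sketch of a data product as one typed object, with a freshness check a consumer could run before trusting the number. This is not Vulcan's API: the `DataProduct` class, its field names, and the `daily_revenue` example are all hypothetical, chosen only to mirror the six pieces of context listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical container for the context a data product carries."""

    name: str
    owner: str                      # ownership
    schema: dict[str, str]          # column name -> type
    semantics: dict[str, str]       # column name -> business meaning
    upstream: tuple[str, ...]       # lineage: sources this product reads from
    quality_rules: tuple[str, ...]  # quality: rules the data must satisfy
    max_staleness: timedelta        # freshness: how old the data may be

    def is_fresh(self, last_updated: datetime, now: datetime) -> bool:
        """True when the last refresh is within the freshness bound."""
        return now - last_updated <= self.max_staleness


# Illustrative values only; none of these names come from a real project.
revenue = DataProduct(
    name="daily_revenue",
    owner="finance-data@example.com",
    schema={"order_date": "date", "revenue_usd": "decimal(18,2)"},
    semantics={"revenue_usd": "Net revenue after refunds, in US dollars"},
    upstream=("raw.orders", "raw.refunds"),
    quality_rules=("revenue_usd >= 0", "order_date IS NOT NULL"),
    max_staleness=timedelta(hours=24),
)

now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
refreshed_today = datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)
refreshed_last_week = datetime(2023, 12, 30, 6, 0, tzinfo=timezone.utc)

print(revenue.is_fresh(refreshed_today, now))      # True
print(revenue.is_fresh(refreshed_last_week, now))  # False
```

An agent that reads the contract can refuse the stale table instead of answering from it, which is the failure mode described above.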
Why it lives above the engine
If the contract lives inside the warehouse, it's locked to the warehouse. Real enterprises run Snowflake for analytics, Postgres for operations, and a lakehouse for ML. The data product has to sit above the bytes so the same contract reaches every consumer, regardless of where the data physically lives.
Where Vulcan fits
Vulcan builds data products above the engine. Bring Postgres, Snowflake, Spark, Trino, BigQuery, Databricks, Redshift, and others. Vulcan runs against the engine you already pay for, and no data moves until it needs to.
A data product moves through four phases: Input/Output, Transformation, Quality, and Semantics. Vulcan is one stack for all four.
Input/Output is the engine you choose: point a single config file at it and Vulcan runs against it directly.
Transformation is where you write models in SQL or Python, or mix both in the same project. vulcan plan shows the full impact of every change before it touches the warehouse, and vulcan run ships it on the cron you set.
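The idea behind previewing the full impact of a change can be sketched as a walk over the model dependency graph. The code below is not how vulcan plan is implemented and does not call Vulcan. It is a generic breadth-first traversal, and the model names in `DEPENDS_ON` are hypothetical.

```python
from collections import deque

# Hypothetical project: each model maps to the models it reads from.
DEPENDS_ON = {
    "stg_orders": [],
    "stg_refunds": [],
    "daily_revenue": ["stg_orders", "stg_refunds"],
    "revenue_by_region": ["daily_revenue"],
    "refund_rate": ["stg_refunds"],
}


def downstream_impact(changed: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return every model that directly or indirectly reads from `changed`."""
    # Invert the edges so each model points at the models that read from it.
    children: dict[str, list[str]] = {name: [] for name in depends_on}
    for model, parents in depends_on.items():
        for parent in parents:
            children[parent].append(model)

    # Breadth-first walk from the changed model through its readers.
    affected: list[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


print(downstream_impact("stg_refunds", DEPENDS_ON))
# ['daily_revenue', 'refund_rate', 'revenue_by_region']
```

Editing `stg_refunds` touches three downstream models here, so a reviewer sees the blast radius before anything runs against the warehouse.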
Quality is built in, not bolted on after the fact: the linter catches errors before the warehouse does, assertions block bad rows at write time, built-in data quality checks watch for anomalies and drift, and tests validate your logic locally with no warehouse cost.
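A blocking assertion can be pictured as a query that selects violating rows and halts the write if it returns any. The sketch below uses Python's built-in sqlite3 module as a stand-in engine so it runs locally. The `run_assertion` and `publish` helpers, the table names, and the rule itself are hypothetical and do not reflect Vulcan's assertion syntax. It also assumes the simplest behavior, where one bad row blocks the whole write.

```python
import sqlite3


class AssertionFailed(Exception):
    """Raised when an assertion query finds violating rows."""


def run_assertion(conn: sqlite3.Connection, name: str, violations_sql: str) -> None:
    """Run a query that selects bad rows; raise if it returns any."""
    bad_rows = conn.execute(violations_sql).fetchall()
    if bad_rows:
        raise AssertionFailed(f"{name}: {len(bad_rows)} violating row(s)")


def publish(conn: sqlite3.Connection, staged: str, target: str,
            assertions: dict[str, str]) -> None:
    """Copy staged rows into the target only if every assertion passes."""
    for name, sql in assertions.items():
        run_assertion(conn, name, sql)
    conn.execute(f"INSERT INTO {target} SELECT * FROM {staged}")


# In-memory database with one good row and one bad row staged.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staged_revenue (order_date TEXT, revenue_usd REAL)")
conn.execute("CREATE TABLE daily_revenue (order_date TEXT, revenue_usd REAL)")
conn.executemany(
    "INSERT INTO staged_revenue VALUES (?, ?)",
    [("2024-01-01", 120.0), ("2024-01-02", -5.0)],
)

assertions = {
    "revenue_not_negative": "SELECT * FROM staged_revenue WHERE revenue_usd < 0",
}

try:
    publish(conn, "staged_revenue", "daily_revenue", assertions)
except AssertionFailed as exc:
    print(f"STOP: {exc}")  # STOP: revenue_not_negative: 1 violating row(s)

published = conn.execute("SELECT COUNT(*) FROM daily_revenue").fetchone()[0]
print(published)  # 0
```

Because the check runs before the insert, the target table never sees the negative row. The same pattern works as a local test, since nothing here needs a warehouse connection.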
Semantics is where you define dimensions, measures, segments, and metrics once. Vulcan validates them against your models and generates REST, GraphQL, and SQL-wire APIs automatically, so the same definitions power your dashboards, notebooks, and application code.
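Defining dimensions and measures once pays off because every consumer asks by name and gets the same SQL back. Here is a minimal sketch of that compilation step. The `SemanticModel` class and its fields are hypothetical and are not Vulcan's semantic-layer format. Validation here only checks that the requested names exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticModel:
    """Hypothetical semantic definition for one table."""

    table: str
    dimensions: dict[str, str]  # dimension name -> column expression
    measures: dict[str, str]    # measure name -> aggregate expression

    def query(self, dimensions: list[str], measures: list[str]) -> str:
        """Compile requested dimension and measure names into one SQL query."""
        unknown = [d for d in dimensions if d not in self.dimensions]
        unknown += [m for m in measures if m not in self.measures]
        if unknown:
            raise KeyError(f"not defined in the semantic model: {unknown}")

        select = [f"{self.dimensions[d]} AS {d}" for d in dimensions]
        select += [f"{self.measures[m]} AS {m}" for m in measures]
        sql = f"SELECT {', '.join(select)} FROM {self.table}"
        if dimensions:
            sql += " GROUP BY " + ", ".join(self.dimensions[d] for d in dimensions)
        return sql


# Illustrative definitions only.
revenue_semantics = SemanticModel(
    table="daily_revenue",
    dimensions={"region": "region"},
    measures={"total_revenue": "SUM(revenue_usd)"},
)

print(revenue_semantics.query(["region"], ["total_revenue"]))
# SELECT region AS region, SUM(revenue_usd) AS total_revenue FROM daily_revenue GROUP BY region
```

A dashboard, a notebook, and an API handler that all call `query` with the same names get the same aggregate, so "total revenue" cannot drift between consumers.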
graph LR
subgraph VT ["Vulcan Timeline →"]
direction LR
Engine["<b>Engine</b><br/>Postgres · Snowflake · Spark · Trino · BigQuery · Databricks"] -.-> Config
Config["<b>Config</b>"] -.-> Linter["<b>Linter</b><br/>Code Safety"]
Config -.-> Notify["<b>Notifications</b><br/>Fires across lifecycle"]
Macros["<b>Macros</b><br/>Variables · Functions"] -.-> Model
Tests["<b>Tests</b><br/>Logic Validation"] -.-> Model
Signals["<b>Signals</b><br/>Readiness Gates"] -.-> Model
Config --> Model["<b>MODEL</b><br/>SQL · Python Transformations"]
Model --> Audits
Audits{"<b>Assertions</b> <br> Blocking Rules"} -->|pass| Checks
Audits -->|pass| Profiles
Audits -->|fail| Stop(("STOP"))
Checks["<b>dq</b><br/>Data Quality"] --> Sem
Profiles["<b>Profiling</b><br/>Understanding"] --> Sem
Sem["<b>Semantics</b><br/>Dimensions · Measures · Segments · Metrics"] --> REST["<b>REST API</b>"]
Sem --> GraphQL["<b>GraphQL API</b>"]
Sem --> MySQL["<b>SQL API</b>"]
end
style VT fill:none,stroke:none
style Config fill:#fafafa,stroke:#9e9e9e,stroke-width:1px,stroke-dasharray: 5 5
style Engine fill:#ffffff,stroke:#9e9e9e,stroke-width:1px,stroke-dasharray: 5 5
style Linter fill:#e8eaf6,stroke:#3f51b5,stroke-width:1px
style Macros fill:#e8eaf6,stroke:#3f51b5,stroke-width:1px
style Tests fill:#e3f2fd,stroke:#1976d2,stroke-width:1px
style Signals fill:#f3e5f5,stroke:#7b1fa2,stroke-width:1px
style Model fill:#c8e6c9,stroke:#2e7d32,stroke-width:3px
style Audits fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px
style Stop fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px
style Checks fill:#fff9c4,stroke:#fbc02d,stroke-width:1px
style Profiles fill:#fff9c4,stroke:#fbc02d,stroke-width:1px
style Sem fill:#fff9c4,stroke:#fbc02d,stroke-width:2px
style Notify fill:#fff3e0,stroke:#f57c00,stroke-width:1px,stroke-dasharray: 5 5
style REST fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px
style GraphQL fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px
style MySQL fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px
Get started
The quickstart guide gets the Vulcan CLI running in Docker on your machine, connects it to your engine, and materializes your first models with vulcan plan. From there, the project scaffold gives you audits/, models/dq/, tests/, models/semantics/, and models/metrics/ folders ready to fill in.