dbt Integration

SignalPilot integrates with dbt to bring your transformation layer context into data investigations. Access model lineage, documentation, test results, and column-level metadata without leaving your notebook.

What dbt Integration Provides

| Context Type | What You Get | Example Use |
| --- | --- | --- |
| Model Lineage | Upstream/downstream dependencies | "What feeds into this metric?" |
| Documentation | Model and column descriptions | "What does this column mean?" |
| Test Results | Data quality test status | "Are there any failing tests?" |
| Column Lineage | Field-level transformations | "Where does this field come from?" |
| Freshness | Source freshness status | "When was this data last updated?" |

Supported dbt Versions

| dbt Type | Version | Features |
| --- | --- | --- |
| dbt Cloud | All tiers | Full API access, real-time sync |
| dbt Core | 1.0+ | Manifest/catalog file parsing |

Setup: dbt Cloud

Step 1: Get Your dbt Cloud API Token

  1. Log in to dbt Cloud
  2. Go to Account Settings → API Access
  3. Generate a new Service Token with these permissions:
    • Metadata API (required)
    • Job triggers (optional, for freshness checks)
  4. Copy the token
Use a service token, not a personal token. Service tokens don’t expire and aren’t tied to individual users.
Step 2: Find Your Account and Project IDs

From your dbt Cloud URL: https://cloud.getdbt.com/deploy/{account_id}/projects/{project_id}
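
If you want to pull the IDs out of that URL programmatically, a short helper like the following works (illustrative only; the function name is ours, not part of SignalPilot):

```python
import re

def parse_dbt_cloud_url(url: str) -> tuple[str, str]:
    """Extract (account_id, project_id) from a dbt Cloud deploy URL."""
    match = re.search(r"/deploy/(\d+)/projects/(\d+)", url)
    if not match:
        raise ValueError(f"Not a dbt Cloud deploy URL: {url}")
    return match.group(1), match.group(2)

# Example with the placeholder IDs used in this guide
account_id, project_id = parse_dbt_cloud_url(
    "https://cloud.getdbt.com/deploy/12345/projects/67890"
)
```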
Step 3: Configure SignalPilot

Add to your signalpilot.config.json:
{
  "mcp_servers": {
    "dbt": {
      "type": "dbt-cloud",
      "api_token": "${DBT_CLOUD_API_TOKEN}",
      "account_id": "12345",
      "project_id": "67890"
    }
  }
}
Use environment variables for sensitive values. SignalPilot will read ${VAR_NAME} from your environment.
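
The substitution behaves roughly like the sketch below. This is our own illustration of the common ${VAR_NAME} convention, not SignalPilot's actual implementation; unset variables are left untouched here so a misconfiguration stays visible:

```python
import json
import os
import re

def resolve_env(text: str) -> str:
    """Replace ${VAR_NAME} placeholders with values from the environment.

    Placeholders for unset variables are left as-is.
    """
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: os.environ.get(m.group(1), m.group(0)),
        text,
    )

# Example: export the token, then resolve a config fragment
os.environ["DBT_CLOUD_API_TOKEN"] = "dbtc_example_token"
config = json.loads(resolve_env('{"api_token": "${DBT_CLOUD_API_TOKEN}"}'))
```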
Step 4: Verify Connection

In a SignalPilot notebook, test with:
@dbt show models
You should see a list of your dbt models.

Setup: dbt Core

For dbt Core, SignalPilot reads your manifest and catalog files directly.
Step 1: Generate Manifest and Catalog

Run dbt to generate the required artifacts:
dbt docs generate
Step 2: Configure SignalPilot

Add to your signalpilot.config.json:
{
  "mcp_servers": {
    "dbt": {
      "type": "dbt-core",
      "project_dir": "/path/to/your/dbt/project",
      "target_dir": "target"
    }
  }
}
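
With this setup, SignalPilot reads target/manifest.json under your project directory. A minimal sketch of what that parsing looks like, assuming the standard dbt manifest schema (nodes keyed by unique_id, each with a resource_type); the helper names are ours:

```python
import json
from pathlib import Path

def load_manifest(project_dir: str, target_dir: str = "target") -> dict:
    """Load the manifest artifact produced by `dbt docs generate`."""
    return json.loads(Path(project_dir, target_dir, "manifest.json").read_text())

def list_models(manifest: dict) -> list[str]:
    """Return the unique_ids of all model nodes in a dbt manifest."""
    return [
        uid
        for uid, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
    ]
```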

Using dbt Context

Model Lineage

Ask about dependencies to understand data flow:
You: "What models feed into the revenue dashboard?"

SignalPilot: Looking at dbt lineage for revenue_dashboard...

Upstream models:
├── stg_orders (staging)
├── stg_customers (staging)
├── int_order_items (intermediate)
└── fct_revenue (fact)
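
With dbt Core artifacts, the same upstream walk can be reproduced from the manifest's parent_map, which maps each node's unique_id to its direct parents. A sketch (the helper name is ours):

```python
def upstream_models(manifest: dict, node_id: str) -> set[str]:
    """Collect all transitive upstream dependencies of a node.

    Walks the manifest's parent_map iteratively to avoid recursion
    limits on deep DAGs.
    """
    parent_map = manifest.get("parent_map", {})
    seen: set[str] = set()
    stack = list(parent_map.get(node_id, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parent_map.get(parent, []))
    return seen
```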

Test Results

Check data quality before investigating:
You: "Are there any failing dbt tests?"

SignalPilot: Checking dbt test results...

⚠️ 2 tests flagged:
1. unique_orders_order_id (FAIL) - Found 47 duplicate order_ids
2. not_null_customers_email (WARN) - Found 12 null emails
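
For dbt Core, the equivalent check reads target/run_results.json, whose results entries carry a unique_id and a status such as pass, fail, warn, or error. A sketch (the helper name is ours):

```python
def flagged_tests(run_results: dict) -> list[tuple[str, str]]:
    """Return (unique_id, status) for tests that did not pass cleanly."""
    return [
        (r["unique_id"], r["status"])
        for r in run_results.get("results", [])
        if r.get("status") in ("fail", "warn", "error")
    ]
```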

Column Lineage

Trace where data comes from:
You: "Where does the mrr column in fct_subscriptions come from?"

SignalPilot: Tracing column lineage...

fct_subscriptions.mrr
└── int_subscription_mrr.mrr_amount
    └── CASE statement combining:
        ├── stg_subscriptions.plan_amount
        ├── stg_discounts.discount_percent
        └── stg_credits.credit_amount

dbt Commands

| Command | Description |
| --- | --- |
| @dbt show models | List all models |
| @dbt lineage <model> | Show upstream/downstream |
| @dbt describe <model> | Show model documentation |
| @dbt tests | Show test results |
| @dbt freshness | Check source freshness |

Best Practices

1. Document Your Models

Well-documented dbt models make SignalPilot investigations faster. Include column descriptions, especially for business logic.
2. Run Tests Before Investigating

Check @dbt tests before deep investigations. Failing tests might explain data anomalies.
3. Use Lineage for Root Cause

When data looks wrong, trace lineage upstream. The issue often originates in staging or source models.