dbt Integration

SignalPilot integrates with dbt to bring your transformation layer context into data investigations. Access model lineage, documentation, test results, and column-level metadata without leaving your notebook.

What dbt Integration Provides

| Context Type   | What You Get                     | Example Use                        |
|----------------|----------------------------------|------------------------------------|
| Model Lineage  | Upstream/downstream dependencies | "What feeds into this metric?"     |
| Documentation  | Model and column descriptions    | "What does this column mean?"      |
| Test Results   | Data quality test status         | "Are there any failing tests?"     |
| Column Lineage | Field-level transformations      | "Where does this field come from?" |
| Freshness      | Source freshness status          | "When was this data last updated?" |

Supported dbt Versions

| dbt Type  | Version   | Features                        |
|-----------|-----------|---------------------------------|
| dbt Cloud | All tiers | Full API access, real-time sync |
| dbt Core  | 1.0+      | Manifest/catalog file parsing   |

Setup: dbt Cloud

Step 1: Get Your dbt Cloud API Token

  1. Log in to dbt Cloud
  2. Go to Account Settings → API Access
  3. Generate a new Service Token with these permissions:
    • Metadata API (required)
    • Job triggers (optional, for freshness checks)
  4. Copy the token
Use a service token, not a personal token. Service tokens don’t expire and aren’t tied to individual users.
Step 2: Find Your Account and Project IDs

From your dbt Cloud URL: https://cloud.getdbt.com/deploy/{account_id}/projects/{project_id}
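If you prefer to script this, the IDs can be pulled out of the URL with a small helper. A minimal sketch, assuming the URL pattern shown above (`parse_dbt_cloud_url` is an illustrative name, not a SignalPilot or dbt API):

```python
import re

def parse_dbt_cloud_url(url: str) -> dict:
    """Extract account_id and project_id from a dbt Cloud deploy URL."""
    m = re.search(r"/deploy/(\d+)/projects/(\d+)", url)
    if not m:
        raise ValueError(f"Unrecognized dbt Cloud URL: {url}")
    return {"account_id": m.group(1), "project_id": m.group(2)}

ids = parse_dbt_cloud_url("https://cloud.getdbt.com/deploy/12345/projects/67890")
# ids == {"account_id": "12345", "project_id": "67890"}
```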
Step 3: Configure SignalPilot

Add to your signalpilot.config.json:
{
  "mcp_servers": {
    "dbt": {
      "type": "dbt-cloud",
      "api_token": "${DBT_CLOUD_API_TOKEN}",
      "account_id": "12345",
      "project_id": "67890"
    }
  }
}
Use environment variables for sensitive values. SignalPilot will read ${VAR_NAME} from your environment.
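The substitution behaves roughly like the following sketch. This is illustrative only, not SignalPilot's actual implementation; `expand_env_vars` is a hypothetical helper:

```python
import os
import re

def expand_env_vars(value: str) -> str:
    """Replace ${VAR_NAME} placeholders with values from the environment."""
    def repl(m):
        name = m.group(1)
        if name not in os.environ:
            raise KeyError(f"Environment variable {name} is not set")
        return os.environ[name]
    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", repl, value)

os.environ["DBT_CLOUD_API_TOKEN"] = "dbtc_example"  # for demonstration only
print(expand_env_vars("${DBT_CLOUD_API_TOKEN}"))    # prints dbtc_example
```

Failing loudly on a missing variable (rather than substituting an empty string) makes a misconfigured token obvious at startup instead of at the first API call.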
Step 4: Verify Connection

In a SignalPilot notebook, test the connection with:

@dbt show models

You should see a list of your dbt models.

Setup: dbt Core

For dbt Core, SignalPilot reads your manifest and catalog files directly.
Step 1: Generate Manifest and Catalog

Run dbt to generate the required artifacts (this writes manifest.json and catalog.json to the target/ directory):
dbt docs generate
Step 2: Configure SignalPilot

Add to your signalpilot.config.json:
{
  "mcp_servers": {
    "dbt": {
      "type": "dbt-core",
      "project_dir": "/path/to/your/dbt/project",
      "target_dir": "target"
    }
  }
}
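Both artifacts are plain JSON, so you can sanity-check them before pointing SignalPilot at your project. A minimal sketch that lists model names from manifest.json (`list_models` is a hypothetical helper; the `nodes` and `resource_type` keys are standard dbt manifest fields):

```python
import json
from pathlib import Path

def list_models(project_dir: str, target_dir: str = "target") -> list:
    """Return the sorted names of all models in a dbt project's manifest.json."""
    manifest_path = Path(project_dir) / target_dir / "manifest.json"
    manifest = json.loads(manifest_path.read_text())
    return sorted(
        node["name"]
        for node in manifest["nodes"].values()
        if node["resource_type"] == "model"  # skip tests, seeds, snapshots, etc.
    )
```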

Using dbt Context

Model Lineage

Ask about dependencies to understand data flow:
You: "What models feed into the revenue dashboard?"

SignalPilot: Looking at dbt lineage for revenue_dashboard...

Upstream models:
├── stg_orders (staging)
├── stg_customers (staging)
├── int_order_items (intermediate)
└── fct_revenue (fact)
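With dbt Core, this kind of upstream walk can be reproduced directly from manifest.json, which includes a parent_map from each node ID to its direct parents. A rough sketch (`upstream_models` is an illustrative helper, not a SignalPilot function):

```python
import json
from pathlib import Path

def upstream_models(manifest_path: str, model_id: str) -> set:
    """Recursively collect all upstream node IDs using the manifest's parent_map."""
    parent_map = json.loads(Path(manifest_path).read_text())["parent_map"]
    seen = set()
    stack = list(parent_map.get(model_id, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parent_map.get(node, []))  # walk parents of parents
    return seen
```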

Test Results

Check data quality before investigating:
You: "Are there any failing dbt tests?"

SignalPilot: Checking dbt test results...

⚠️ 2 tests failing:
1. unique_orders_order_id (FAIL) - Found 47 duplicate order_ids
2. not_null_customers_email (WARN) - Found 12 null emails
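With dbt Core artifacts, the same check can be scripted against run_results.json, which records a status for every executed test. A minimal sketch (`failing_tests` is an illustrative helper):

```python
import json
from pathlib import Path

def failing_tests(run_results_path: str) -> list:
    """Return (unique_id, status) pairs for tests that did not pass."""
    results = json.loads(Path(run_results_path).read_text())["results"]
    return [
        (r["unique_id"], r["status"])
        for r in results
        if r["status"] in ("fail", "warn", "error")
    ]
```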

Column Lineage

Trace where data comes from:
You: "Where does the mrr column in fct_subscriptions come from?"

SignalPilot: Tracing column lineage...

fct_subscriptions.mrr
└── int_subscription_mrr.mrr_amount
    └── CASE statement combining:
        ├── stg_subscriptions.plan_amount
        ├── stg_discounts.discount_percent
        └── stg_credits.credit_amount

dbt Commands

| Command               | Description              |
|-----------------------|--------------------------|
| @dbt show models      | List all models          |
| @dbt lineage <model>  | Show upstream/downstream |
| @dbt describe <model> | Show model documentation |
| @dbt tests            | Show test results        |
| @dbt freshness        | Check source freshness   |

Best Practices

1. Document Your Models

Well-documented dbt models make SignalPilot investigations faster. Include column descriptions, especially for business logic.
2. Run Tests Before Investigating

Check @dbt tests before deep investigations. Failing tests might explain data anomalies.
3. Use Lineage for Root Cause

When data looks wrong, trace lineage upstream. The issue often originates in staging or source models.

Related Pages

  • Context Aggregation: how dbt fits into the MCP architecture
  • Slack Integration: combine dbt context with Slack discussions