Best Practices¶

Recommended patterns for managing dbt Cloud infrastructure with Terraform.

Security¶

Credential Management¶

✅ Use environment variables for all sensitive data:

# .env file (never commit)
export TF_VAR_dbt_token=dbtc_xxxxx
export TF_VAR_environment_credentials='{"analytics_prod": {"credential_type": "databricks", "token": "dapi..."}}'

❌ Never hardcode credentials:

# DON'T DO THIS!
variable "dbt_token" {
  default = "dbtc_xxxxx"  # ❌ Never hardcode — exposed in version control
}
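
By contrast, a declaration with no default forces the value to come from the environment. The following is a minimal sketch, assuming the root module declares `dbt_token` itself (the name mirrors `TF_VAR_dbt_token` above). Marking it `sensitive` redacts the value in plan and apply output, although it is still written to state.

```hcl
# Sketch: no default, so Terraform reads the value from TF_VAR_dbt_token
# (or prompts for it) instead of from version control.
variable "dbt_token" {
  description = "dbt Cloud API token, supplied via TF_VAR_dbt_token"
  type        = string
  sensitive   = true # redacted in plan/apply output; still stored in state
}
```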

Token Security¶

Use service accounts instead of personal tokens
Rotate tokens every 90 days
Limit permissions to minimum required
Use different tokens for dev/staging/prod
Store in secrets managers (GitHub Secrets, Vault, etc.)

Gitignore¶

Always exclude sensitive files:

.gitignore

# Terraform
*.tfstate
*.tfstate.*
.terraform/
# Do NOT ignore .terraform.lock.hcl: commit it so provider versions are reproducible
crash.log
override.tf
override.tf.json

# Credentials
.env
.env.local
*.tfvars
!*.tfvars.example
terraform.rc
.terraformrc

# IDE
.vscode/
.idea/
*.swp

Code Organization¶

Directory Structure¶

Recommended layout:

my-dbt-infrastructure/
├── .github/
│   └── workflows/
│       └── deploy.yml
├── terraform/
│   ├── main.tf
│   ├── variables.tf
│   ├── outputs.tf
│   ├── backend.tf
│   └── versions.tf
├── configs/
│   ├── production.yml
│   ├── staging.yml
│   └── development.yml
├── docs/
│   └── README.md
├── .env.example
├── .gitignore
└── README.md

File Naming¶

Use clear, descriptive names:

main.tf - Primary module configuration
variables.tf - Input variable definitions
outputs.tf - Output value definitions
backend.tf - State backend configuration
versions.tf - Provider version constraints
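
As an illustration of what `versions.tf` might hold, here is a minimal sketch. It assumes the dbt Labs provider published as `dbt-labs/dbtcloud`. The provider version constraint is a hypothetical placeholder; the Terraform constraint follows the `>= 1.0` prerequisite in the README template later in this document.

```hcl
# versions.tf (sketch)
terraform {
  required_version = ">= 1.0"

  required_providers {
    dbtcloud = {
      source  = "dbt-labs/dbtcloud"
      version = "~> 1.0" # hypothetical constraint; pin to the release you have tested
    }
  }
}
```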

Module Version Pinning¶

Always pin module versions:

```hcl
module "dbt_cloud" {
  source = "git::https://github.com/trouze/terraform-dbtcloud-yaml.git?ref=v1.0.0"

  # ... module inputs ...
}
```

YAML Configuration¶

Naming Conventions¶

Use consistent, descriptive names:

```yaml
project:
  name: "analytics-production"  # ✅ Clear and specific
  # NOT: name: "project1"       # ❌ Vague

  environments:
    - name: "Production"         # ✅ Proper case
      # NOT: name: "prod"        # ❌ Abbreviation
```

Comments¶

Document non-obvious configurations:

project:
  environments:
    - name: "Production"
      # Using connection_id 1001 for Databricks Unity Catalog
      connection_id: 1001

      jobs:
        - name: "Daily Run"
          # Runs at 6 AM UTC (1 AM EST / 2 AM EDT)
          schedule_hours: [6]

Environment-Specific Values¶

Keep environment configs consistent:

# ✅ Good: Consistent structure across environments
environments:
  - name: "Development"
    type: "development"
    credential:
      token_name: "dev_databricks"
      schema: "dev_analytics"

  - name: "Production"
    type: "deployment"
    credential:
      token_name: "prod_databricks"
      schema: "analytics"

State Management¶

Remote State¶

Always use remote state for teams:

backend.tf

terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"
    key            = "dbt-cloud/production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock"
  }
}

State Locking¶

Enable locking to prevent concurrent modifications:

# S3 + DynamoDB (AWS). Terraform 1.10+ can instead lock natively in S3 with use_lockfile = true
terraform {
  backend "s3" {
    bucket         = "state-bucket"
    key            = "dbt/terraform.tfstate"
    dynamodb_table = "terraform-locks"
  }
}

# Azure Blob Storage (state locking is built in via blob leases)
terraform {
  backend "azurerm" {
    storage_account_name = "tfstate"
    container_name       = "tfstate"
    key                  = "dbt.tfstate"
  }
}

State Backup¶

Always maintain state backups:

# Before major changes
terraform state pull > backup-$(date +%Y%m%d-%H%M%S).tfstate
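
Manual snapshots can be complemented by object versioning on the state bucket. The sketch below assumes the S3 backend shown earlier, reuses its bucket name, and assumes the bucket is managed in a separate bootstrap configuration with AWS provider v4 or later. The resource name `terraform_state` is illustrative.

```hcl
# Sketch: keep every prior version of the state object in the bucket,
# so an earlier state can be restored without a manual snapshot.
# Assumes the bucket is managed in its own bootstrap configuration.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = "my-company-terraform-state" # bucket name from the backend example

  versioning_configuration {
    status = "Enabled"
  }
}
```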

Testing & Validation¶

Pre-Deployment Checks¶

Always validate before applying:

# 1. Format check
terraform fmt -check -recursive

# 2. Validation
terraform validate

# 3. Plan review
terraform plan -out=tfplan

# 4. Manual review
terraform show tfplan

# 5. Apply only if satisfied
terraform apply tfplan

YAML Validation¶

Use schema validation in your IDE:

# Add to top of dbt-config.yml
# yaml-language-server: $schema=https://raw.githubusercontent.com/trouze/terraform-dbtcloud-yaml/main/schemas/v1.json

project:
  name: "my-project"

Automated Testing¶

Run tests in CI/CD:

steps:
  - name: Terraform Format Check
    run: terraform fmt -check -recursive

  - name: Terraform Validate
    run: terraform validate

  - name: YAML Lint
    uses: ibiqlik/action-yamllint@v3

CI/CD Best Practices¶

Branch Protection¶

Protect production branches:

Require pull request reviews
Require status checks to pass
Require up-to-date branches
Include administrators in restrictions

Pull Request Workflow¶

Branch from main: git checkout -b feature/add-prod-job
Make changes to YAML or Terraform
Commit with clear messages
Push and create PR
Review automated plan output
Approve and merge
Automated apply on merge to main

Deployment Strategy¶

Use progressive deployment:

Development → Staging → Production

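# Each trigger below belongs in its own GitHub Actions workflow file.

# dev workflow: deploy automatically on push to develop
on:
  push:
    branches: [develop]

# staging workflow: deploy when a PR into main is merged
# (the closed event also fires for unmerged PRs, so gate the job with
#  if: github.event.pull_request.merged == true)
on:
  pull_request:
    types: [closed]
    branches: [main]

# prod workflow: deploy only when triggered manually
on:
  workflow_dispatch: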

Performance¶

Parallelism¶

Optimize for multiple projects:

# GitHub Actions
strategy:
  matrix:
    project: [finance, marketing, operations]
  max-parallel: 3

Caching¶

Cache Terraform plugins:

- name: Cache Terraform Plugins
  uses: actions/cache@v4
  with:
    # Terraform only uses this directory when TF_PLUGIN_CACHE_DIR points at it
    path: ~/.terraform.d/plugin-cache
    key: ${{ runner.os }}-terraform-${{ hashFiles('**/.terraform.lock.hcl') }}

Resource Timeouts¶

Set appropriate timeouts:

jobs:
  - name: "Large Job"
    timeout_seconds: 7200  # 2 hours for complex runs

Monitoring & Observability¶

Terraform Cloud¶

Use Terraform Cloud for visibility:

terraform {
  cloud {
    organization = "my-company"
    workspaces {
      name = "dbt-cloud-production"
    }
  }
}

Notifications¶

Set up alerts for failures:

# GitHub Actions
- name: Notify on Failure
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": "Terraform deployment failed: ${{ github.workflow }}"
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

Logging¶

Capture detailed logs:

- name: Terraform Apply
  run: |
    set -o pipefail  # keep terraform's exit code when piping through tee
    terraform apply -auto-approve 2>&1 | tee terraform.log

- name: Upload Logs
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: terraform-logs
    path: terraform.log

Documentation¶

README Template¶

Every project should have:

# dbt Cloud Infrastructure - [Project Name]

## Overview
Brief description of what this manages.

## Prerequisites
- Terraform >= 1.0
- dbt Cloud account
- Required permissions

## Setup
```bash
# Clone and configure
git clone <repo>
cp .env.example .env
# Edit .env with credentials
```

## Usage
```bash
source .env
terraform init
terraform plan
terraform apply
```

## Configuration
Link to YAML schema and examples.

## CI/CD
Describe automated deployment process.

## Support
Where to get help.

Inline Documentation¶

Use Terraform comments:

# Configure dbt Cloud provider
# Values come from input variables, e.g. the TF_VAR_dbt_account_id and TF_VAR_dbt_api_token env vars
provider "dbtcloud" {
  account_id = var.dbt_account_id  # Found in dbt Cloud URL
  token      = var.dbt_api_token   # Prefer a service token over a personal one
  host_url   = var.dbt_host_url    # Regional API endpoint
}

Change Management¶

Document major changes:

## Changelog

### 2024-01-15
- Added staging environment
- Updated prod job schedule to 6 AM UTC
- Migrated from GitHub Deploy Key to GitHub App

### 2024-01-10
- Initial production deployment
- Configured Databricks credentials

Common Patterns¶

Multiple Environments¶

environments:
  # Development: Full access, custom branch
  - name: Development
    key: dev
    type: development
    custom_branch: develop
    enable_model_query_history: true

  # Staging: Production-like, limited access
  - name: Staging
    key: staging
    type: deployment
    deployment_type: staging
    dbt_version: "1.9.0"  # Pin to same as prod

  # Production: Locked down, scheduled
  - name: Production
    key: prod
    type: deployment
    deployment_type: production
    dbt_version: "1.9.0"
    protected: true

Job Naming¶

Use consistent, descriptive names:

jobs:
  # ✅ Good
  - name: "Production Daily Full Refresh"
  - name: "Staging CI - Modified Models Only"
  - name: "Development - Ad Hoc Testing"

  # ❌ Bad
  - name: "job1"
  - name: "run"
  - name: "test"

Credential Mapping¶

Use environment_credentials keyed by "{project_key}_{env_key}":

export TF_VAR_environment_credentials='{
  "analytics_dev":     {"credential_type": "databricks", "token": "dapi_dev123"},
  "analytics_staging": {"credential_type": "databricks", "token": "dapi_stg456"},
  "analytics_prod":    {"credential_type": "databricks", "token": "dapi_prd789"}
}'
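
The key convention can also be checked at plan time. This is a minimal sketch, assuming the root module declares `environment_credentials` itself. The `map(map(string))` type and the key pattern are assumptions made for illustration and may not match the module's actual input definition.

```hcl
# Sketch: reject credential maps whose keys do not follow
# the {project_key}_{env_key} convention, e.g. analytics_prod.
variable "environment_credentials" {
  description = "Warehouse credentials keyed by {project_key}_{env_key}"
  type        = map(map(string)) # assumed shape; adjust to the module's real type
  sensitive   = true

  validation {
    # Every key must be lowercase words joined by underscores,
    # ending in an environment key.
    condition = alltrue([
      for key in keys(var.environment_credentials) :
      can(regex("^[a-z0-9_]+_[a-z0-9]+$", key))
    ])
    error_message = "Each credential key must look like {project_key}_{env_key}, for example analytics_prod."
  }
}
```

Only the keys are inspected, so the error message never echoes a token value.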

Checklist¶

Before deploying to production: