CLI Reference¶
The Lakehouse Plumber command-line interface provides project creation,
validation, code generation, state inspection, and more. All commands are
implemented with click, so the usual
--help flags are available on every command.
lhp¶
LakehousePlumber - Generate Lakeflow pipelines from YAML configs.
Usage
lhp [OPTIONS] COMMAND [ARGS]...
Options
- --version¶
Show the version and exit.
- -v, --verbose¶
Enable verbose logging
deps¶
Analyze and visualize pipeline dependencies for orchestration planning.
Usage
lhp deps [OPTIONS]
Options
- -f, --format <format>¶
Output format(s) to generate (dot=GraphViz, json=structured data, text=readable report, job=orchestration job)
- Options:
dot | json | text | job | all
- -o, --output <output>¶
Output directory (defaults to .lhp/dependencies/)
- -p, --pipeline <pipeline>¶
Analyze specific pipeline only
- -j, --job-name <job_name>¶
Custom name for generated orchestration job (only used with job format)
- -jc, --job-config <job_config>¶
Custom job config file path (relative to project root, defaults to templates/bundle/job_config.yaml)
- -b, --bundle-output¶
Save job file to resources/ directory for Databricks bundle integration
- -v, --verbose¶
Enable verbose output
generate¶
Generate DLT pipeline code
Usage
lhp generate [OPTIONS]
Options
- -e, --env <env>¶
Required. Environment to generate for (e.g. dev, prod)
- -p, --pipeline <pipeline>¶
Specific pipeline to generate
- -o, --output <output>¶
Output directory (defaults to generated/{env})
- --dry-run¶
Preview without generating files
- --no-cleanup¶
Disable cleanup of generated files when source YAML files are removed.
- -f, --force¶
Force regeneration of all files, even if unchanged
- --no-bundle¶
Disable bundle support even if databricks.yml exists
- --include-tests¶
Include test actions in generation (skipped by default for faster builds)
- -pc, --pipeline-config <pipeline_config>¶
Custom pipeline config file path (relative to project root)
info¶
Display project information and statistics.
Usage
lhp info [OPTIONS]
init¶
Initialize a new LakehousePlumber project in the current directory.
PROJECT_NAME is used for template rendering (e.g. bundle name, lhp.yaml). All files are created in the current working directory.
Usage
lhp init [OPTIONS] PROJECT_NAME
Options
- --no-bundle¶
Skip Databricks Asset Bundle setup (bundle is enabled by default)
Arguments
- PROJECT_NAME¶
Required argument
list-presets¶
List available presets
Usage
lhp list-presets [OPTIONS]
list-templates¶
List available templates
Usage
lhp list-templates [OPTIONS]
show¶
Show resolved configuration for a flowgroup in table format
Usage
lhp show [OPTIONS] FLOWGROUP
Options
- -e, --env <env>¶
Environment
Arguments
- FLOWGROUP¶
Required argument
state¶
Show or manage the current state of generated files.
Usage
lhp state [OPTIONS]
Options
- -e, --env <env>¶
Environment to show state for
- -p, --pipeline <pipeline>¶
Specific pipeline to show state for
- --orphaned¶
Show only orphaned files
- --stale¶
Show only stale files (YAML changed)
- --new¶
Show only new/untracked YAML files
- --dry-run¶
Preview cleanup without actually deleting files
- --cleanup¶
Clean up orphaned files
- --regen¶
Regenerate stale files
stats¶
Display pipeline statistics and complexity metrics.
Usage
lhp stats [OPTIONS]
Options
- -p, --pipeline <pipeline>¶
Specific pipeline to analyze
substitutions¶
Show available substitution tokens for an environment
Usage
lhp substitutions [OPTIONS]
Options
- -e, --env <env>¶
Environment
validate¶
Validate pipeline configurations
Usage
lhp validate [OPTIONS]
Options
- -e, --env <env>¶
Environment
- -p, --pipeline <pipeline>¶
Specific pipeline to validate
- -v, --verbose¶
Verbose output
- --include-tests¶
Include test actions in validation (matches generate behavior)
See also
For detailed information about dependency analysis and orchestration job generation, see Databricks Asset Bundles Integration.
Project Initialization¶
The lhp init command creates a new LakehousePlumber project with the complete directory structure and configuration templates.
Basic Usage¶
# Create a standard project
lhp init my_project
# Create a project without Databricks Asset Bundle support
lhp init my_project --no-bundle
Created Directory Structure:
my_project/
├── config/ # Configuration templates
│ ├── job_config.yaml.tmpl
│ └── pipeline_config.yaml.tmpl
├── pipelines/ # Pipeline flowgroups
├── templates/ # Reusable action templates
├── presets/ # Configuration presets
├── substitutions/ # Environment-specific values
├── schemas/ # Table schema definitions
├── expectations/ # Data quality expectations
├── generated/ # Generated Python code (gitignored)
└── resources/ # Bundle resources (omitted with --no-bundle)
Configuration Templates¶
The config/ directory contains template files for customizing Databricks job and pipeline settings:
job_config.yaml.tmpl
Template for customizing orchestration job configuration used with lhp deps command:
Job execution settings (max_concurrent_runs, timeout, performance_target)
Queue configuration
Email and webhook notifications
Scheduling (Quartz cron expressions)
Tags and permissions
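As a rough illustration, a customized config/job_config.yaml might cover the settings listed above. All field names below are assumptions for illustration; the shipped job_config.yaml.tmpl is the authoritative reference for the actual schema:

```yaml
# Hypothetical job_config.yaml sketch -- field names are assumptions;
# consult the shipped job_config.yaml.tmpl for the authoritative schema.
max_concurrent_runs: 1
timeout_seconds: 3600
performance_target: STANDARD
queue:
  enabled: true
email_notifications:
  on_failure:
    - data-team@example.com
schedule:
  quartz_cron_expression: "0 0 6 * * ?"   # daily at 06:00
  timezone_id: UTC
tags:
  team: data-platform
```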
Usage:
# 1. Copy and customize the template
cd my_project
cp config/job_config.yaml.tmpl config/job_config.yaml
# Edit config/job_config.yaml with your settings
# 2. Use with lhp deps command
lhp deps --job-config config/job_config.yaml --bundle-output
See Databricks Asset Bundles Integration for detailed job configuration options.
pipeline_config.yaml.tmpl
Template for customizing Delta Live Tables (DLT) pipeline settings used with lhp generate command:
Compute configuration (serverless vs. classic clusters)
DLT edition and runtime channel
Processing mode (continuous vs. triggered)
Notifications and tags
Event logging (can also be configured project-wide in
lhp.yaml without requiring -pc)
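As a rough illustration, a customized pipeline config covering the settings listed above might look like the following. All field names are assumptions for illustration; the shipped pipeline_config.yaml.tmpl is the authoritative reference for the actual schema:

```yaml
# Hypothetical pipeline_config.yaml sketch -- field names are assumptions;
# consult the shipped pipeline_config.yaml.tmpl for the authoritative schema.
serverless: true          # compute: serverless vs. classic clusters
edition: ADVANCED         # DLT edition
channel: CURRENT          # runtime channel
continuous: false         # processing mode: continuous vs. triggered
notifications:
  - email_recipients:
      - data-team@example.com
    alerts:
      - on-update-failure
tags:
  team: data-platform
```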
Usage:
# 1. Copy and customize the template
cp config/pipeline_config.yaml.tmpl templates/bundle/pipeline_config.yaml
# Edit templates/bundle/pipeline_config.yaml with your settings
# 2. Auto-loaded when generating (if at default location)
lhp generate -e dev
# 3. Or specify custom path
lhp generate -e dev --pipeline-config config/my_pipeline_config.yaml
See Databricks Asset Bundles Integration for detailed pipeline configuration options.
Note
Template files (*.tmpl) are provided as starting points. Copy and remove the
.tmpl extension to activate them. This allows you to keep the original
templates as reference while customizing your own versions.
Force Regeneration of Pipeline Resources¶
The --force flag combined with --pipeline-config allows you to regenerate bundle pipeline resource YAML files even when they haven’t changed.
Behavior:
- --force alone: regenerates Python code but preserves LHP-generated bundle YAML files
- --force with -pc: regenerates both Python code AND LHP-generated bundle YAML files
- User-created bundle YAML files are always backed up and replaced (regardless of flags)
When to Use:
Use this when you’ve modified your pipeline configuration and need to update the Databricks Asset Bundle resource files:
# Update bundle resources after changing pipeline config
lhp generate -e dev --force --pipeline-config config/my_pipeline_config.yaml
# Or with short flags
lhp generate -e dev -f -pc config/my_pipeline_config.yaml
Note
LHP-generated files are overwritten directly without backup when using --force with -pc.
This is safe because LHP can always regenerate them. User-created files are backed up for safety.
Test Generation Workflow¶
By default, Lakehouse Plumber skips test actions during code generation for faster builds and cleaner production pipelines. Use the --include-tests flag to include data quality tests when needed.
Common Usage Patterns:
# Production builds (default) - fast, clean pipelines
lhp generate -e prod
lhp generate -e dev
# Development with data quality testing
lhp generate -e dev --include-tests
# Validate without test actions (default)
lhp validate -e dev
# Validate with test actions
lhp validate -e dev --include-tests
Flag Behavior:
Both commands respect --include-tests for per-flowgroup processing.
- Without flag: test actions are skipped during per-flowgroup processing in both validation and generation
- With flag: test actions are included; they are validated for configuration errors and generated as temporary DLT tables
Examples:
# Skip tests for faster CI/CD builds
lhp generate -e prod --force --dry-run
# Include tests for comprehensive validation
lhp generate -e dev --include-tests --force
# Preview test generation without writing files
lhp generate -e dev --include-tests --dry-run
Note
Test actions generate temporary DLT tables that persist test results for inspection and debugging while being automatically cleaned up when the pipeline completes.
Tip
When test_reporting is configured in lhp.yaml, the --include-tests
flag also generates a test reporting event hook that publishes DQ results to
external systems. See Test Result Reporting (Publishing).