Reference
CLI and API entry points.
Use this page as the operational entry-point reference for commands, public Python functions, runtime options and common contract fields.
CLI
contractforge validate path/to/table.ingestion.yaml
contractforge validate-project contracts/
contractforge validate-bundle contracts/silver/s_customers
contractforge connectors list
contractforge connectors show rest_api jdbc s3
contractforge connectors doctor rest_api jdbc s3
contractforge templates list
contractforge templates wizard
contractforge maintenance ctrl-retention --catalog main --ctrl-schema ops
contractforge maintenance cost-report --catalog main --ctrl-schema ops
| Command | Use | Typical stage |
|---|---|---|
| validate | Validates one ingestion contract file. | Pull request and local authoring. |
| validate-project | Walks a contracts directory and validates all supported contract files. | CI for a repository of contracts. |
| validate-bundle | Loads split contract files for the same table and validates cross-file consistency. | CI before deployment. |
| connectors list | Lists built-in and registered source connectors. | Discovery and onboarding. |
| connectors show | Displays connector-specific contract shape and runtime notes. | Authoring a source contract. |
| connectors doctor | Performs static capability diagnostics without opening network connections. | Runtime preparation and review. |
| templates list | Lists packaged contract templates. | Starting new tables. |
| templates wizard | Generates a contract scaffold from guided prompts. | Onboarding and standardization. |
| maintenance ctrl-retention | Applies control-table retention. | Scheduled operations. |
| maintenance cost-report | Builds an estimated runtime cost report from control-table durations and user-provided rates. | FinOps reporting. |
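
The validation commands above are typically driven from a CI script. The following is a minimal sketch, assuming the `contractforge` executable is on `PATH` and that it reports failure through a non-zero exit code (an assumption; exit codes are not documented here). The helper names `validation_command` and `run_validation` are hypothetical and not part of the tool.

```python
import shutil
import subprocess
from typing import Optional

# Validation subcommands taken from the command table above.
VALIDATION_SUBCOMMANDS = ("validate", "validate-project", "validate-bundle")


def validation_command(subcommand: str, path: str) -> list[str]:
    """Build the argv list for one validation subcommand (hypothetical helper)."""
    if subcommand not in VALIDATION_SUBCOMMANDS:
        raise ValueError(f"unsupported validation subcommand: {subcommand}")
    return ["contractforge", subcommand, path]


def run_validation(subcommand: str, path: str) -> Optional[int]:
    """Run the command and return its exit code, or None if the CLI is missing."""
    argv = validation_command(subcommand, path)
    if shutil.which(argv[0]) is None:
        return None  # CLI not installed in this environment
    # check=False: the caller decides how to treat a non-zero exit code.
    return subprocess.run(argv, check=False).returncode
```

A CI step could call `run_validation("validate-project", "contracts/")` and fail the build on any result other than `0`.
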
Python API
| Function | Use | Notes |
|---|---|---|
| ingest() | Direct Python invocation with kwargs, connector dictionaries or a DataFrame source. | Notebook-friendly API. Builds an immutable plan and executes it. |
| ingest_plan() | Execute a prepared IngestionPlan. | Use when the application constructs or validates plans before execution. |
| build_plan_from_kwargs() | Normalize kwargs into an IngestionPlan. | Useful for tools that need validation without execution. |
| load_contract_bundle() | Load split YAML contracts. | Reads *.ingestion.yaml, *.annotations.yaml, *.operations.yaml and *.access.yaml. |
| ingest_bundle() | Execute an ingestion bundle with governance context. | Recommended entry point for contract repositories. |
| apply_annotations_bundle() | Apply annotations without re-running ingestion. | Useful when metadata changes independently from data movement. |
| apply_access_bundle() | Apply the access contract through the dedicated governance path. | Use from a principal with security permissions. |
| governance_check() | Validate governance contracts and report drift/status. | Use in CI or privileged deployment workflows. |
| register_source_resolver() | Add custom source connectors. | Use for organization-specific source systems. |
| register_write_mode() | Add custom write modes when built-in modes are not enough. | Keep write-mode extensions rare and tested. |
| register_quality_rule() | Add custom quality rules. | Use when built-in expressions are not enough. |
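
To make the split-contract layout concrete, here is a minimal sketch that groups the files in a bundle directory by the four suffixes `load_contract_bundle()` is described as reading. It is plain `pathlib` code, not the library's implementation. The function name `discover_bundle_files` and the one-directory-per-table layout are assumptions.

```python
from pathlib import Path

# Suffixes named in the load_contract_bundle() row above.
BUNDLE_SUFFIXES = (
    ".ingestion.yaml",
    ".annotations.yaml",
    ".operations.yaml",
    ".access.yaml",
)


def discover_bundle_files(bundle_dir: str) -> dict[str, list[str]]:
    """Group file names in bundle_dir by contract suffix (hypothetical helper)."""
    found: dict[str, list[str]] = {suffix: [] for suffix in BUNDLE_SUFFIXES}
    for path in sorted(Path(bundle_dir).iterdir()):
        if not path.is_file():
            continue
        for suffix in BUNDLE_SUFFIXES:
            if path.name.endswith(suffix):
                found[suffix].append(path.name)
    return found
```

A listing like this can help spot a missing or duplicated contract file before running `contractforge validate-bundle` on the same directory.
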
Runtime options
| Option | Default | Use |
|---|---|---|
| raise_on_failure | true | Raise ContractForgeExecutionError after failed runs are logged. |
| dry_run | false | Validate contract, source shape and plan without writing target/control side effects. |
| explain_mode | false | Capture Spark explain plans. Best kept to development or CI because it can add overhead. |
| openlineage_enabled | config-dependent | Persist OpenLineage-compatible events for forwarding or audit. |
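
The defaults in the table can be captured in a small container when a job needs to override one or two options. This is a sketch only: `RuntimeOptions` is a hypothetical class, not part of the documented API, and passing these names as keyword arguments to `ingest()` is an assumption. `openlineage_enabled` is left as `None` to stand for "config-dependent".

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class RuntimeOptions:
    """Hypothetical holder mirroring the documented runtime defaults."""

    raise_on_failure: bool = True
    dry_run: bool = False
    explain_mode: bool = False
    openlineage_enabled: Optional[bool] = None  # None = defer to configuration

    def as_kwargs(self) -> dict[str, Any]:
        """Return the options as a dict, dropping those left to configuration."""
        return {k: v for k, v in asdict(self).items() if v is not None}


# A dry run that keeps the other documented defaults.
dry_run_options = RuntimeOptions(dry_run=True)
```

Keeping the container frozen means a set of options cannot be mutated between building a plan and executing it.
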
Common contract fields
| Field | Type | Meaning |
|---|---|---|
| source | object/DataFrame/string | Connector, table, SQL, path, API or DataFrame source definition. |
| target | object | Physical catalog/schema/table. target.schema can differ from logical layer. |
| catalog, target_schema, target_table | string | Python-friendly equivalents of the target block. |
| layer | string | Logical classification for presets, observability and conventions. It no longer has to match the physical schema. |
| mode | enum | Write semantics: append, overwrite, upsert, hash diff, SCD2 or snapshot. |
| select_columns | list | Keep only selected source columns before transformations. |
| column_mapping | map | Rename source columns before technical columns are added. |
| transform | object | Structural normalization and deduplication before schema/quality/write. |
| filter_expression | string | Spark SQL filter applied to the source. Do not use with complete snapshot soft delete. |
| watermark_columns | list | Incremental filter columns persisted in state. |
| merge_keys, hash_keys | list | Business keys used by MERGE and hash-diff modes. |
| dedup_order_expr | string | SQL ORDER BY fragment used to choose the latest row per key. |
| cluster_columns, zorder_columns | list | Physical optimization hints for Databricks/Delta layout. |
| delta_properties | map | Delta table properties, such as CDF and retention settings. |
| schema_policy | enum | Controls target schema handling and additive evolution. |
| allow_type_widening | boolean | Allows safe type widening when schema policy permits it. |
| quality_rules | object | Rules with fail, warn, quarantine or abort semantics. |
| on_quality_fail | enum | Legacy/global quality behavior when rule-level severity is not enough. |
| execution | object | Backfill, explicit windows and catchup configuration. |
| idempotency_key, idempotency_policy | string | Duplicate-run handling. |
| retry_attempts, retry_backoff_seconds | integer | Plan-level retry policy for transient retryable failures. |
| annotations, operations, access | object | Inline governance contracts for Python-driven usage. |
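
The interaction between `merge_keys` and `dedup_order_expr` can be illustrated outside Spark. The sketch below is plain Python over dictionaries, not the library's deduplication code. For each business key it keeps the row with the greatest value in one ordering column, which corresponds to an ORDER BY fragment such as `updated_at DESC` (a hypothetical example value). The function name, column names and rows are invented for illustration.

```python
from typing import Any

Row = dict[str, Any]


def latest_row_per_key(rows: list[Row], merge_keys: list[str], order_column: str) -> list[Row]:
    """Keep one row per business key: the one with the greatest order_column value."""
    latest: dict[tuple, Row] = {}
    for row in rows:
        key = tuple(row[k] for k in merge_keys)
        current = latest.get(key)
        # Strictly greater: on ties, the first row seen for the key is kept.
        if current is None or row[order_column] > current[order_column]:
            latest[key] = row
    return list(latest.values())


# Hypothetical source rows with a duplicate business key.
rows = [
    {"customer_id": 1, "email": "old@example.com", "updated_at": "2024-01-01"},
    {"customer_id": 2, "email": "b@example.com", "updated_at": "2024-01-05"},
    {"customer_id": 1, "email": "new@example.com", "updated_at": "2024-02-01"},
]
deduplicated = latest_row_per_key(rows, ["customer_id"], "updated_at")
```

Here `deduplicated` holds two rows, and the one for `customer_id` 1 carries `new@example.com`. Ties on the ordering column are resolved arbitrarily in this sketch, which is one reason to choose an ordering expression that is unique per key.
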
Quality rule fields
| Field | Use | Quarantine behavior |
|---|---|---|
| required_columns | Ensure fields exist in the source schema. | Set-level rule; fails/aborts rather than quarantining individual rows. |
| not_null | Reject nulls in required business fields. | Row-level and quarantineable. |
| unique_key | Ensure a key is unique in the prepared source. | Set-level rule; duplicate ambiguity usually aborts writes. |
| accepted_values | Restrict categorical values. | Row-level and quarantineable. |
| min_rows | Reject empty or unexpectedly small source batches. | Set-level rule. |
| max_null_ratio | Reject columns whose null ratio exceeds a threshold. | Aggregate rule; not a row-isolation rule. |
| expressions | Named Spark SQL boolean rules with severity and message. | Can warn, quarantine or abort depending on severity. |
| custom | Custom registered quality rule payloads. | Defined by the registered rule implementation. |
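
The difference between row-level and set-level rules in the table can be shown with a small sketch. It is plain Python over dictionaries and only mirrors the documented behavior. It is not how the library evaluates rules, and the function names are hypothetical. A row-level rule such as `not_null` can split a batch into passed and quarantined rows, while a set-level rule such as `min_rows` can only accept or reject the batch as a whole.

```python
from typing import Any, Optional

Row = dict[str, Any]


def split_not_null(rows: list[Row], column: str) -> tuple[list[Row], list[Row]]:
    """Row-level rule: return (passed, quarantined) based on a null check."""
    passed = [r for r in rows if r.get(column) is not None]
    quarantined = [r for r in rows if r.get(column) is None]
    return passed, quarantined


def check_min_rows(rows: list[Row], minimum: int) -> Optional[str]:
    """Set-level rule: return an error message for the whole batch, or None."""
    if len(rows) < minimum:
        return f"min_rows failed: {len(rows)} < {minimum}"
    return None
```

Because `check_min_rows` has no per-row verdict, there is nothing to quarantine when it fails, which matches the "Set-level rule" entries above.
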
Error handling
from contractforge import ContractForgeExecutionError, ingest_bundle

try:
    result = ingest_bundle("contracts/silver/s_customers")
except ContractForgeExecutionError as exc:
    # The failed run is already recorded in ctrl tables.
    print(exc.result["run_id"])
    print(exc.result["error_message"])
    raise