ContractForge overview

ContractForge is a multi-runtime, contract-first ingestion platform. It defines ingestion intent in a platform-neutral semantic core, evaluates that intent against adapter capabilities, and lets each adapter render or execute native runtime artifacts.

The product name is ContractForge. The packages contractforge-core, contractforge-databricks, contractforge-aws and any future adapter packages are delivery boundaries: the core owns contract semantics, while adapters own platform behavior.

Product model

ContractForge is useful when a team needs the same governed ingestion standard across client or enterprise runtimes without rewriting the framework for each platform.

What stays portable

Portable semantics belong in the contract and the core:

  • source intent, target intent and naming;
  • append, overwrite, merge/upsert and SCD intent;
  • schema policy;
  • quality rules;
  • transformations and shape operations that are platform-neutral;
  • annotations, ownership, operations and access intent;
  • evidence requirements.

Platform-specific behavior belongs in adapters. For Databricks this includes Delta implementation details, Unity Catalog SQL, Auto Loader, Lakeflow, Jobs, Asset Bundles and runtime execution helpers. For AWS this includes Glue Spark scripts, Iceberg table setup, Glue job definitions, S3 artifact publication, Lake Formation review/apply helpers and AWS runtime APIs.

Honest portability

ContractForge does not promise that every contract runs everywhere. The planner returns an explicit result:

Status | Meaning
SUPPORTED | The adapter can preserve the contract semantics.
SUPPORTED_WITH_WARNINGS | The adapter can run the plan, but there are documented differences or operational caveats.
REVIEW_REQUIRED | The adapter found a possible mapping, but the semantic or operational tradeoff must be accepted before deployment.
UNSUPPORTED | The adapter cannot preserve the requested behavior.

No adapter should silently downgrade historical mode to append, merge to overwrite, row filters to comments, or evidence to best-effort logs.
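
The no-silent-downgrade rule can be illustrated with a minimal sketch. The four status names come from the table above; everything else here (PlanStatus, PlanResult, evaluate, the feature keys, and the rule that the most restrictive status wins) is a hypothetical illustration, not the ContractForge planner API.

```python
from dataclasses import dataclass, field
from enum import Enum


class PlanStatus(Enum):
    SUPPORTED = "SUPPORTED"
    SUPPORTED_WITH_WARNINGS = "SUPPORTED_WITH_WARNINGS"
    REVIEW_REQUIRED = "REVIEW_REQUIRED"
    UNSUPPORTED = "UNSUPPORTED"


# Assumed ordering, from least to most restrictive.
_SEVERITY = [
    PlanStatus.SUPPORTED,
    PlanStatus.SUPPORTED_WITH_WARNINGS,
    PlanStatus.REVIEW_REQUIRED,
    PlanStatus.UNSUPPORTED,
]


@dataclass
class PlanResult:
    status: PlanStatus
    diagnostics: list[str] = field(default_factory=list)


def evaluate(requested: set[str], capabilities: dict[str, PlanStatus]) -> PlanResult:
    """Return the most restrictive status across all requested features."""
    worst = PlanStatus.SUPPORTED
    diagnostics = []
    for feature in sorted(requested):
        # A feature the adapter never declared is reported as UNSUPPORTED
        # instead of being mapped to a weaker behavior.
        status = capabilities.get(feature, PlanStatus.UNSUPPORTED)
        if status is not PlanStatus.SUPPORTED:
            diagnostics.append(f"{feature}: {status.value}")
        if _SEVERITY.index(status) > _SEVERITY.index(worst):
            worst = status
    return PlanResult(worst, diagnostics)


# Hypothetical capability declaration for one adapter.
caps = {
    "write_mode.merge": PlanStatus.SUPPORTED,
    "access.row_filters": PlanStatus.REVIEW_REQUIRED,
}
result = evaluate({"write_mode.merge", "access.row_filters"}, caps)
print(result.status.value, result.diagnostics)
```

In this sketch a contract that asks for an undeclared feature yields UNSUPPORTED with a diagnostic naming the feature, so the caller sees the gap before anything is deployed.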

Contract sections

  • ingestion: Source, target, write mode, schema policy, quality, transform, shape, watermarks and execution intent.
  • annotations: Descriptions, aliases, tags, PII classification, lifecycle metadata and catalog-facing documentation.
  • operations: Owner, support group, criticality, frequency, SLA, runbook and operational metadata.
  • access: Grants, row filters, column masks and drift policy. The adapter decides how the target platform applies them.
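
To make the four sections concrete, here is a rough sketch of a contract held as a plain Python dictionary. Only the section names and the source.ref value bronze.b_products_jdbc appear in this documentation; every other key and value, and the unknown_sections helper, are invented for illustration and may not match the real contract schema.

```python
# Section names as listed above.
CONTRACT_SECTIONS = ("ingestion", "annotations", "operations", "access")

# Hypothetical contract: field names inside each section are assumptions.
contract = {
    "ingestion": {
        "source": {"ref": "bronze.b_products_jdbc"},
        "target": {"name": "silver.s_products"},
        "write_mode": "merge",
    },
    "annotations": {
        "description": "Product master data",
        "tags": ["catalog"],
    },
    "operations": {
        "owner": "data-platform",
        "criticality": "high",
    },
    "access": {
        "row_filters": [{"column": "region", "allowed": ["EU"]}],
    },
}


def unknown_sections(contract: dict) -> list[str]:
    """List top-level keys that are not one of the four contract sections."""
    return sorted(set(contract) - set(CONTRACT_SECTIONS))


print(unknown_sections(contract))
```

Keeping the sections as separate top-level keys is what makes it possible to split annotations and operations into their own contract files while ingestion intent stays in one place.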

Package boundaries

Package | Responsibility
contractforge-core | Contract models, validation, semantic normalization, capability matching, abstract plans, portability diagnostics and neutral evidence models.
contractforge-databricks | Databricks capabilities, Delta/Unity Catalog/Lakeflow/Auto Loader rendering, Databricks runtime execution, Asset Bundles and platform-specific evidence filling.
contractforge-aws | AWS Glue/Iceberg capabilities, S3 artifact publication, Glue job deployment, Lake Formation review/apply helpers and Iceberg evidence tables.
Future adapters | Fabric, Snowflake, GCP or client-specific implementations that depend on the core and declare their own capabilities.

The core does not import Spark, Databricks SDK, boto3, Azure SDK, Fabric SDK or Snowflake clients.
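
One way to keep an import boundary like this honest is a small static check over the core's source files. The sketch below uses only the standard library ast module; the FORBIDDEN set lists top-level module names assumed to correspond to the SDKs named above (the Fabric SDK is left out because its module name is not given here), and forbidden_imports is a hypothetical helper, not part of ContractForge.

```python
import ast

# Assumed top-level module names for the platform SDKs named above.
FORBIDDEN = {"pyspark", "databricks", "boto3", "azure", "snowflake"}


def forbidden_imports(source: str) -> set[str]:
    """Return the forbidden top-level modules imported by a piece of source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z"; relative imports are skipped.
            roots = {node.module.split(".")[0]}
        else:
            continue
        found |= roots & FORBIDDEN
    return found


sample = "import json\nimport boto3\nfrom pyspark.sql import SparkSession\n"
print(sorted(forbidden_imports(sample)))
```

Run against every module in the core package in a unit test, a check of this kind would fail the build as soon as a platform dependency leaks across the boundary.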

Real parity proof

The Supabase JDBC medallion example runs the same ingestion design on Databricks and AWS. It uses a shared reusable connection YAML, logical downstream table refs such as source.ref: bronze.b_products_jdbc, split annotations and operations contracts, quality quarantine, and ContractForge evidence/control tables.

In AWS, contracts are rendered and published to S3 before Glue execution. In Databricks, the adapter executes the split bundle directly through the Databricks runtime. The contract intent stays the same; only the environment and adapter-owned runtime settings change.

Non-goals

ContractForge is not a scheduler, a universal Spark wrapper, a dbt replacement, a GUI product, or a promise that every platform has identical semantics. It is the contract, planning, adapter and evidence layer for repeatable governed ingestion.