ContractForge overview
ContractForge is a multi-runtime, contract-first ingestion platform. It defines ingestion intent in a platform-neutral semantic core, evaluates that intent against adapter capabilities, and lets each adapter render or execute native runtime artifacts.
The product name is ContractForge. The packages contractforge-core, contractforge-databricks, contractforge-aws and future adapter packages are delivery boundaries: the core owns contract semantics, while adapters own platform behavior.
Product model
ContractForge is useful when a team needs the same governed ingestion standard across client or enterprise runtimes without rewriting the framework for each platform.
What stays portable
Portable semantics belong in the contract and the core:
- source intent, target intent and naming;
- append, overwrite, merge/upsert and SCD intent;
- schema policy;
- quality rules;
- transformations and shape operations that are platform-neutral;
- annotations, ownership, operations and access intent;
- evidence requirements.
Platform-specific behavior belongs in adapters. For Databricks this includes Delta implementation details, Unity Catalog SQL, Auto Loader, Lakeflow, Jobs, Asset Bundles and runtime execution helpers. For AWS this includes Glue Spark scripts, Iceberg table setup, Glue job definitions, S3 artifact publication, Lake Formation review/apply helpers and AWS runtime APIs.
Honest portability
ContractForge does not promise that every contract runs everywhere. The planner returns an explicit result:
| Status | Meaning |
|---|---|
| SUPPORTED | The adapter can preserve the contract semantics. |
| SUPPORTED_WITH_WARNINGS | The adapter can run the plan, but there are documented differences or operational caveats. |
| REVIEW_REQUIRED | The adapter found a possible mapping, but the semantic or operational tradeoff must be accepted before deployment. |
| UNSUPPORTED | The adapter cannot preserve the requested behavior. |
No adapter should silently downgrade historical mode to append, merge to overwrite, row filters to comments, or evidence to best-effort logs.
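The four statuses can be read as an ordered scale where the least portable feature decides the outcome for the whole plan. The sketch below illustrates that reading only. `PlanResult`, `plan`, `can_deploy` and the feature names are hypothetical and are not taken from the ContractForge API. A feature the adapter does not declare is treated as UNSUPPORTED instead of being downgraded silently.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class PortabilityStatus(Enum):
    SUPPORTED = "SUPPORTED"
    SUPPORTED_WITH_WARNINGS = "SUPPORTED_WITH_WARNINGS"
    REVIEW_REQUIRED = "REVIEW_REQUIRED"
    UNSUPPORTED = "UNSUPPORTED"


# Ordered from most to least portable; the worst status wins.
_SEVERITY = [
    PortabilityStatus.SUPPORTED,
    PortabilityStatus.SUPPORTED_WITH_WARNINGS,
    PortabilityStatus.REVIEW_REQUIRED,
    PortabilityStatus.UNSUPPORTED,
]


@dataclass
class PlanResult:  # hypothetical result type, for illustration only
    status: PortabilityStatus
    diagnostics: list[str] = field(default_factory=list)


def plan(required_features: set[str],
         adapter_capabilities: dict[str, PortabilityStatus]) -> PlanResult:
    """Return the worst status across all features the contract requires."""
    worst = PortabilityStatus.SUPPORTED
    diagnostics: list[str] = []
    for feature in sorted(required_features):
        # An undeclared feature is UNSUPPORTED, never silently downgraded.
        status = adapter_capabilities.get(feature, PortabilityStatus.UNSUPPORTED)
        if status is not PortabilityStatus.SUPPORTED:
            diagnostics.append(f"{feature}: {status.value}")
        if _SEVERITY.index(status) > _SEVERITY.index(worst):
            worst = status
    return PlanResult(worst, diagnostics)


def can_deploy(result: PlanResult, accepted_review: bool = False) -> bool:
    """REVIEW_REQUIRED needs explicit acceptance; UNSUPPORTED never deploys."""
    if result.status is PortabilityStatus.UNSUPPORTED:
        return False
    if result.status is PortabilityStatus.REVIEW_REQUIRED:
        return accepted_review
    return True
```

With this shape, a contract that asks for an SCD write on an adapter that never declared SCD support yields UNSUPPORTED with a diagnostic line, and the caller cannot deploy it even after accepting a review.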
Contract sections
Ingestion
Source, target, write mode, schema policy, quality, transform, shape, watermarks and execution intent.
Annotations
Descriptions, aliases, tags, PII classification, lifecycle metadata and catalog-facing documentation.
Operations
Owner, support group, criticality, frequency, SLA, runbook and operational metadata.
Access
Grants, row filters, column masks and drift policy. The adapter decides how the target platform applies them.
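As a rough illustration of how the four sections might sit side by side, the sketch below models a contract as a plain dictionary and splits it by section. Every key inside the sections (`write_mode`, `owner`, `row_filters` and so on) is an assumption made for this example, not the ContractForge schema, and `split_contract` is a hypothetical helper. Only the `source.ref` value is borrowed from the Supabase JDBC example.

```python
from __future__ import annotations

# The four contract sections named in this overview.
CONTRACT_SECTIONS = ("ingestion", "annotations", "operations", "access")

# Hypothetical contract; the field names inside each section are illustrative.
example_contract = {
    "ingestion": {
        "source": {"ref": "bronze.b_products_jdbc"},
        "target": {"name": "silver.s_products"},
        "write_mode": "merge",
    },
    "annotations": {"description": "Product master data", "tags": ["catalog"]},
    "operations": {"owner": "data-platform", "criticality": "high"},
    "access": {"row_filters": [{"column": "region", "equals": "EU"}]},
}


def split_contract(contract: dict) -> dict[str, dict]:
    """Return the known sections in a fixed order; reject unknown ones."""
    unknown = sorted(set(contract) - set(CONTRACT_SECTIONS))
    if unknown:
        raise ValueError(f"unknown contract sections: {unknown}")
    return {name: contract[name] for name in CONTRACT_SECTIONS if name in contract}
```

Rejecting unknown top-level keys, instead of ignoring them, keeps a misspelled section name from silently dropping access or operations intent.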
Package boundaries
| Package | Responsibility |
|---|---|
| contractforge-core | Contract models, validation, semantic normalization, capability matching, abstract plans, portability diagnostics and neutral evidence models. |
| contractforge-databricks | Databricks capabilities, Delta/Unity Catalog/Lakeflow/Auto Loader rendering, Databricks runtime execution, Asset Bundles and platform-specific evidence filling. |
| contractforge-aws | AWS Glue/Iceberg capabilities, S3 artifact publication, Glue job deployment, Lake Formation review/apply helpers and Iceberg evidence tables. |
| Future adapters | Fabric, Snowflake, GCP or client-specific implementations that depend on the core and declare their own capabilities. |
The core does not import Spark, Databricks SDK, boto3, Azure SDK, Fabric SDK or Snowflake clients.
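One way to keep an import boundary like this enforceable is a static check over the core's source files. The sketch below is an assumption about how such a check could look, not part of ContractForge. `forbidden_imports` is a hypothetical helper, and the module roots are the usual top-level import names for the listed SDKs. The Fabric SDK is left out because its import name is not given here.

```python
import ast

# Assumed top-level import names for the platform libraries named above.
FORBIDDEN_ROOTS = {"pyspark", "databricks", "boto3", "azure", "snowflake"}


def forbidden_imports(source: str) -> list[str]:
    """Return the absolute imports in `source` whose root is a platform SDK."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            if name.split(".")[0] in FORBIDDEN_ROOTS:
                found.add(name)
    return sorted(found)
```

A test suite could run this over every file in the core package and fail when the returned list is not empty. Because the check parses the source without importing it, it also works in an environment where none of the platform SDKs are installed.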
Real parity proof
The Supabase JDBC medallion example runs the same ingestion design on Databricks and AWS. It uses a shared reusable connection YAML, logical downstream table refs such as source.ref: bronze.b_products_jdbc, split annotations and operations contracts, quality quarantine, and ContractForge evidence/control tables.
In AWS, contracts are rendered and published to S3 before Glue execution. In Databricks, the adapter executes the split bundle directly through the Databricks runtime. The contract intent stays the same; only the environment and adapter-owned runtime settings change.
Non-goals
ContractForge is not a scheduler, a universal Spark wrapper, a dbt replacement, a GUI product, or a promise that every platform has identical semantics. It is the contract, planning, adapter and evidence layer for repeatable governed ingestion.