Project YAML

project.yaml describes how a group of ContractForge contracts is delivered as one ingestion project. It is not an ingestion contract and it must not contain dataset semantics such as source columns, write mode, quality rules or access policies.

Use it for repository-level concerns:

which environments exist;
where reusable connection YAMLs live;
how contract steps depend on each other;
how a project is scheduled;
which platform artifacts should be deployed;
which validation commands and evidence checks prove the delivery.

The goal is minimum platform drift: portable project fields stay at the top level; adapter differences live under explicit adapter blocks.

Canonical shape

name: supabase_jdbc_medallion
description: Real Supabase/PostgreSQL medallion ingestion on Databricks and AWS.

source_system:
  name: supabase_inventory_demo
  type: postgresql

environments:
  databricks: environments/databricks.environment.yaml
  aws: environments/aws.environment.yaml

connections:
  supabase_postgres: connections/supabase.yaml

deployment:
  databricks:
    bundle_name: contractforge_supabase_jdbc_medallion
    job_key: supabase_jdbc_medallion
    job_name: contractforge_supabase_jdbc_medallion
    workspace_root_path: /Workspace/Shared/contractforge-examples/Supabase_JDBC_Medallion
  aws:
    state_machine_name: contractforge_supabase_jdbc_medallion

schedule:
  cron: "0 6 * * *"
  timezone: America/Sao_Paulo
  enabled: false
  max_concurrent_runs: 1
  queue: true
  adapters:
    databricks:
      pause_status: PAUSED
      tasks:
        bronze_supabase_products:
          task_key: bronze_products
    aws:
      state: DISABLED

execution_order:
  - name: bronze_supabase_products
    layer: bronze
    depends_on: []
    contracts:
      databricks: contracts/databricks/bronze/bronze_supabase_products/bronze_supabase_products.ingestion.yaml
      aws: contracts/aws/bronze/bronze_supabase_products/bronze_supabase_products.ingestion.yaml

  - name: silver_supabase_products
    layer: silver
    depends_on: [bronze_supabase_products]
    contracts:
      databricks: contracts/databricks/silver/silver_supabase_products/silver_supabase_products.ingestion.yaml
      aws: contracts/aws/silver/silver_supabase_products/silver_supabase_products.ingestion.yaml

Field reference

Field	Required	Owner	Purpose
`name`	yes	core	Stable project id. Used for default job, bundle, state machine and schedule names.
`environments`	yes for deploy	core	Maps environment keys to `*.environment.yaml` files.
`connections`	no	core	Named reusable connection YAMLs. Ingestion contracts still use `source.connection_path`.
`deployment`	no	adapters	Adapter deployment metadata only.
`schedule`	no	core plus adapters	Core-owned schedule intent plus adapter-specific scheduler overrides.
`execution_order`	yes for projects	core	Ordered contract steps and dependencies.
`validation`	no	adapters/tooling	Test, deploy and audit hints used by adapter CLIs and examples.
`portability`	no	documentation/tooling	Explains intentional contract differences between platforms.

Environments

environments:
  databricks: environments/databricks.environment.yaml
  aws: environments/aws.environment.yaml

Each environment file selects the adapter, the evidence location and the runtime and deployment parameters. It must not contain ingestion semantics.

Reusable connection YAML

connections is a project inventory of shared connector files:

connections:
  supabase_postgres: connections/supabase.yaml

The actual inheritance happens in the ingestion contract:

source:
  type: connection
  connection_path: project://connections/supabase.yaml
  table: public.products
  read:
    partition_column: product_id
    lower_bound: 1
    upper_bound: 100000
    num_partitions: 8

Connection file:

type: connector
connector: postgres
system: supabase_inventory_demo
options:
  url: "{{ secret:supabase/jdbc_url }}"
  driver: org.postgresql.Driver
auth:
  type: username_password
  username: "{{ secret:supabase/user }}"
  password: "{{ secret:supabase/password }}"
read:
  fetchsize: 10000

The core bundle loader resolves the connection file before adapter planning and deep-merges it with dataset-specific source fields. The connection YAML provides defaults; the ingestion source block overrides them. This includes nested fields such as read.fetchsize, read.num_partitions or options.driver.
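
The merge described above can be pictured with a minimal sketch. This is not the ContractForge loader: the `deep_merge` helper is hypothetical, both inputs are assumed to be already-parsed dictionaries, and the `fetchsize: 5000` override is added only to show a nested field being overridden. Replacing lists wholesale is an assumption that the text above does not confirm.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict where `overrides` wins over `defaults`, recursing into nested mappings."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing the whole block.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale (assumption, see lead-in).
            merged[key] = deepcopy(value)
    return merged


# Defaults from the connection YAML (secrets and auth omitted for brevity).
connection = {
    "connector": "postgres",
    "options": {"driver": "org.postgresql.Driver"},
    "read": {"fetchsize": 10000},
}

# Dataset-specific fields from the ingestion contract's source block.
# The fetchsize override is hypothetical, added for illustration.
source = {
    "table": "public.products",
    "read": {"partition_column": "product_id", "num_partitions": 8, "fetchsize": 5000},
}

effective_source = deep_merge(connection, source)
```

In this sketch `effective_source["read"]` carries the contract's `fetchsize` together with its partitioning fields, while `options.driver` is inherited unchanged from the connection file.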

For the complete merge behavior, examples and path safety rules, see Connection YAML.

Use project://connections/... for centralized project connections. Use same-bundle relative paths only when the connection file lives under the ingestion bundle directory. Absolute paths and .. traversal are rejected by the loader.
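
As an illustration of these path rules, the sketch below resolves a `connection_path` and rejects unsafe values. The function name, its arguments and the error messages are hypothetical. It only handles POSIX-style paths, and the real loader may apply further checks.

```python
from pathlib import PurePosixPath

PROJECT_SCHEME = "project://"


def resolve_connection_path(connection_path: str, project_root: str, bundle_dir: str) -> PurePosixPath:
    """Resolve a connection path, rejecting absolute paths and `..` traversal."""
    if connection_path.startswith(PROJECT_SCHEME):
        # Centralized project connection: resolve against the project root.
        base = PurePosixPath(project_root)
        relative = PurePosixPath(connection_path[len(PROJECT_SCHEME):])
    else:
        # Same-bundle relative path: resolve against the ingestion bundle directory.
        base = PurePosixPath(bundle_dir)
        relative = PurePosixPath(connection_path)

    if relative.is_absolute():
        raise ValueError(f"absolute connection paths are rejected: {connection_path}")
    if ".." in relative.parts:
        raise ValueError(f"'..' traversal is rejected: {connection_path}")
    return base / relative
```

With a project root of `/repo`, `project://connections/supabase.yaml` resolves to `/repo/connections/supabase.yaml`, while `../secrets.yaml` and `/etc/passwd` both raise `ValueError`.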

Deployment

deployment:
  databricks:
    bundle_name: contractforge_supabase_jdbc_medallion
    job_key: supabase_jdbc_medallion
    job_name: contractforge_supabase_jdbc_medallion
    workspace_root_path: /Workspace/Shared/contractforge-examples/Supabase_JDBC_Medallion
  aws:
    state_machine_name: contractforge_supabase_jdbc_medallion

Deployment blocks may name native jobs, bundles, state machines, workspace paths or artifact roots. They must not redefine write modes, source semantics, quality rules or governance policy.

Schedule

schedule:
  cron: "0 6 * * *"
  timezone: America/Sao_Paulo
  enabled: false
  max_concurrent_runs: 1
  queue: true
  adapters:
    databricks:
      pause_status: PAUSED
    aws:
      state: DISABLED

Core-owned fields:

Field	Meaning
`cron`	Standard five-field cron: minute, hour, day-of-month, month, day-of-week.
`timezone`	IANA timezone name, for example `America/Sao_Paulo`.
`enabled`	Portable intent. `false` means deploy the schedule paused/disabled.
`max_concurrent_runs`	Portable concurrency intent. Adapter support varies.
`queue`	Portable queueing intent. Adapter support varies.

Adapter translations:

Adapter	Translation
Databricks	`cron: "0 6 * * *"` -> Jobs Quartz cron `0 0 6 * * ?`; timezone -> `timezone_id`; `enabled: false` or `pause_status: PAUSED` -> paused schedule.
AWS	`cron: "0 6 * * *"` -> EventBridge Scheduler `cron(0 6 * * ? *)`; timezone -> `ScheduleExpressionTimezone`; `enabled: false` or `state: DISABLED` -> disabled schedule.

Use schedule.adapters.<adapter> only for platform-specific overrides, such as AWS flexible_time_window or Databricks pause_status.
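
The adapter translations above can be sketched as a few small functions. This is a simplified illustration, not the adapters' code. The function names are hypothetical. Numeric day-of-week values are refused because the platforms number weekdays differently. Treating a missing `enabled` as `true` is an assumption.

```python
def _split_cron(cron: str) -> list[str]:
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected five cron fields, got {len(fields)}: {cron!r}")
    return fields


def _day_fields(day_of_month: str, day_of_week: str) -> tuple[str, str]:
    """Quartz and EventBridge expect `?` in one of the two day fields."""
    if any(ch.isdigit() for ch in day_of_week):
        raise ValueError("numeric day-of-week values differ per platform; not handled in this sketch")
    if day_of_week == "*":
        return day_of_month, "?"
    if day_of_month == "*":
        return "?", day_of_week
    raise ValueError("this sketch does not handle both day fields being restricted")


def to_quartz(cron: str) -> str:
    """Five-field cron -> Quartz: a leading seconds field is added."""
    minute, hour, dom, month, dow = _split_cron(cron)
    dom, dow = _day_fields(dom, dow)
    return f"0 {minute} {hour} {dom} {month} {dow}"


def to_eventbridge(cron: str) -> str:
    """Five-field cron -> EventBridge: a trailing year field is added."""
    minute, hour, dom, month, dow = _split_cron(cron)
    dom, dow = _day_fields(dom, dow)
    return f"cron({minute} {hour} {dom} {month} {dow} *)"


def is_paused(schedule: dict, adapter: str) -> bool:
    """Paused if the portable intent is disabled or the adapter override says so."""
    override = schedule.get("adapters", {}).get(adapter, {})
    return (
        not schedule.get("enabled", True)  # default of True is an assumption
        or override.get("pause_status") == "PAUSED"
        or override.get("state") == "DISABLED"
    )
```

For the schedule shown in this section, `to_quartz("0 6 * * *")` yields `0 0 6 * * ?` and `to_eventbridge("0 6 * * *")` yields `cron(0 6 * * ? *)`.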

Execution order

execution_order:
  - name: bronze_orders
    layer: bronze
    depends_on: []
    contracts:
      databricks: contracts/databricks/bronze/bronze_orders/bronze_orders.ingestion.yaml
      aws: contracts/aws/bronze/bronze_orders/bronze_orders.ingestion.yaml

  - name: silver_orders
    layer: silver
    depends_on: [bronze_orders]
    contracts:
      databricks: contracts/databricks/silver/silver_orders/silver_orders.ingestion.yaml
      aws: contracts/aws/silver/silver_orders/silver_orders.ingestion.yaml

Databricks renders steps as tasks in one Asset Bundle job. AWS renders steps as Glue jobs and can orchestrate them with Step Functions. Future adapters should consume the same DAG before adding platform-specific deployment details.
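
To make the shared DAG concrete, the sketch below turns an already-parsed `execution_order` list into a dependency-respecting step order using the standard library's `graphlib`. The `plan_steps` helper and its validation messages are hypothetical. How the core actually validates and orders steps is not documented here.

```python
from graphlib import CycleError, TopologicalSorter


def plan_steps(execution_order: list[dict]) -> list[str]:
    """Return step names so that every step comes after the steps it depends on."""
    names = [step["name"] for step in execution_order]
    if len(names) != len(set(names)):
        raise ValueError("step names must be unique")

    known = set(names)
    graph = {}
    for step in execution_order:
        deps = step.get("depends_on", [])
        missing = [dep for dep in deps if dep not in known]
        if missing:
            raise ValueError(f"step {step['name']!r} depends on unknown steps: {missing}")
        # TopologicalSorter expects a mapping of node -> predecessors.
        graph[step["name"]] = set(deps)

    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle detected: {exc.args[1]}") from exc


steps = [
    {"name": "silver_orders", "layer": "silver", "depends_on": ["bronze_orders"]},
    {"name": "bronze_orders", "layer": "bronze", "depends_on": []},
]
plan = plan_steps(steps)
```

Here `plan` lists `bronze_orders` before `silver_orders` regardless of the order in which the steps are declared. An adapter could map each name to a native task or job afterwards.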

Logical table references

Use logical references when a downstream contract reads a table produced by an earlier ContractForge step:

source:
  type: table
  ref: bronze.b_products_jdbc

or in SQL:

FROM {{ table_ref:silver.s_product_tags }}

The core preserves the logical layer.table reference. Adapters render the platform-qualified name.
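
A rough sketch of that split: the logical reference stays in the SQL text and a per-adapter callback produces the qualified name. The regular expression, the function names and both naming schemes (`main.<layer>.<table>` and `glue_catalog.<layer>.<table>`) are hypothetical placeholders, not the adapters' real output.

```python
import re

# Matches {{ table_ref:layer.table }} with optional inner whitespace.
TABLE_REF = re.compile(r"\{\{\s*table_ref:([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*\}\}")


def render_table_refs(sql: str, qualify) -> str:
    """Replace every logical reference using an adapter-provided `qualify(layer, table)`."""
    return TABLE_REF.sub(lambda match: qualify(match.group(1), match.group(2)), sql)


def databricks_name(layer: str, table: str) -> str:
    # Hypothetical mapping: a fixed catalog with one schema per layer.
    return f"main.{layer}.{table}"


def aws_name(layer: str, table: str) -> str:
    # Hypothetical mapping: one Glue database per layer.
    return f"glue_catalog.{layer}.{table}"


sql = "SELECT * FROM {{ table_ref:silver.s_product_tags }}"
databricks_sql = render_table_refs(sql, databricks_name)
aws_sql = render_table_refs(sql, aws_name)
```

The same contract text yields a different qualified name per adapter, which is why the contract itself only needs the logical `layer.table` form.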

Validation and portability metadata

validation stores test and presentation hints. portability documents intentional differences when platform contracts cannot be byte-for-byte identical. Neither section authorizes semantic downgrades.

validation:
  databricks:
    bundle: databricks.yml
    job_name: contractforge_supabase_jdbc_medallion
  aws:
    artifact_bucket_parameter: CONTRACTFORGE_AWS_ARTIFACT_BUCKET

portability:
  invariant_contract_intent:
    - source.type
    - target.layer
    - mode
    - schema_policy
    - quality_rules
  platform_bindings:
    databricks:
      table_format: delta
    aws:
      table_format: iceberg

Commands

Core validation:

uv run contractforge validate-project examples/real-world/supabase-jdbc-medallion

Databricks:

uv run contractforge-databricks render-project-bundle examples/real-world/supabase-jdbc-medallion/project.yaml `
  --output databricks.yml `
  --force

AWS:

uv run contractforge-aws deploy-project examples/real-world/supabase-jdbc-medallion/project.yaml `
  --dry-run `
  --render-orchestration `
  --summary-only

Anti-patterns

Do not put source, target, mode, quality rules or access policy in project.yaml.
Do not put one cron under schedule.adapters.aws and another under schedule.adapters.databricks unless the difference is intentional and documented.
Do not place secrets in project.yaml; use environment contracts and secret references.
Do not make the core aware of Databricks Jobs, AWS Step Functions, Fabric pipeline JSON or Snowflake tasks. The core owns project intent; adapters own native artifacts.
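
Two of these anti-patterns lend themselves to a simple lint over the parsed project file. The sketch below is illustrative only: `lint_project` is not a ContractForge command, and the key names in `CONTRACT_ONLY_KEYS` are assumptions about how the contract fields are spelled.

```python
# Keys that belong in ingestion contracts, not in project.yaml (spellings assumed).
CONTRACT_ONLY_KEYS = {"source", "target", "mode", "quality_rules", "access_policy"}


def lint_project(project: dict) -> list[str]:
    """Return human-readable findings for a parsed project.yaml mapping."""
    problems = []

    # Anti-pattern: dataset semantics at the project level.
    for key in sorted(CONTRACT_ONLY_KEYS.intersection(project)):
        problems.append(f"'{key}' belongs in an ingestion contract, not in project.yaml")

    # Anti-pattern: diverging crons under schedule.adapters.<adapter>.
    adapters = project.get("schedule", {}).get("adapters", {})
    crons = {
        name: block["cron"]
        for name, block in adapters.items()
        if isinstance(block, dict) and "cron" in block
    }
    if len(set(crons.values())) > 1:
        problems.append(f"adapters define different crons: {crons}")

    return problems
```

An empty result means neither check fired. It does not prove the project is otherwise valid.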
