Project YAML

project.yaml describes how a group of ContractForge contracts is delivered as one ingestion project. It is not an ingestion contract and it must not contain dataset semantics such as source columns, write mode, quality rules or access policies.

Use it for repository-level concerns:

  • which environments exist;
  • where reusable connection YAMLs live;
  • how contract steps depend on each other;
  • how a project is scheduled;
  • which platform artifacts should be deployed;
  • which validation commands and evidence checks prove the delivery.

The goal is minimum platform drift: portable project fields stay at the top level; adapter differences live under explicit adapter blocks.

Canonical shape

name: supabase_jdbc_medallion
description: Real Supabase/PostgreSQL medallion ingestion on Databricks and AWS.

source_system:
  name: supabase_inventory_demo
  type: postgresql

environments:
  databricks: environments/databricks.environment.yaml
  aws: environments/aws.environment.yaml

connections:
  supabase_postgres: connections/supabase.yaml

deployment:
  databricks:
    bundle_name: contractforge_supabase_jdbc_medallion
    job_key: supabase_jdbc_medallion
    job_name: contractforge_supabase_jdbc_medallion
    workspace_root_path: /Workspace/Shared/contractforge-examples/Supabase_JDBC_Medallion
  aws:
    state_machine_name: contractforge_supabase_jdbc_medallion

schedule:
  cron: "0 6 * * *"
  timezone: America/Sao_Paulo
  enabled: false
  max_concurrent_runs: 1
  queue: true
  adapters:
    databricks:
      pause_status: PAUSED
      tasks:
        bronze_supabase_products:
          task_key: bronze_products
    aws:
      state: DISABLED

execution_order:
  - name: bronze_supabase_products
    layer: bronze
    depends_on: []
    contracts:
      databricks: contracts/databricks/bronze/bronze_supabase_products/bronze_supabase_products.ingestion.yaml
      aws: contracts/aws/bronze/bronze_supabase_products/bronze_supabase_products.ingestion.yaml

  - name: silver_supabase_products
    layer: silver
    depends_on: [bronze_supabase_products]
    contracts:
      databricks: contracts/databricks/silver/silver_supabase_products/silver_supabase_products.ingestion.yaml
      aws: contracts/aws/silver/silver_supabase_products/silver_supabase_products.ingestion.yaml

Field reference

| Field | Required | Owner | Purpose |
|---|---|---|---|
| name | yes | core | Stable project id. Used for default job, bundle, state machine and schedule names. |
| environments | yes for deploy | core | Maps environment keys to *.environment.yaml files. |
| connections | no | core | Named reusable connection YAMLs. Ingestion contracts still use source.connection_path. |
| deployment | no | adapters | Adapter deployment metadata only. |
| schedule | no | core plus adapters | Core-owned schedule intent plus adapter-specific scheduler overrides. |
| execution_order | yes for projects | core | Ordered contract steps and dependencies. |
| validation | no | adapters/tooling | Test, deploy and audit hints used by adapter CLIs and examples. |
| portability | no | documentation/tooling | Explains intentional contract differences between platforms. |

Environments

environments:
  databricks: environments/databricks.environment.yaml
  aws: environments/aws.environment.yaml

The environment file chooses adapter, evidence location and runtime/deployment parameters. It must not contain ingestion semantics.

Reusable connection YAML

connections is a project inventory of shared connector files:

connections:
  supabase_postgres: connections/supabase.yaml

The actual inheritance happens in the ingestion contract:

source:
  type: connection
  connection_path: project://connections/supabase.yaml
  table: public.products
  read:
    partition_column: product_id
    lower_bound: 1
    upper_bound: 100000
    num_partitions: 8

Connection file:

type: connector
connector: postgres
system: supabase_inventory_demo
options:
  url: "{{ secret:supabase/jdbc_url }}"
  driver: org.postgresql.Driver
auth:
  type: username_password
  username: "{{ secret:supabase/user }}"
  password: "{{ secret:supabase/password }}"
read:
  fetchsize: 10000

The core bundle loader resolves the connection file before adapter planning and deep-merges it with dataset-specific source fields. The connection YAML provides defaults; the ingestion source block overrides them. This includes nested fields such as read.fetchsize, read.num_partitions or options.driver.
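The merge rule described above (connection YAML supplies defaults, the ingestion source block wins, nested mappings are merged key by key) can be illustrated with a minimal sketch. This is not the ContractForge loader; `deep_merge` is a hypothetical helper, and the dictionaries are a reduced version of the YAML shown earlier with secrets omitted.

```python
from copy import deepcopy
from typing import Any, Mapping


def deep_merge(defaults: Mapping[str, Any], overrides: Mapping[str, Any]) -> dict:
    """Return a new dict where `overrides` wins and nested mappings merge key by key."""
    merged = deepcopy(dict(defaults))
    for key, value in overrides.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are mappings: recurse so sibling keys from defaults survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or type mismatches: the override replaces the default.
            merged[key] = deepcopy(value)
    return merged


# Defaults from the connection file (subset, secrets omitted).
connection_defaults = {
    "options": {"driver": "org.postgresql.Driver"},
    "read": {"fetchsize": 10000},
}

# Dataset-specific fields from the ingestion contract's source block (subset).
source_block = {
    "table": "public.products",
    "read": {"partition_column": "product_id", "num_partitions": 8},
}

resolved_source = deep_merge(connection_defaults, source_block)
```

In this sketch `resolved_source["read"]` keeps `fetchsize` from the connection file and gains `partition_column` and `num_partitions` from the contract; a contract that sets `read.fetchsize` itself would replace the default.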

For the complete merge behavior, examples and path safety rules, see Connection YAML.

Use project://connections/... for centralized project connections. Use same-bundle relative paths only when the connection file lives under the ingestion bundle directory. Absolute paths and .. traversal are rejected by the loader.
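The path rules above can be sketched as a small resolver. This is an illustration under assumptions, not the actual loader: `resolve_connection_path`, `project_root` and `bundle_dir` are hypothetical names, and only POSIX-style paths are considered.

```python
from pathlib import PurePosixPath

PROJECT_SCHEME = "project://"


def resolve_connection_path(connection_path: str, project_root: str, bundle_dir: str) -> PurePosixPath:
    """Resolve a connection_path against the project root or the ingestion bundle directory."""
    if connection_path.startswith(PROJECT_SCHEME):
        # Centralized project connection: relative to the project root.
        base = PurePosixPath(project_root)
        relative = PurePosixPath(connection_path[len(PROJECT_SCHEME):])
    else:
        # Same-bundle relative path: relative to the ingestion bundle directory.
        base = PurePosixPath(bundle_dir)
        relative = PurePosixPath(connection_path)

    if relative.is_absolute():
        raise ValueError(f"absolute connection paths are rejected: {connection_path}")
    if ".." in relative.parts:
        raise ValueError(f"'..' traversal is rejected: {connection_path}")
    return base / relative


resolved = resolve_connection_path(
    "project://connections/supabase.yaml",
    project_root="/repo/project",            # hypothetical location
    bundle_dir="/repo/project/contracts/x",  # hypothetical location
)
```

Checking `parts` rather than searching the raw string avoids rejecting legitimate file names that merely contain two dots.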

Deployment

deployment:
  databricks:
    bundle_name: contractforge_supabase_jdbc_medallion
    job_key: supabase_jdbc_medallion
    job_name: contractforge_supabase_jdbc_medallion
    workspace_root_path: /Workspace/Shared/contractforge-examples/Supabase_JDBC_Medallion
  aws:
    state_machine_name: contractforge_supabase_jdbc_medallion

Deployment blocks may name native jobs, bundles, state machines, workspace paths or artifact roots. They must not redefine write modes, source semantics, quality rules or governance policy.

Schedule

schedule:
  cron: "0 6 * * *"
  timezone: America/Sao_Paulo
  enabled: false
  max_concurrent_runs: 1
  queue: true
  adapters:
    databricks:
      pause_status: PAUSED
    aws:
      state: DISABLED

Core-owned fields:

| Field | Meaning |
|---|---|
| cron | Standard five-field cron: minute, hour, day-of-month, month, day-of-week. |
| timezone | IANA timezone name, for example America/Sao_Paulo. |
| enabled | Portable intent. false means deploy the schedule paused/disabled. |
| max_concurrent_runs | Portable concurrency intent. Adapter support varies. |
| queue | Portable queueing intent. Adapter support varies. |

Adapter translations:

| Adapter | Translation |
|---|---|
| Databricks | cron: "0 6 * * *" -> Jobs Quartz cron 0 0 6 * * ?; timezone -> timezone_id; enabled: false or pause_status: PAUSED -> paused schedule. |
| AWS | cron: "0 6 * * *" -> EventBridge Scheduler cron(0 6 * * ? *); timezone -> ScheduleExpressionTimezone; enabled: false or state: DISABLED -> disabled schedule. |

Use schedule.adapters.<adapter> only for platform-specific overrides, such as AWS flexible_time_window or Databricks pause_status.
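To make the adapter translations concrete, here is a minimal sketch of how a portable five-field cron could be rewritten for each scheduler, and how the paused state could be derived. The function names are hypothetical and this is not the adapters' code. The sketch only handles the simple case where at most one of the two day fields is restricted, and it refuses numeric day-of-week values because their numbering differs between standard cron and the target dialects. Treating a missing `enabled` as true is an assumption.

```python
def _split_cron(expr: str) -> list[str]:
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected five cron fields, got {len(fields)}: {expr!r}")
    return fields


def _day_fields(day_of_month: str, day_of_week: str) -> tuple[str, str]:
    # Quartz and EventBridge both expect '?' in one of the two day fields.
    if day_of_week == "*":
        return day_of_month, "?"
    if day_of_month == "*":
        if any(ch.isdigit() for ch in day_of_week):
            raise ValueError("numeric day-of-week needs renumbering; not handled in this sketch")
        return "?", day_of_week
    raise ValueError("restricting both day fields is not handled in this sketch")


def to_quartz(expr: str) -> str:
    """Five-field cron -> Quartz: seconds field prepended, '?' in one day field."""
    minute, hour, dom, month, dow = _split_cron(expr)
    dom, dow = _day_fields(dom, dow)
    return f"0 {minute} {hour} {dom} {month} {dow}"


def to_eventbridge(expr: str) -> str:
    """Five-field cron -> EventBridge: year field appended, '?' in one day field."""
    minute, hour, dom, month, dow = _split_cron(expr)
    dom, dow = _day_fields(dom, dow)
    return f"cron({minute} {hour} {dom} {month} {dow} *)"


def is_paused(schedule: dict, adapter: str) -> bool:
    """Paused if the portable intent is disabled or the adapter override says so."""
    overrides = schedule.get("adapters", {}).get(adapter, {})
    if not schedule.get("enabled", True):  # assumption: a missing key means enabled
        return True
    return overrides.get("pause_status") == "PAUSED" or overrides.get("state") == "DISABLED"


quartz = to_quartz("0 6 * * *")            # "0 0 6 * * ?"
eventbridge = to_eventbridge("0 6 * * *")  # "cron(0 6 * * ? *)"
```

Both results match the translations listed in the table above for the example schedule.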

Execution order

execution_order:
  - name: bronze_orders
    layer: bronze
    depends_on: []
    contracts:
      databricks: contracts/databricks/bronze/bronze_orders/bronze_orders.ingestion.yaml
      aws: contracts/aws/bronze/bronze_orders/bronze_orders.ingestion.yaml

  - name: silver_orders
    layer: silver
    depends_on: [bronze_orders]
    contracts:
      databricks: contracts/databricks/silver/silver_orders/silver_orders.ingestion.yaml
      aws: contracts/aws/silver/silver_orders/silver_orders.ingestion.yaml

Databricks renders steps as tasks in one Asset Bundle job. AWS renders steps as Glue jobs and can orchestrate them with Step Functions. Future adapters should consume the same DAG before adding platform-specific deployment details.
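Because every adapter consumes the same DAG, it can help to see how `depends_on` turns into a run order. The sketch below uses Python's standard `graphlib` module; `resolve_step_order` is a hypothetical helper, not part of ContractForge, and the step dictionaries mirror the YAML above with the `contracts` mapping omitted.

```python
from graphlib import CycleError, TopologicalSorter


def resolve_step_order(execution_order: list[dict]) -> list[str]:
    """Validate depends_on references and return step names in a runnable order."""
    names = [step["name"] for step in execution_order]
    if len(set(names)) != len(names):
        raise ValueError("execution_order contains duplicate step names")

    graph: dict[str, set[str]] = {}
    for step in execution_order:
        deps = step.get("depends_on", [])
        unknown = [dep for dep in deps if dep not in names]
        if unknown:
            raise ValueError(f"step {step['name']!r} depends on unknown steps: {unknown}")
        graph[step["name"]] = set(deps)

    try:
        # static_order yields each step only after all of its dependencies.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"execution_order contains a cycle: {exc.args[1]}") from exc


steps = [
    {"name": "bronze_orders", "layer": "bronze", "depends_on": []},
    {"name": "silver_orders", "layer": "silver", "depends_on": ["bronze_orders"]},
]
order = resolve_step_order(steps)  # ["bronze_orders", "silver_orders"]
```

An adapter could map the same graph to task dependencies or state-machine transitions instead of a flat list; the validation of unknown names and cycles would stay the same.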

Logical table references

Use logical references when a downstream contract reads a table produced by an earlier ContractForge step:

source:
  type: table
  ref: bronze.b_products_jdbc

or in SQL:

FROM {{ table_ref:silver.s_product_tags }}

The core preserves the logical layer.table reference. Adapters render the platform-qualified name.
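As an illustration of that split, the sketch below keeps the logical `layer.table` reference in the SQL text and lets a per-adapter callback produce the qualified name. The function names and the catalog and database names (`main`, `glue_catalog`, the `_db` suffix) are invented for the example; the real adapters may qualify names differently.

```python
import re
from typing import Callable

# Matches {{ table_ref:<layer>.<table> }} with optional inner whitespace.
TABLE_REF = re.compile(r"\{\{\s*table_ref:([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*\}\}")


def render_table_refs(sql: str, qualify: Callable[[str, str], str]) -> str:
    """Replace every logical table_ref placeholder with a platform-qualified name."""
    return TABLE_REF.sub(lambda match: qualify(match.group(1), match.group(2)), sql)


def databricks_name(layer: str, table: str) -> str:
    # Hypothetical three-level name; "main" is an assumed catalog.
    return f"main.{layer}.{table}"


def aws_name(layer: str, table: str) -> str:
    # Hypothetical catalog and database naming for illustration only.
    return f"glue_catalog.{layer}_db.{table}"


sql = "SELECT * FROM {{ table_ref:silver.s_product_tags }}"
databricks_sql = render_table_refs(sql, databricks_name)
aws_sql = render_table_refs(sql, aws_name)
```

The contract text never changes between platforms; only the `qualify` callback does.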

Validation and portability metadata

validation stores test and presentation hints. portability documents intentional differences when platform contracts cannot be byte-for-byte identical. Neither section authorizes semantic downgrades.

validation:
  databricks:
    bundle: databricks.yml
    job_name: contractforge_supabase_jdbc_medallion
  aws:
    artifact_bucket_parameter: CONTRACTFORGE_AWS_ARTIFACT_BUCKET

portability:
  invariant_contract_intent:
    - source.type
    - target.layer
    - mode
    - schema_policy
    - quality_rules
  platform_bindings:
    databricks:
      table_format: delta
    aws:
      table_format: iceberg
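One way tooling might use `invariant_contract_intent` is to compare the listed dotted paths across the per-platform contracts for a step and report any that differ. The sketch below assumes that reading of the field; `invariant_mismatches` is a hypothetical helper, and the contract dictionaries, including the `append` mode and the `format` key, are made-up sample values.

```python
from typing import Any

_MISSING = object()  # sentinel so "absent on every platform" still compares equal


def get_path(doc: dict, dotted: str) -> Any:
    """Look up a dotted path such as 'target.layer'; return _MISSING if absent."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def invariant_mismatches(contracts: dict[str, dict], invariant_paths: list[str]) -> list[str]:
    """Return the invariant paths whose values differ between platform contracts."""
    mismatched = []
    for path in invariant_paths:
        values = [get_path(contract, path) for contract in contracts.values()]
        if any(value != values[0] for value in values[1:]):
            mismatched.append(path)
    return mismatched


invariants = ["source.type", "target.layer", "mode"]
contracts = {
    # Hypothetical, heavily reduced contracts; only the table format differs.
    "databricks": {"source": {"type": "connection"}, "target": {"layer": "bronze", "format": "delta"}, "mode": "append"},
    "aws": {"source": {"type": "connection"}, "target": {"layer": "bronze", "format": "iceberg"}, "mode": "append"},
}
mismatches = invariant_mismatches(contracts, invariants)  # []
```

The differing table format is not reported because it is a platform binding, not part of the invariant list.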

Commands

Core validation:

uv run contractforge validate-project examples/real-world/supabase-jdbc-medallion

Databricks (the trailing backtick is PowerShell line continuation; in bash or zsh use a backslash instead):

uv run contractforge-databricks render-project-bundle examples/real-world/supabase-jdbc-medallion/project.yaml `
  --output databricks.yml `
  --force

AWS:

uv run contractforge-aws deploy-project examples/real-world/supabase-jdbc-medallion/project.yaml `
  --dry-run `
  --render-orchestration `
  --summary-only

Anti-patterns

  • Do not put source, target, mode, quality rules or access policy in project.yaml.
  • Do not put one cron under schedule.adapters.aws and another under schedule.adapters.databricks unless the difference is intentional and documented.
  • Do not place secrets in project.yaml; use environment contracts and secret references.
  • Do not make the core aware of Databricks Jobs, AWS Step Functions, Fabric pipeline JSON or Snowflake tasks. The core owns project intent; adapters own native artifacts.
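Some of these anti-patterns are mechanical enough to lint. The sketch below is a hypothetical check, not the behavior of `validate-project`: it flags top-level keys outside the set that appears on this page and flags a `cron` placed under `schedule.adapters.<adapter>`. It works on an already-parsed mapping so it needs no YAML library.

```python
# Top-level keys that appear in the canonical shape and field reference on this page.
ALLOWED_TOP_LEVEL = {
    "name", "description", "source_system", "environments", "connections",
    "deployment", "schedule", "execution_order", "validation", "portability",
}


def lint_project(project: dict) -> list[str]:
    """Return human-readable findings for a parsed project.yaml mapping."""
    findings = []

    # Dataset semantics such as source, target or mode belong in ingestion contracts.
    for key in sorted(set(project) - ALLOWED_TOP_LEVEL):
        findings.append(f"unexpected top-level key: {key}")

    # A per-adapter cron should be an intentional, documented exception.
    adapters = project.get("schedule", {}).get("adapters", {})
    for adapter, overrides in sorted(adapters.items()):
        if "cron" in overrides:
            findings.append(f"schedule.adapters.{adapter}.cron overrides the portable cron")

    return findings


findings = lint_project({
    "name": "supabase_jdbc_medallion",
    "mode": "append",  # misplaced dataset semantics (sample value)
    "schedule": {"cron": "0 6 * * *", "adapters": {"aws": {"cron": "0 7 * * *"}}},
})
```

For the sample input, the sketch reports the misplaced `mode` key and the AWS-specific cron.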