Project YAML
project.yaml describes how a group of ContractForge contracts is delivered as
one ingestion project. It is not an ingestion contract and it must not contain
dataset semantics such as source columns, write mode, quality rules or access
policies.
Use it for repository-level concerns:
- which environments exist;
- where reusable connection YAMLs live;
- how contract steps depend on each other;
- how a project is scheduled;
- which platform artifacts should be deployed;
- which validation commands and evidence checks prove the delivery.
The goal is minimum platform drift: portable project fields stay at the top level; adapter differences live under explicit adapter blocks.
Canonical shape
name: supabase_jdbc_medallion
description: Real Supabase/PostgreSQL medallion ingestion on Databricks and AWS.
source_system:
  name: supabase_inventory_demo
  type: postgresql
environments:
  databricks: environments/databricks.environment.yaml
  aws: environments/aws.environment.yaml
connections:
  supabase_postgres: connections/supabase.yaml
deployment:
  databricks:
    bundle_name: contractforge_supabase_jdbc_medallion
    job_key: supabase_jdbc_medallion
    job_name: contractforge_supabase_jdbc_medallion
    workspace_root_path: /Workspace/Shared/contractforge-examples/Supabase_JDBC_Medallion
  aws:
    state_machine_name: contractforge_supabase_jdbc_medallion
schedule:
  cron: "0 6 * * *"
  timezone: America/Sao_Paulo
  enabled: false
  max_concurrent_runs: 1
  queue: true
  adapters:
    databricks:
      pause_status: PAUSED
      tasks:
        bronze_supabase_products:
          task_key: bronze_products
    aws:
      state: DISABLED
execution_order:
  - name: bronze_supabase_products
    layer: bronze
    depends_on: []
    contracts:
      databricks: contracts/databricks/bronze/bronze_supabase_products/bronze_supabase_products.ingestion.yaml
      aws: contracts/aws/bronze/bronze_supabase_products/bronze_supabase_products.ingestion.yaml
  - name: silver_supabase_products
    layer: silver
    depends_on: [bronze_supabase_products]
    contracts:
      databricks: contracts/databricks/silver/silver_supabase_products/silver_supabase_products.ingestion.yaml
      aws: contracts/aws/silver/silver_supabase_products/silver_supabase_products.ingestion.yaml
Field reference
| Field | Required | Owner | Purpose |
|---|---|---|---|
| name | yes | core | Stable project id. Used for default job, bundle, state machine and schedule names. |
| environments | yes for deploy | core | Maps environment keys to *.environment.yaml files. |
| connections | no | core | Named reusable connection YAMLs. Ingestion contracts still use source.connection_path. |
| deployment | no | adapters | Adapter deployment metadata only. |
| schedule | no | core plus adapters | Core-owned schedule intent plus adapter-specific scheduler overrides. |
| execution_order | yes for projects | core | Ordered contract steps and dependencies. |
| validation | no | adapters/tooling | Test, deploy and audit hints used by adapter CLIs and examples. |
| portability | no | documentation/tooling | Explains intentional contract differences between platforms. |
Environments
environments:
  databricks: environments/databricks.environment.yaml
  aws: environments/aws.environment.yaml
The environment file selects the adapter, the evidence location and the runtime/deployment parameters. It must not contain ingestion semantics.
Reusable connection YAML
connections is a project inventory of shared connector files:
connections:
  supabase_postgres: connections/supabase.yaml
The actual inheritance happens in the ingestion contract:
source:
  type: connection
  connection_path: project://connections/supabase.yaml
  table: public.products
  read:
    partition_column: product_id
    lower_bound: 1
    upper_bound: 100000
    num_partitions: 8
Connection file:
type: connector
connector: postgres
system: supabase_inventory_demo
options:
  url: "{{ secret:supabase/jdbc_url }}"
  driver: org.postgresql.Driver
auth:
  type: username_password
  username: "{{ secret:supabase/user }}"
  password: "{{ secret:supabase/password }}"
read:
  fetchsize: 10000
The core bundle loader resolves the connection file before adapter planning and
deep-merges it with dataset-specific source fields. The connection YAML provides
defaults; the ingestion source block overrides them. This includes nested
fields such as read.fetchsize, read.num_partitions or options.driver.
For the complete merge behavior, examples and path safety rules, see Connection YAML.
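The defaults-then-overrides merge described above can be illustrated with a minimal Python sketch. This is not the loader's actual code: `deep_merge` is a hypothetical helper, the dictionaries are trimmed versions of the YAML shown earlier (the `type` and `connection_path` keys are left out), and the `fetchsize: 5000` override and the treatment of non-mapping values (replaced wholesale) are assumptions made for illustration.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict in which `overrides` wins, recursing into nested mappings."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale (assumption for this sketch).
            merged[key] = deepcopy(value)
    return merged


# Defaults from the connection YAML (trimmed).
connection = {
    "connector": "postgres",
    "options": {"driver": "org.postgresql.Driver"},
    "read": {"fetchsize": 10000},
}

# Dataset-specific fields from the ingestion contract's source block (trimmed).
source = {
    "table": "public.products",
    "read": {"fetchsize": 5000, "num_partitions": 8},
}

resolved = deep_merge(connection, source)
print(resolved["read"])
```

In this sketch `read.fetchsize` from the source block replaces the connection default, `read.num_partitions` is added, and `options.driver` is inherited unchanged.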
Use project://connections/... for centralized project connections. Use
same-bundle relative paths only when the connection file lives under the
ingestion bundle directory. Absolute paths and .. traversal are rejected by
the loader.
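As a rough illustration of these path rules, the sketch below resolves a `connection_path` against either the project root or the bundle directory and rejects absolute paths and `..` segments. The function name, its parameters and the error messages are hypothetical, and it only handles POSIX-style paths; the real loader may apply further checks.

```python
from pathlib import PurePosixPath

PROJECT_SCHEME = "project://"


def resolve_connection_path(connection_path: str, project_root: str, bundle_dir: str) -> PurePosixPath:
    """Resolve a connection path, refusing absolute paths and `..` traversal."""
    if connection_path.startswith(PROJECT_SCHEME):
        # Centralized project connection: relative to the project root.
        base = PurePosixPath(project_root)
        relative = PurePosixPath(connection_path[len(PROJECT_SCHEME):])
    else:
        # Same-bundle connection: relative to the ingestion bundle directory.
        base = PurePosixPath(bundle_dir)
        relative = PurePosixPath(connection_path)

    if relative.is_absolute():
        raise ValueError(f"absolute connection paths are rejected: {connection_path}")
    if ".." in relative.parts:
        raise ValueError(f"'..' traversal is rejected: {connection_path}")
    return base / relative


print(resolve_connection_path(
    "project://connections/supabase.yaml",
    project_root="/repo",
    bundle_dir="/repo/contracts/databricks/bronze/bronze_supabase_products",
))
```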
Deployment
deployment:
  databricks:
    bundle_name: contractforge_supabase_jdbc_medallion
    job_key: supabase_jdbc_medallion
    job_name: contractforge_supabase_jdbc_medallion
    workspace_root_path: /Workspace/Shared/contractforge-examples/Supabase_JDBC_Medallion
  aws:
    state_machine_name: contractforge_supabase_jdbc_medallion
Deployment blocks may name native jobs, bundles, state machines, workspace paths or artifact roots. They must not redefine write modes, source semantics, quality rules or governance policy.
Schedule
schedule:
  cron: "0 6 * * *"
  timezone: America/Sao_Paulo
  enabled: false
  max_concurrent_runs: 1
  queue: true
  adapters:
    databricks:
      pause_status: PAUSED
    aws:
      state: DISABLED
Core-owned fields:
| Field | Meaning |
|---|---|
| cron | Standard five-field cron: minute, hour, day-of-month, month, day-of-week. |
| timezone | IANA timezone name, for example America/Sao_Paulo. |
| enabled | Portable intent. false means deploy the schedule paused/disabled. |
| max_concurrent_runs | Portable concurrency intent. Adapter support varies. |
| queue | Portable queueing intent. Adapter support varies. |
Adapter translations:
| Adapter | Translation |
|---|---|
| Databricks | cron: "0 6 * * *" -> Jobs Quartz cron 0 0 6 * * ?; timezone -> timezone_id; enabled: false or pause_status: PAUSED -> paused schedule. |
| AWS | cron: "0 6 * * *" -> EventBridge Scheduler cron(0 6 * * ? *); timezone -> ScheduleExpressionTimezone; enabled: false or state: DISABLED -> disabled schedule. |
Use schedule.adapters.<adapter> only for platform-specific overrides, such as
AWS flexible_time_window or Databricks pause_status.
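The cron translations in the table above can be approximated with a short Python sketch. The function names are hypothetical and this is not the adapters' implementation. It relies on two facts about the target dialects: Quartz adds a leading seconds field, EventBridge adds a trailing year field, and both require `?` in either day-of-month or day-of-week. Numeric day-of-week values are rejected here because their numbering differs between dialects; day names such as MON pass through.

```python
def _parse(cron: str) -> list[str]:
    """Split a standard five-field cron expression."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {cron!r}")
    return fields


def _split_day_fields(day_of_month: str, day_of_week: str) -> tuple[str, str]:
    """Put `?` in whichever day field is unrestricted, as both targets require."""
    if any(ch.isdigit() for ch in day_of_week):
        raise ValueError("numeric day-of-week is not handled in this sketch")
    if day_of_week == "*":
        return day_of_month, "?"
    if day_of_month == "*":
        return "?", day_of_week
    raise ValueError("day-of-month and day-of-week cannot both be restricted")


def to_quartz(cron: str) -> str:
    """Five-field cron -> Quartz: seconds minute hour day-of-month month day-of-week."""
    minute, hour, dom, month, dow = _parse(cron)
    dom, dow = _split_day_fields(dom, dow)
    return f"0 {minute} {hour} {dom} {month} {dow}"


def to_eventbridge(cron: str) -> str:
    """Five-field cron -> EventBridge: cron(minute hour day-of-month month day-of-week year)."""
    minute, hour, dom, month, dow = _parse(cron)
    dom, dow = _split_day_fields(dom, dow)
    return f"cron({minute} {hour} {dom} {month} {dow} *)"


print(to_quartz("0 6 * * *"))       # 0 0 6 * * ?
print(to_eventbridge("0 6 * * *"))  # cron(0 6 * * ? *)
```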
Execution order
execution_order:
  - name: bronze_orders
    layer: bronze
    depends_on: []
    contracts:
      databricks: contracts/databricks/bronze/bronze_orders/bronze_orders.ingestion.yaml
      aws: contracts/aws/bronze/bronze_orders/bronze_orders.ingestion.yaml
  - name: silver_orders
    layer: silver
    depends_on: [bronze_orders]
    contracts:
      databricks: contracts/databricks/silver/silver_orders/silver_orders.ingestion.yaml
      aws: contracts/aws/silver/silver_orders/silver_orders.ingestion.yaml
Databricks renders steps as tasks in one Asset Bundle job. AWS renders steps as Glue jobs and can orchestrate them with Step Functions. Future adapters should consume the same DAG before adding platform-specific deployment details.
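To show how an adapter might consume the same DAG, the sketch below turns `execution_order` into a run order with Python's standard `graphlib` module (Python 3.9+). `plan_steps` is a hypothetical helper, not a ContractForge API, and the steps are the already-parsed equivalent of the YAML above with the `contracts` paths omitted.

```python
from graphlib import TopologicalSorter

# Parsed form of the execution_order block above (contract paths omitted).
execution_order = [
    {"name": "bronze_orders", "layer": "bronze", "depends_on": []},
    {"name": "silver_orders", "layer": "silver", "depends_on": ["bronze_orders"]},
]


def plan_steps(steps: list[dict]) -> list[str]:
    """Return step names in an order that respects every depends_on edge."""
    names = {step["name"] for step in steps}
    graph = {}
    for step in steps:
        unknown = set(step["depends_on"]) - names
        if unknown:
            raise ValueError(f"{step['name']} depends on unknown steps: {sorted(unknown)}")
        graph[step["name"]] = set(step["depends_on"])
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


print(plan_steps(execution_order))  # ['bronze_orders', 'silver_orders']
```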
Logical table references
Use logical references when a downstream contract reads a table produced by an earlier ContractForge step:
source:
  type: table
  ref: bronze.b_products_jdbc
or in SQL:
FROM {{ table_ref:silver.s_product_tags }}
The core preserves the logical layer.table reference. Adapters render the
platform-qualified name.
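A minimal sketch of the adapter-side rendering step, assuming a simple regular-expression substitution. The function names and the qualified-name schemes (`main.<layer>.<table>` and `glue_catalog.<layer>.<table>`) are made up for illustration; real adapters decide their own naming.

```python
import re

# Matches {{ table_ref:layer.table }} with optional whitespace inside the braces.
TABLE_REF = re.compile(r"\{\{\s*table_ref:([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*\}\}")


def render_table_refs(sql: str, qualify) -> str:
    """Replace each logical reference with the name returned by qualify(layer, table)."""
    return TABLE_REF.sub(lambda match: qualify(match.group(1), match.group(2)), sql)


def databricks_name(layer: str, table: str) -> str:
    # Hypothetical catalog.schema.table scheme.
    return f"main.{layer}.{table}"


def aws_name(layer: str, table: str) -> str:
    # Hypothetical catalog.database.table scheme.
    return f"glue_catalog.{layer}.{table}"


sql = "SELECT * FROM {{ table_ref:silver.s_product_tags }}"
print(render_table_refs(sql, databricks_name))
print(render_table_refs(sql, aws_name))
```

Keeping the logical reference in the contract and passing the qualifier in as a function mirrors the split described above: the SQL stays identical across platforms and only the rendering differs.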
Validation and portability metadata
validation stores test and presentation hints. portability documents
intentional differences when platform contracts cannot be byte-for-byte
identical. Neither section authorizes semantic downgrades.
validation:
  databricks:
    bundle: databricks.yml
    job_name: contractforge_supabase_jdbc_medallion
  aws:
    artifact_bucket_parameter: CONTRACTFORGE_AWS_ARTIFACT_BUCKET
portability:
  invariant_contract_intent:
    - source.type
    - target.layer
    - mode
    - schema_policy
    - quality_rules
  platform_bindings:
    databricks:
      table_format: delta
    aws:
      table_format: iceberg
Commands
Core validation:
uv run contractforge validate-project examples/real-world/supabase-jdbc-medallion
Databricks:
uv run contractforge-databricks render-project-bundle examples/real-world/supabase-jdbc-medallion/project.yaml `
--output databricks.yml `
--force
AWS:
uv run contractforge-aws deploy-project examples/real-world/supabase-jdbc-medallion/project.yaml `
--dry-run `
--render-orchestration `
--summary-only
Anti-patterns
- Do not put source, target, mode, quality rules or access policy in project.yaml.
- Do not put one cron under schedule.adapters.aws and another under schedule.adapters.databricks unless the difference is intentional and documented.
- Do not place secrets in project.yaml; use environment contracts and secret references.
- Do not make the core aware of Databricks Jobs, AWS Step Functions, Fabric pipeline JSON or Snowflake tasks. The core owns project intent; adapters own native artifacts.
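The first anti-pattern lends itself to a simple check. The sketch below is a hypothetical lint, not part of `contractforge validate-project`: it takes an already-parsed project.yaml as a dict and reports top-level keys that belong in ingestion contracts. The exact spellings `quality_rules` and `access_policies` are assumptions.

```python
# Keys that carry dataset semantics and therefore belong in ingestion contracts.
# "quality_rules" and "access_policies" are assumed spellings for this sketch.
CONTRACT_ONLY_KEYS = {"source", "target", "mode", "quality_rules", "access_policies"}


def find_misplaced_keys(project: dict) -> list[str]:
    """Return the top-level project.yaml keys that should live in a contract instead."""
    return sorted(CONTRACT_ONLY_KEYS.intersection(project))


project = {
    "name": "supabase_jdbc_medallion",
    "schedule": {"cron": "0 6 * * *"},
    "mode": "append",  # misplaced: write mode is contract semantics
}
print(find_misplaced_keys(project))  # ['mode']
```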