
Test Contracts Across Adapters

Define ingestion intent once. Run it natively anywhere.

This page shows the USGS GeoJSON medallion ingestion reused across Databricks, AWS and Snowflake. The contract set keeps the business intent, quality rules, write behavior and medallion outputs the same. The only differences are the contract parameters that bind that intent to each adapter's native source, table names, SQL references and storage options.

The point of this page is contract reuse: the same medallion contract family is carried across adapters, and only the minimal platform bindings change.

Reused Contract Set

| Contract | Shared ingestion intent |
| --- | --- |
| bronze_usgs_geojson.ingestion.yaml | Read the USGS 2.5 day GeoJSON payload into one raw response row with raw_response and response_page_number. |
| silver_usgs_events.ingestion.yaml | Parse features[], produce one row per earthquake event, derive event fields, filter invalid event ids and deduplicate by earthquake_id. |
| gold_usgs_daily_summary.ingestion.yaml | Aggregate silver events by event_date with event counts, tsunami counts, magnitude/depth metrics, reporting networks and latest update time. |
| gold_usgs_magnitude_bands.ingestion.yaml | Aggregate silver events by magnitude_band and event_type with event counts, magnitude metrics and first/latest event times. |

Shared Contract Parameters

| Parameter group | Content kept the same |
| --- | --- |
| Execution order | bronze_usgs_geojson, silver_usgs_events, gold_usgs_daily_summary, gold_usgs_magnitude_bands. |
| Source payload | USGS 2.5_day.geojson earthquake feed for the same date window and filters. |
| Bronze write behavior | layer: bronze, mode: overwrite, schema_policy: permissive. |
| Bronze quality rules | required_columns: [raw_response, response_page_number], not_null: [raw_response], unique_key: [response_page_number]. |
| Silver write behavior | layer: silver, mode: overwrite, schema_policy: additive_only. |
| Silver transformation intent | Shape one event per features[] item, standardize timestamps and coordinates, derive event_date, magnitude_band and is_tsunami_related, filter earthquake_id IS NOT NULL, deduplicate by earthquake_id. |
| Silver quality rules | not_null: [earthquake_id, event_time, latitude, longitude], unique_key: [earthquake_id], coordinate/depth/magnitude validation expressions. |
| Gold write behavior | layer: gold, mode: overwrite, schema_policy: additive_only. |
| Gold quality rules | Daily summary requires event_date; magnitude bands requires magnitude_band. |
| Expected output rows | Bronze 1, silver 30, gold daily 2, gold magnitude bands 3. |
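
The execution order and expected row counts in the table above can be turned into a small post-run check. The following is a minimal sketch, not part of the contract set: the function name `row_count_mismatches` and the shape of its input (a dict of contract name to observed row count) are assumptions made for illustration.

```python
# Execution order and expected row counts, copied from the table above.
EXECUTION_ORDER = [
    "bronze_usgs_geojson",
    "silver_usgs_events",
    "gold_usgs_daily_summary",
    "gold_usgs_magnitude_bands",
]
EXPECTED_ROWS = {
    "bronze_usgs_geojson": 1,
    "silver_usgs_events": 30,
    "gold_usgs_daily_summary": 2,
    "gold_usgs_magnitude_bands": 3,
}


def row_count_mismatches(actual_rows):
    """Return one message per contract whose row count differs from the expectation.

    actual_rows is a hypothetical {contract_name: row_count} dict gathered after a run.
    """
    problems = []
    for contract in EXECUTION_ORDER:
        expected = EXPECTED_ROWS[contract]
        actual = actual_rows.get(contract)
        if actual != expected:
            problems.append(f"{contract}: expected {expected} rows, got {actual}")
    return problems
```

Running the same check after each adapter's run is one way to confirm that the three platforms produced the same medallion outputs for the same feed window.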

Shared Contract Content

The snippets below show the contract content that is reused, not only the adapter-specific differences. Exact equality means the YAML block is copied unchanged across the named adapters. Normalized equality means only native table references or SQL dialect syntax differ; the contract intent and output shape remain the same.

Bronze Source Block

Exact same block in Databricks, AWS and Snowflake:

source:
  type: rest_api
  name: usgs_earthquake_2_5_day_geojson
  system: usgs
  request:
    url: https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson
    method: GET
    headers:
      Accept: application/geo+json, application/json
      User-Agent: ContractForge real ingestion test
  response:
    mode: raw
    raw_column: raw_response
  limits:
    timeout_seconds: 60
    retry_attempts: 3
    retry_backoff_seconds: 2
    max_page_bytes: 10485760
    max_total_bytes: 10485760
    max_records: 1
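
To make the raw response shape concrete, here is a minimal sketch of how one HTTP response body could become the single bronze row. It is not the connector's implementation: the helper name `to_bronze_row` is hypothetical, and numbering pages from 1 is an assumption. Only the column names and the `max_page_bytes` limit come from the block above.

```python
import json

MAX_PAGE_BYTES = 10_485_760  # limits.max_page_bytes from the source block


def to_bronze_row(body, page_number=1):
    """Wrap one HTTP response body (bytes) as a single raw bronze row.

    Assumption: pages are numbered from 1; the real connector may differ.
    """
    if len(body) > MAX_PAGE_BYTES:
        raise ValueError(f"response page exceeds {MAX_PAGE_BYTES} bytes")
    return {
        "raw_response": body.decode("utf-8"),  # response.raw_column
        "response_page_number": page_number,
    }


# Illustrative payload only; a real run would use the body returned by the USGS feed.
sample_body = json.dumps({"type": "FeatureCollection", "features": []}).encode("utf-8")
bronze_row = to_bronze_row(sample_body)
```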

Exact same bronze lifecycle and quality block in Databricks, AWS and Snowflake:

layer: bronze
mode: overwrite
schema_policy: permissive

quality_rules:
  required_columns: [raw_response, response_page_number]
  not_null: [raw_response]
  unique_key: [response_page_number]
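
The three structural rules above (required_columns, not_null, unique_key) can be read as simple row-level checks. The sketch below shows one possible interpretation over rows held as Python dicts; the function name and the message format are hypothetical, and the adapters may evaluate these rules differently on their native engines.

```python
def check_structural_rules(rows, required_columns, not_null, unique_key):
    """Return a list of violation messages; an empty list means the rules pass."""
    violations = []
    for index, row in enumerate(rows):
        for column in required_columns:
            if column not in row:
                violations.append(f"row {index}: missing column {column}")
        for column in not_null:
            if row.get(column) is None:
                violations.append(f"row {index}: {column} is null")
    # unique_key: the combination of key columns may appear only once.
    seen = set()
    for index, row in enumerate(rows):
        key = tuple(row.get(column) for column in unique_key)
        if key in seen:
            violations.append(f"row {index}: duplicate key {key}")
        seen.add(key)
    return violations


# Rule values copied from the bronze quality block above.
bronze_rules = {
    "required_columns": ["raw_response", "response_page_number"],
    "not_null": ["raw_response"],
    "unique_key": ["response_page_number"],
}
```

Because the rules are passed as plain data, the same function can be reused with the silver `not_null` and `unique_key` lists.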

Silver Event Contract

The Databricks and AWS silver contracts share the same declarative event-shaping contract exactly. Snowflake expresses the same output contract as native SQL because its runtime path executes SQL directly.

layer: silver
mode: overwrite
schema_policy: additive_only
select_columns: [raw_response, response_page_number]

schemas:
  usgs_geojson_feed: |
    STRUCT<
      type: STRING,
      metadata: STRUCT<generated: BIGINT, url: STRING, title: STRING, status: BIGINT, api: STRING, count: BIGINT>,
      bbox: ARRAY<DOUBLE>,
      features: ARRAY<STRUCT<
        type: STRING,
        id: STRING,
        properties: STRUCT<
          mag: DOUBLE, place: STRING, time: BIGINT, updated: BIGINT, tz: BIGINT, url: STRING,
          detail: STRING, felt: BIGINT, cdi: DOUBLE, mmi: DOUBLE, alert: STRING, status: STRING,
          tsunami: BIGINT, sig: BIGINT, net: STRING, code: STRING, ids: STRING, sources: STRING,
          types: STRING, nst: BIGINT, dmin: DOUBLE, rms: DOUBLE, gap: DOUBLE, magType: STRING,
          type: STRING, title: STRING
        >,
        geometry: STRUCT<type: STRING, coordinates: ARRAY<DOUBLE>>
      >>
    >

transform:
  shape:
    parse_json:
      - column: raw_response
        alias: payload
        schema_ref: usgs_geojson_feed
    arrays:
      - path: payload.features
        mode: explode_outer
        alias: feature
    columns:
      payload.metadata.generated: {alias: feed_generated_epoch_ms, cast: BIGINT}
      payload.metadata.title: {alias: feed_title, cast: STRING}
      payload.metadata.count: {alias: feed_event_count, cast: BIGINT}
      payload.metadata.api: {alias: feed_api_version, cast: STRING}
      payload.bbox: {alias: feed_bbox}
      feature.id: {alias: earthquake_id, cast: STRING}
      feature.type: {alias: geojson_feature_type, cast: STRING}
      feature.properties.title: {alias: event_title, cast: STRING}
      feature.properties.place: {alias: place, cast: STRING}
      feature.properties.mag: {alias: magnitude, cast: DOUBLE}
      feature.properties.magType: {alias: magnitude_type, cast: STRING}
      feature.properties.time: {alias: event_epoch_ms, cast: BIGINT}
      feature.properties.updated: {alias: updated_epoch_ms, cast: BIGINT}
      feature.properties.status: {alias: event_status, cast: STRING}
      feature.properties.type: {alias: event_type, cast: STRING}
      feature.properties.alert: {alias: alert_level, cast: STRING}
      feature.properties.tsunami: {alias: tsunami_flag, cast: INT}
      feature.properties.sig: {alias: significance, cast: BIGINT}
      feature.properties.net: {alias: network, cast: STRING}
      feature.properties.code: {alias: network_event_code, cast: STRING}
      feature.properties.url: {alias: event_url, cast: STRING}
      feature.properties.detail: {alias: detail_url, cast: STRING}
      feature.properties.felt: {alias: felt_reports, cast: BIGINT}
      feature.properties.cdi: {alias: community_intensity, cast: DOUBLE}
      feature.properties.mmi: {alias: instrumental_intensity, cast: DOUBLE}
      feature.geometry.type: {alias: geometry_type, cast: STRING}
      feature.geometry.coordinates: {alias: coordinates}
      longitude: {alias: longitude, expression: "CAST(feature.geometry.coordinates[0] AS DOUBLE)"}
      latitude: {alias: latitude, expression: "CAST(feature.geometry.coordinates[1] AS DOUBLE)"}
      depth_km: {alias: depth_km, expression: "CAST(feature.geometry.coordinates[2] AS DOUBLE)"}
  standardize:
    event_status: {trim: true, lower: true, empty_as_null: true}
    event_type: {trim: true, lower: true, empty_as_null: true}
    magnitude_type: {trim: true, lower: true, empty_as_null: true}
    alert_level: {trim: true, lower: true, empty_as_null: true}
    network: {trim: true, lower: true}
  derive:
    event_time: CAST(from_unixtime(event_epoch_ms / 1000) AS TIMESTAMP)
    updated_at: CAST(from_unixtime(updated_epoch_ms / 1000) AS TIMESTAMP)
    feed_generated_at: CAST(from_unixtime(feed_generated_epoch_ms / 1000) AS TIMESTAMP)
    event_date: to_date(CAST(from_unixtime(event_epoch_ms / 1000) AS TIMESTAMP))
    magnitude_band: CASE WHEN magnitude IS NULL THEN 'unknown' WHEN magnitude < 3 THEN 'minor' WHEN magnitude < 5 THEN 'light' WHEN magnitude < 7 THEN 'strong' ELSE 'major' END
    is_tsunami_related: tsunami_flag = 1
    normalized_at_utc: CURRENT_TIMESTAMP()
  deduplicate:
    keys: [earthquake_id]
    order_by:
      - {column: updated_at, direction: desc, nulls: last}
      - {column: feed_generated_at, direction: desc, nulls: last}

filter_expression: earthquake_id IS NOT NULL
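
The shape, derive, filter and deduplicate steps of the contract above can be traced in plain Python. This is a rough sketch of the intent only, not how any adapter executes it: it carries a subset of the columns, assumes UTC for the timestamp conversion (the engine's session time zone may differ), and all function names are hypothetical.

```python
import json
from datetime import datetime, timezone


def epoch_ms_to_utc(ms):
    """Convert epoch milliseconds to a UTC datetime at second precision; None stays None."""
    if ms is None:
        return None
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc)


def magnitude_band(magnitude):
    """Mirror the CASE expression in the derive block."""
    if magnitude is None:
        return "unknown"
    if magnitude < 3:
        return "minor"
    if magnitude < 5:
        return "light"
    if magnitude < 7:
        return "strong"
    return "major"


def shape_silver_events(raw_response):
    """Parse one raw bronze payload into deduplicated silver-like event rows."""
    payload = json.loads(raw_response)
    generated_at = epoch_ms_to_utc((payload.get("metadata") or {}).get("generated"))
    rows = []
    # One row per features[] item. explode_outer would also emit a null row for an
    # empty array, but the earthquake_id filter below removes it anyway.
    for feature in payload.get("features") or []:
        props = feature.get("properties") or {}
        coords = list((feature.get("geometry") or {}).get("coordinates") or [])
        coords += [None] * (3 - len(coords))  # pad missing axes with None
        event_time = epoch_ms_to_utc(props.get("time"))
        tsunami = props.get("tsunami")
        rows.append({
            "earthquake_id": feature.get("id"),
            "magnitude": props.get("mag"),
            # standardize: trim, lower, empty_as_null
            "event_type": (props.get("type") or "").strip().lower() or None,
            "longitude": coords[0],
            "latitude": coords[1],
            "depth_km": coords[2],
            "event_time": event_time,
            "updated_at": epoch_ms_to_utc(props.get("updated")),
            "feed_generated_at": generated_at,
            "event_date": event_time.date() if event_time else None,
            "magnitude_band": magnitude_band(props.get("mag")),
            "is_tsunami_related": None if tsunami is None else tsunami == 1,
        })

    # filter_expression: earthquake_id IS NOT NULL
    rows = [row for row in rows if row["earthquake_id"] is not None]

    # deduplicate: keep the most recently updated row per earthquake_id, nulls last.
    floor = datetime.min.replace(tzinfo=timezone.utc)

    def recency(row):
        return (row["updated_at"] or floor, row["feed_generated_at"] or floor)

    best = {}
    for row in rows:
        current = best.get(row["earthquake_id"])
        if current is None or recency(row) > recency(current):
            best[row["earthquake_id"]] = row
    return list(best.values())
```

Feeding the bronze `raw_response` string to `shape_silver_events` yields one dict per distinct `earthquake_id`, which is the same output shape the Snowflake SQL path is described as producing.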

Exact same silver quality block in Databricks, AWS and Snowflake:

quality_rules:
  not_null: [earthquake_id, event_time, latitude, longitude]
  unique_key: [earthquake_id]
  expressions:
    - name: valid_geojson_point
      expression: geometry_type = 'Point'
      severity: warn
      message: USGS summary feeds should expose point geometries.
    - name: valid_coordinates
      expression: latitude BETWEEN -90 AND 90 AND longitude BETWEEN -180 AND 180
      severity: abort
      message: Coordinates must be valid WGS84 latitude/longitude.
    - name: reasonable_depth
      expression: depth_km IS NULL OR depth_km BETWEEN -20 AND 800
      severity: warn
      message: Earthquake depth is outside the expected operational range.
    - name: non_negative_magnitude
      expression: magnitude IS NULL OR magnitude >= 0
      severity: warn
      message: Magnitude should not be negative for normal earthquake events.
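
The expression rules above pair a predicate with a severity. The sketch below restates the four predicates as Python lambdas and shows one plausible reading of the severities: a failed `warn` rule is collected, a failed `abort` rule stops the run. That interpretation, the `QualityAbort` exception and the function name are assumptions, not documented behavior.

```python
# Predicates restated from the expressions block; rows are dicts of silver columns.
# latitude and longitude are assumed non-null here because not_null already guards them.
SILVER_EXPRESSIONS = [
    ("valid_geojson_point", "warn", lambda r: r["geometry_type"] == "Point"),
    ("valid_coordinates", "abort",
     lambda r: -90 <= r["latitude"] <= 90 and -180 <= r["longitude"] <= 180),
    ("reasonable_depth", "warn",
     lambda r: r["depth_km"] is None or -20 <= r["depth_km"] <= 800),
    ("non_negative_magnitude", "warn",
     lambda r: r["magnitude"] is None or r["magnitude"] >= 0),
]


class QualityAbort(Exception):
    """Raised when a rule with severity 'abort' fails (hypothetical behavior)."""


def evaluate_expressions(rows, rules=SILVER_EXPRESSIONS):
    """Collect (row_index, rule_name) warnings and raise on the first failed abort rule."""
    warnings = []
    for index, row in enumerate(rows):
        for name, severity, predicate in rules:
            if predicate(row):
                continue
            if severity == "abort":
                raise QualityAbort(f"row {index} failed {name}")
            warnings.append((index, name))
    return warnings
```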

Gold Daily Summary Contract

The query shape is the same in Databricks, AWS and Snowflake after normalization. The source table identifier differs by adapter, and Snowflake uses IFF for the boolean sum.

SELECT
  event_date,
  COUNT(*) AS earthquake_count,
  SUM(CASE WHEN is_tsunami_related THEN 1 ELSE 0 END) AS tsunami_related_count,
  AVG(magnitude) AS avg_magnitude,
  MAX(magnitude) AS max_magnitude,
  AVG(depth_km) AS avg_depth_km,
  COUNT(DISTINCT network) AS reporting_networks,
  MAX(updated_at) AS last_event_update_at,
  CURRENT_TIMESTAMP() AS computed_at_utc
FROM <silver_events_table>
WHERE event_date IS NOT NULL
GROUP BY event_date
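
For readers who want to check the aggregation logic without a warehouse, the query above can be mirrored over in-memory rows. This is an illustrative sketch: the function name is hypothetical, `computed_at_utc` is left out, and nulls are skipped in AVG, MAX and COUNT(DISTINCT) the way SQL aggregates skip them.

```python
from collections import defaultdict


def daily_summary(silver_rows):
    """Aggregate silver-like dict rows by event_date, mirroring the gold query."""
    groups = defaultdict(list)
    for row in silver_rows:
        if row["event_date"] is not None:  # WHERE event_date IS NOT NULL
            groups[row["event_date"]].append(row)

    summary = []
    for event_date, rows in sorted(groups.items()):
        magnitudes = [r["magnitude"] for r in rows if r["magnitude"] is not None]
        depths = [r["depth_km"] for r in rows if r["depth_km"] is not None]
        networks = {r["network"] for r in rows if r["network"] is not None}
        updates = [r["updated_at"] for r in rows if r["updated_at"] is not None]
        summary.append({
            "event_date": event_date,
            "earthquake_count": len(rows),
            # CASE WHEN is_tsunami_related THEN 1 ELSE 0 END: null counts as 0
            "tsunami_related_count": sum(1 for r in rows if r["is_tsunami_related"]),
            "avg_magnitude": sum(magnitudes) / len(magnitudes) if magnitudes else None,
            "max_magnitude": max(magnitudes, default=None),
            "avg_depth_km": sum(depths) / len(depths) if depths else None,
            "reporting_networks": len(networks),
            "last_event_update_at": max(updates, default=None),
        })
    return summary
```

Every returned row has `earthquake_count` of at least 1 by construction, which is what the `positive_daily_count` rule in the quality block checks.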

Exact same daily summary lifecycle and quality block in Databricks, AWS and Snowflake:

layer: gold
mode: overwrite
schema_policy: additive_only

quality_rules:
  not_null: [event_date]
  expressions:
    - name: positive_daily_count
      expression: earthquake_count > 0
      severity: abort
      message: Daily summary rows must represent at least one earthquake.

Gold Magnitude Bands Contract

The query body is the same in Databricks, AWS and Snowflake after normalization. Only the source table identifier changes.

SELECT
  magnitude_band,
  event_type,
  COUNT(*) AS event_count,
  MIN(magnitude) AS min_magnitude,
  AVG(magnitude) AS avg_magnitude,
  MAX(magnitude) AS max_magnitude,
  MIN(event_time) AS first_event_time,
  MAX(event_time) AS latest_event_time,
  CURRENT_TIMESTAMP() AS computed_at_utc
FROM <silver_events_table>
GROUP BY magnitude_band, event_type
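
Since only the table identifier changes, the query body can be treated as a template with one parameter. The sketch below binds it to a local SQLite table to show the idea; SQLite is not one of the three adapters and is used here only so the example runs anywhere. The table name `silver_events`, the sample rows and the added ORDER BY (for a stable result order) are assumptions, and `computed_at_utc` is dropped because SQLite spells CURRENT_TIMESTAMP without parentheses.

```python
import sqlite3

# Query body from the contract; only the table identifier is a parameter.
MAGNITUDE_BANDS_SQL = """
SELECT
  magnitude_band,
  event_type,
  COUNT(*) AS event_count,
  MIN(magnitude) AS min_magnitude,
  AVG(magnitude) AS avg_magnitude,
  MAX(magnitude) AS max_magnitude,
  MIN(event_time) AS first_event_time,
  MAX(event_time) AS latest_event_time
FROM {silver_events_table}
GROUP BY magnitude_band, event_type
ORDER BY magnitude_band, event_type
"""


def run_magnitude_bands(connection, silver_events_table):
    """Bind the table identifier and run the query.

    The identifier is interpolated as text, so it must come from trusted contract
    configuration, never from user input.
    """
    sql = MAGNITUDE_BANDS_SQL.format(silver_events_table=silver_events_table)
    return connection.execute(sql).fetchall()


# Illustrative local data, not real USGS events.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE silver_events (magnitude_band TEXT, event_type TEXT,"
    " magnitude REAL, event_time TEXT)"
)
connection.executemany(
    "INSERT INTO silver_events VALUES (?, ?, ?, ?)",
    [
        ("minor", "earthquake", 2.5, "2024-01-01T00:00:00"),
        ("minor", "earthquake", 2.75, "2024-01-01T06:00:00"),
        ("light", "earthquake", 4.1, "2024-01-02T03:00:00"),
    ],
)
bands = run_magnitude_bands(connection, "silver_events")
```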

Exact same magnitude-band lifecycle and quality block in Databricks, AWS and Snowflake:

layer: gold
mode: overwrite
schema_policy: additive_only

quality_rules:
  not_null: [magnitude_band]
  expressions:
    - name: positive_band_count
      expression: event_count > 0
      severity: abort
      message: Magnitude-band summary rows must represent at least one event.

Annotations And Operations

The annotation and operation contracts keep the same content across adapters. For example, the bronze annotation block is:

table:
  description: Raw USGS Earthquake GeoJSON feed response.
  tags:
    domain: geospatial
    provider: usgs
    source_format: geojson
columns:
  raw_response:
    description: Raw GeoJSON FeatureCollection response body.
  response_page_number:
    description: REST connector page number.

The bronze operations block is also identical across adapters:

criticality: medium
expected_frequency: daily
freshness_sla_minutes: 1440
alert_on_failure: true
alert_on_quality_fail: true
runbook_url: https://example.com/runbooks/contractforge/usgs-earthquake-feed
ownership:
  business_owner: platform-data
  technical_owner: data-engineering
tags:
  project: usgs-rest-medallion
  layer: bronze

Adapter Binding Differences

The reusable contract surface is larger than the adapter-specific surface. The source connector, bronze raw response shape, medallion order, quality gates, write mode and gold business outputs stay aligned. The differences below are native bindings: where each adapter stores tables, how it references the previous layer and which platform prerequisite is needed to execute the same intent.

Source: rest_api + same USGS GeoJSON URL
Shape: raw_response bronze payload, same silver event model
Result: same medallion order and row-count expectations

Databricks

Unity Catalog and Delta

Contract logic is declarative; native binding is catalog/schema/table plus a Delta table property.

Bronze target: workspace.cf_usgs_rest_bronze.b_usgs_earthquake_geojson
Silver target: workspace.cf_usgs_rest_silver.s_usgs_earthquake_events
Gold targets: workspace.cf_usgs_rest_gold.g_usgs_earthquake_daily_summary, workspace.cf_usgs_rest_gold.g_usgs_earthquake_magnitude_bands
Native hint: delta.enableChangeDataFeed: "true"

AWS

Glue Catalog and Iceberg

Contract logic is declarative; native binding adds Glue/Iceberg references and an S3 warehouse location.

Bronze target: contractforge.cf_usgs_rest_bronze.b_usgs_earthquake_geojson
Silver input: glue_catalog.contractforge_cf_usgs_rest_bronze.b_usgs_earthquake_geojson
Gold targets: contractforge.cf_usgs_rest_gold.g_usgs_earthquake_daily_summary, contractforge.cf_usgs_rest_gold.g_usgs_earthquake_magnitude_bands
Native hint: s3://.../warehouse/usgs-rest/

Snowflake

Database schema and SQL

The REST source is the same; native binding adds database objects, SQL text and external access.

Bronze target: CONTRACTFORGE_TEST_DB.PUBLIC.CF_USGS_REST_BRONZE
Silver target: CONTRACTFORGE_TEST_DB.PUBLIC.CF_USGS_REST_SILVER
Gold targets: CONTRACTFORGE_TEST_DB.PUBLIC.CF_USGS_REST_GOLD_DAILY, CONTRACTFORGE_TEST_DB.PUBLIC.CF_USGS_REST_GOLD_BANDS
Runtime prerequisite: CF_USGS_REST_ACCESS

Difference Ledger

| Concern | What stays the same | Native difference |
| --- | --- | --- |
| REST source | All adapters declare source.type: rest_api, method: GET, the same URL and the same raw response column. | Snowflake hosted procedure execution also needs the CF_USGS_REST_ACCESS external access integration. |
| Table binding | Each contract writes bronze, silver and gold layers with the same logical role. | Catalog, schema and table names follow each platform's naming model. |
| Layer references | Silver reads bronze; gold reads silver. | Fully qualified references differ between Unity Catalog, Glue Catalog/Iceberg and Snowflake database objects. |
| Silver transformation | The output event shape, filters, derived fields and deduplication rule are the same. | Databricks and AWS use the declarative transform path; Snowflake uses native SQL text with PARSE_JSON, LATERAL FLATTEN and QUALIFY. |
| Storage hints | Storage configuration does not change business logic or quality semantics. | Databricks sets a Delta property; AWS sets an Iceberg warehouse path; this Snowflake contract has no storage extension. |

Contract Parameter Snippets

Shared quality and write intent:

mode: overwrite
schema_policy: additive_only
quality_rules:
  not_null: [earthquake_id, event_time, latitude, longitude]
  unique_key: [earthquake_id]

Databricks bronze binding:

source:
  type: rest_api
  request:
    method: GET
    url: https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson
target:
  catalog: workspace
  schema: cf_usgs_rest_bronze
  table: b_usgs_earthquake_geojson

AWS bronze binding:

source:
  type: rest_api
  request:
    method: GET
    url: https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson
target:
  catalog: contractforge
  schema: cf_usgs_rest_bronze
  table: b_usgs_earthquake_geojson

Snowflake bronze binding:

source:
  type: rest_api
  request:
    method: GET
    url: https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson
target:
  catalog: CONTRACTFORGE_TEST_DB
  schema: PUBLIC
  table: CF_USGS_REST_BRONZE
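
One way to confirm that only the native binding changes is to compare the three bronze snippets key by key. The sketch below does that over plain dicts; the values are copied from the snippets above, while the helper names (`bronze_binding`, `flatten`, `differing_paths`) and the placement of `target` beside `source` are assumptions made for illustration.

```python
USGS_URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson"


def bronze_binding(catalog, schema, table):
    """Build a bronze binding dict; the source half is the same for every adapter."""
    return {
        "source": {"type": "rest_api", "request": {"method": "GET", "url": USGS_URL}},
        "target": {"catalog": catalog, "schema": schema, "table": table},
    }


BINDINGS = {
    "databricks": bronze_binding("workspace", "cf_usgs_rest_bronze", "b_usgs_earthquake_geojson"),
    "aws": bronze_binding("contractforge", "cf_usgs_rest_bronze", "b_usgs_earthquake_geojson"),
    "snowflake": bronze_binding("CONTRACTFORGE_TEST_DB", "PUBLIC", "CF_USGS_REST_BRONZE"),
}


def flatten(mapping, prefix=""):
    """Flatten nested dicts into {'a.b.c': value} pairs."""
    flat = {}
    for key, value in mapping.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def differing_paths(bindings):
    """Return the sorted dotted paths whose values are not identical across adapters."""
    flats = [flatten(binding) for binding in bindings.values()]
    paths = set().union(*(flat.keys() for flat in flats))
    return sorted(p for p in paths if len({flat.get(p) for flat in flats}) > 1)
```

For these three bindings, `differing_paths(BINDINGS)` reports only the `target.*` paths, which matches the claim that the source declaration is shared and the table binding is native.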