Catalog-native sources
Use this for Unity Catalog, Hive metastore or Spark catalog objects already available to the runtime.
Connector
Use catalog-native sources when data is already registered in Spark or Unity Catalog.
Choose table, delta_table or view for registered objects. Choose sql when the source must be a reviewed extraction query rather than a reusable catalog object.
Prefer views when source-side filtering or joins are reused by more than one ingestion contract.
Use SQL for bounded extraction logic. Promote complex business transformations to views, notebooks or Silver/Gold contracts.
This connector has no external driver or credential layer beyond catalog permissions.
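As a hedged sketch of the view-first guidance above: if the closed-orders filter were needed by more than one ingestion contract, it could be registered once as a view and read as a catalog object. The view name `main.raw.v_closed_orders` is hypothetical; the table, columns and predicate are taken from the inline SQL example on this page.

```sql
-- Hypothetical view name; columns and filter mirror the inline SQL example.
CREATE OR REPLACE VIEW main.raw.v_closed_orders AS
SELECT id, updated_at, amount
FROM main.raw.orders
WHERE status = 'closed';
```

Contracts can then point at the view instead of repeating the predicate, so a change to the filter is reviewed in one place.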
| Requirement | Details |
|---|---|
| Catalog access | The job principal must be able to read the source table, view or SQL dependencies. |
| Fully qualified names | Use three-level catalog.schema.table names in shared workspaces so that name resolution does not depend on the session's default catalog and schema. |
| SQL review | Keep inline SQL small enough to review. Large logic should live in versioned views or transformations. |
| Watermark columns | If using incremental reads, the selected table/query must expose the watermark columns. |
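A minimal sketch of the grants behind the catalog access requirement, assuming Unity Catalog and a placeholder job principal named `ingestion-job-sp`. These statements are run by a catalog administrator outside ContractForge; legacy Hive metastore setups may use different privileges.

```sql
-- Assumption: Unity Catalog; `ingestion-job-sp` is a placeholder principal.
-- Reading a table also requires access to its parent catalog and schema.
GRANT USE CATALOG ON CATALOG main TO `ingestion-job-sp`;
GRANT USE SCHEMA ON SCHEMA main.raw TO `ingestion-job-sp`;
GRANT SELECT ON TABLE main.raw.orders TO `ingestion-job-sp`;
```

For a view or SQL source, the same pattern applies to every object the view or query depends on.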
```yaml
source:
  type: connector
  connector: table
  table: main.raw.orders
```

```yaml
source:
  type: connector
  connector: sql
  query: |
    SELECT id, updated_at, amount
    FROM main.raw.orders
    WHERE status = 'closed'
```

| Connector | Use |
|---|---|
| table | Registered Spark/Unity Catalog table. |
| delta_table | Registered Delta table when you want to make the storage type explicit. |
| view | Registered view used as source object. |
| sql | Inline SQL query declared in the contract. |
Use SQL sources for small, reviewed extraction predicates. If the SQL becomes a business transformation, promote it to a view, notebook step or a Silver/Gold contract.
```yaml
source:
  type: connector
  connector: sql
  query: |
    SELECT order_id, updated_at, amount
    FROM main.raw.orders
    WHERE updated_at >= timestamp '2026-01-01'
  watermark_columns: [updated_at]
```

```sql
SELECT run_id, status, source_connector, source_table, source_query, rows_read, rows_written
FROM main.ops.ctrl_ingestion_runs
WHERE source_connector IN ('table', 'delta_table', 'view', 'sql')
ORDER BY started_at_utc DESC;
```

| Symptom | Likely cause | Action |
|---|---|---|
| Table not found | Unqualified name or wrong catalog/schema context. | Use fully qualified names. |
| Permission denied | Job principal lacks catalog/table grants. | Grant source read permissions outside ContractForge. |
| Inline SQL is hard to review | Business transformation is hidden in source extraction. | Move logic to a view or downstream contract. |
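The "Table not found" and "Permission denied" symptoms can usually be narrowed down from a SQL session running as the job principal. A hedged sketch, assuming Databricks SQL or a recent Spark version with Unity Catalog; the table name is the one used in the examples on this page.

```sql
-- Show which catalog and schema an unqualified name would resolve against.
SELECT current_catalog(), current_database();

-- Confirm the fully qualified object resolves and list its columns
-- (also a quick check that any watermark column is exposed).
DESCRIBE TABLE main.raw.orders;

-- Unity Catalog: list the privileges granted on the source table.
SHOW GRANTS ON TABLE main.raw.orders;
```

If the first query returns an unexpected catalog or schema, an unqualified name in the contract is the likely culprit.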
Keep extraction concerns in source, structural normalization in transform, validation in quality_rules and target semantics in mode. This separation keeps examples portable and prevents connector-specific workarounds from becoming hidden business logic.