Governance
Catalog and access governance are contracts too.
ContractForge separates table metadata from access control because they usually have different permissions, reviewers and operational blast radius.
Annotations
Annotations describe the table and its fields. They are applied after the target exists, and a failure to apply them is handled according to the configured policy (warn, fail or ignore).
# contracts/silver/s_customers.annotations.yaml
target:
  catalog: main
  schema: crm_curated
  table: s_customers
policy: warn
table:
  description: Current customer dimension.
  tags:
    domain: crm
    contains_pii: "true"
columns:
  email:
    description: Customer email address.
    pii:
      enabled: true
      type: email
      sensitivity: restricted

from contractforge import ingest

result = ingest(
    source_table="main.crm_raw.customers",
    target_catalog="main",
    target_schema="crm_curated",
    target_table="s_customers",
    mode="scd1_hash_diff",
    hash_keys=["customer_id"],
    annotations={
        "policy": "warn",
        "table": {
            "description": "Current customer dimension.",
            "tags": {"domain": "crm", "contains_pii": "true"},
        },
        "columns": {
            "email": {
                "description": "Customer email address.",
                "pii": {
                    "enabled": True,
                    "type": "email",
                    "sensitivity": "restricted",
                },
            },
        },
    },
)
Operations metadata
Operations contracts feed dashboards and alerting systems without coupling the library to Slack, Teams, PagerDuty or another notification provider.
# contracts/silver/s_customers.operations.yaml
target:
  catalog: main
  schema: crm_curated
  table: s_customers
operations:
  business_owner: sales-ops
  technical_owner: data-engineering
  support_group: data-platform
  criticality: high
  expected_frequency: daily
  freshness_sla_minutes: 180
  runbook_url: https://example.com/runbooks/customers

from contractforge import ingest

result = ingest(
    source_table="main.crm_raw.customers",
    target_catalog="main",
    target_schema="crm_curated",
    target_table="s_customers",
    mode="scd1_hash_diff",
    hash_keys=["customer_id"],
    operations={
        "business_owner": "sales-ops",
        "technical_owner": "data-engineering",
        "support_group": "data-platform",
        "criticality": "high",
        "expected_frequency": "daily",
        "freshness_sla_minutes": 180,
        "runbook_url": "https://example.com/runbooks/customers",
    },
)
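To illustrate how operations metadata can drive alerting without tying the library to a notification provider, here is a minimal sketch of a freshness check. The helpers `is_stale` and `build_alert`, the payload shape, and the idea that a last-success timestamp is available are assumptions for illustration; they are not part of ContractForge.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_success_utc: datetime,
             freshness_sla_minutes: int,
             now_utc: Optional[datetime] = None) -> bool:
    """Return True when the last successful load is older than the SLA window."""
    now = now_utc or datetime.now(timezone.utc)
    return now - last_success_utc > timedelta(minutes=freshness_sla_minutes)


def build_alert(target: str,
                operations: dict,
                last_success_utc: datetime,
                now_utc: Optional[datetime] = None) -> Optional[dict]:
    """Build a provider-neutral alert payload, or None if the table is fresh.

    Hypothetical helper: the payload keys are illustrative only.
    """
    if not is_stale(last_success_utc, operations["freshness_sla_minutes"], now_utc):
        return None
    return {
        "target": target,
        "criticality": operations["criticality"],
        "notify": operations["support_group"],
        "runbook_url": operations["runbook_url"],
    }


# Same values as the operations contract above.
operations = {
    "support_group": "data-platform",
    "criticality": "high",
    "freshness_sla_minutes": 180,
    "runbook_url": "https://example.com/runbooks/customers",
}

# A fixed clock keeps the example deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
alert = build_alert("main.crm_curated.s_customers", operations,
                    now - timedelta(minutes=240), now)
no_alert = build_alert("main.crm_curated.s_customers", operations,
                       now - timedelta(minutes=90), now)
```

The payload is a plain dictionary, so any dashboard or notification adapter can consume it; routing to a specific channel stays outside the ingestion code.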
Access is deferred by design
Normal ingestion should not require elevated security privileges. Access contracts are validated and applied through dedicated commands.
contractforge validate-access contracts/silver/s_customers
contractforge governance-check contracts/silver/s_customers
contractforge apply-access contracts/silver/s_customers
A data writer can load a table without becoming the principal responsible for grants, row filters or column masks.
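One way to keep this separation in automation is to build the command list per job type, so that only a privileged deployment job ever includes `apply-access`. The sketch below is an assumption about how a team might wire this up; the function `governance_commands` and the `privileged` flag are hypothetical, and only the three subcommands come from the commands listed above.

```python
from typing import List


def governance_commands(bundle: str, privileged: bool) -> List[List[str]]:
    """Return the CLI invocations for a bundle, as argv lists.

    Unprivileged jobs (for example CI) only validate and check.
    Privileged jobs additionally apply access changes.
    """
    commands = [
        ["contractforge", "validate-access", bundle],
        ["contractforge", "governance-check", bundle],
    ]
    if privileged:
        commands.append(["contractforge", "apply-access", bundle])
    return commands


bundle = "contracts/silver/s_customers"
ci_plan = governance_commands(bundle, privileged=False)
deploy_plan = governance_commands(bundle, privileged=True)

# To execute a plan, each argv list could be passed to
# subprocess.run(argv, check=True) in the corresponding job.
```

Because the plan is plain data, it can be reviewed or logged before anything runs against the catalog.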
Policy behavior
Governance failures are operationally different from write failures. The policy controls whether metadata/access drift should block the workflow, be recorded as warning evidence or be ignored intentionally.
| Area | Policy | Behavior | When to use |
|---|---|---|---|
| Annotations | warn | Attempt to apply comments/tags; record warnings without failing the data write. | Default for gradual rollout and mixed permissions. |
| Annotations | fail | Fail the ingestion result if annotations cannot be applied. | Use when catalog metadata is part of the release contract. |
| Annotations | ignore | Skip annotation application. | Use for environments without catalog permissions. |
| Access | validate_only | Validate access contract and record preview/drift evidence without applying changes. | Pull requests, CI and separation-of-duties workflows. |
| Access | apply | Apply grants, filters and masks through a privileged workflow. | Deployment jobs owned by a security principal. |
| Access | ignore | Skip access checks and application. | Local development or non-Unity-Catalog environments. |
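The annotation rows of the table can be read as a small dispatch rule. The following sketch shows that rule under stated assumptions: `apply_with_policy` is a hypothetical helper, and the status strings `WARNED` and `IGNORED` are illustrative values rather than the library's confirmed vocabulary.

```python
from typing import Callable, Dict, Optional


def apply_with_policy(policy: str,
                      apply_fn: Callable[[], None]) -> Dict[str, Optional[str]]:
    """Run apply_fn under a warn / fail / ignore policy and return evidence."""
    if policy not in {"warn", "fail", "ignore"}:
        raise ValueError(f"unknown policy: {policy!r}")
    if policy == "ignore":
        # Skip application entirely; nothing is attempted.
        return {"status": "IGNORED", "error_message": None}
    try:
        apply_fn()
    except Exception as exc:
        if policy == "fail":
            # Propagate so the caller can fail the ingestion result.
            raise
        # warn: keep the data write, record the failure as evidence.
        return {"status": "WARNED", "error_message": str(exc)}
    return {"status": "APPLIED", "error_message": None}


def denied() -> None:
    """Stand-in for a catalog call that lacks permissions."""
    raise PermissionError("cannot set tags on table")


def succeeds() -> None:
    """Stand-in for a catalog call that works."""
    return None


warned = apply_with_policy("warn", denied)
ignored = apply_with_policy("ignore", denied)
applied = apply_with_policy("warn", succeeds)
```

Under `warn` the failure becomes a recorded row instead of an exception, which is what lets a gradual rollout proceed with mixed permissions.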
Governance lifecycle
1. Validate the bundle. Check that annotations, operations and access rules reference fields that exist in the ingestion target.
2. Run ingestion. Data is written first so the table and columns physically exist.
3. Apply annotations and operations. Comments, tags and ownership records are applied and logged.
4. Apply access separately. Security changes are previewed, validated and executed through the dedicated access workflow.
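The first lifecycle step, checking that annotated fields exist in the target, can be sketched as a simple set comparison. The function `missing_columns` and the idea of passing the target columns as a list are assumptions for illustration; how ContractForge performs this validation internally is not specified here.

```python
from typing import Dict, List


def missing_columns(annotations: Dict, target_columns: List[str]) -> List[str]:
    """Return annotated column names that do not exist in the target."""
    annotated = annotations.get("columns", {})
    known = set(target_columns)
    return sorted(name for name in annotated if name not in known)


annotations = {
    "columns": {
        "email": {"description": "Customer email address."},
        "national_id": {"description": "Government-issued identifier."},
    },
}

# Hypothetical target schema that lacks national_id.
target_columns = ["customer_id", "email"]

problems = missing_columns(annotations, target_columns)
```

Running a check like this before ingestion surfaces a misspelled or removed column as a validation error instead of a failed annotation after the write.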
PII as annotation, not transformation
PII classification describes the data. It should not automatically mask or mutate values during ingestion. Masking belongs in catalog access policies or downstream security functions.
# In *.annotations.yaml
columns:
  national_id:
    description: Government-issued identifier.
    pii:
      enabled: true
      type: national_id
      sensitivity: confidential
    tags:
      retention_policy: restricted
      masking_candidate: "true"

annotations = {
    "columns": {
        "national_id": {
            "description": "Government-issued identifier.",
            "pii": {
                "enabled": True,
                "type": "national_id",
                "sensitivity": "confidential",
            },
            "tags": {
                "retention_policy": "restricted",
                "masking_candidate": "true",
            },
        },
    },
}
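To make the distinction concrete, the sketch below reads PII classification purely as metadata and never touches row values. The helper `pii_rows` and its key/value row layout are assumptions for illustration, loosely modeled on the idea of recording one row per PII attribute; the library's actual storage format may differ.

```python
from typing import Dict, List


def pii_rows(target_table: str, annotations: Dict) -> List[Dict[str, str]]:
    """Flatten enabled PII annotations into key/value metadata rows."""
    rows = []
    for column_name, spec in annotations.get("columns", {}).items():
        pii = spec.get("pii") or {}
        if not pii.get("enabled"):
            continue  # Column is not classified as PII.
        for key in ("type", "sensitivity"):
            if key in pii:
                rows.append({
                    "target_table": target_table,
                    "column_name": column_name,
                    "annotation_type": "pii",
                    "key": key,
                    "value": str(pii[key]),
                })
    return rows


annotations = {
    "columns": {
        "national_id": {
            "description": "Government-issued identifier.",
            "pii": {
                "enabled": True,
                "type": "national_id",
                "sensitivity": "confidential",
            },
        },
        "customer_id": {"description": "Surrogate key."},
    },
}

rows = pii_rows("s_customers", annotations)
```

The output describes which columns are sensitive; deciding who may see them is left to catalog access policies.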
Drift and audit tables
Governance actions are written to control tables so teams can answer what metadata was applied, what failed and what was intentionally deferred.
| Table | Use |
|---|---|
| ctrl_ingestion_annotations | Applied, warned, ignored or failed table/column annotations. |
| ctrl_ingestion_operations | Ownership, criticality, SLA and support metadata snapshots. |
| ctrl_ingestion_access | Access validation/application evidence, including drift status. |
Useful governance queries
Tables with PII annotations
SELECT
    target_table,
    column_name,
    key,
    value,
    status,
    annotation_ts_utc
FROM main.ops.ctrl_ingestion_annotations
WHERE annotation_type = 'pii'
ORDER BY annotation_ts_utc DESC;
Access actions that require follow-up
SELECT
    target_table,
    access_type,
    principal,
    privilege,
    column_name,
    function_name,
    status,
    error_message,
    access_ts_utc
FROM main.ops.ctrl_ingestion_access
WHERE status <> 'APPLIED'
ORDER BY access_ts_utc DESC;