Operations
Troubleshooting
Start with the run id, then move through control tables. Avoid debugging from notebook output alone; the durable evidence is in Delta.
First queries
SELECT *
FROM ops.ctrl_ingestion_runs
WHERE run_id = '<run_id>';
SELECT stack_trace
FROM ops.ctrl_ingestion_errors
WHERE run_id = '<run_id>'
ORDER BY error_ts_utc DESC;
Common failures
Credentials or network unavailable
Check secret scope, runtime type, external location, network policy and whether the connector requires driver-side or Spark-side access.
Schema policy blocked the run
Inspect ctrl_ingestion_schema_changes and decide whether the change is expected. Do not loosen schema policy blindly.
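A minimal sketch of that inspection, assuming ctrl_ingestion_schema_changes lives in the same ops schema as the other control tables and carries a run_id column (neither is confirmed here):

```sql
-- Assumption: the table is in the ops schema and is keyed by run_id,
-- like ctrl_ingestion_runs and ctrl_ingestion_errors.
SELECT *
FROM ops.ctrl_ingestion_schema_changes
WHERE run_id = '<run_id>';
```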
Quality failed
Check ctrl_ingestion_quality and ctrl_ingestion_quarantine. Abort-only rules cannot be quarantined.
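The two checks could look roughly like this. The ops schema and the run_id column on both tables are assumptions carried over from the run and error tables, and quarantined_rows is just an illustrative alias:

```sql
-- Assumption: both tables are in the ops schema and are keyed by run_id.
-- Rule-level results recorded for the run.
SELECT *
FROM ops.ctrl_ingestion_quality
WHERE run_id = '<run_id>';

-- How many rows the run sent to quarantine.
SELECT COUNT(*) AS quarantined_rows
FROM ops.ctrl_ingestion_quarantine
WHERE run_id = '<run_id>';
```

A zero count alongside a failed quality result would be consistent with an abort-only rule having stopped the run.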
Watermark did not advance
Skipped and failed runs do not update state. Empty batches preserve previous state.
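Before suspecting the watermark logic itself, it may be enough to confirm how the run ended. This sketch uses only the status and error_message columns of ctrl_ingestion_runs:

```sql
-- A skipped or failed status here explains an unchanged watermark.
SELECT run_id, status, error_message
FROM ops.ctrl_ingestion_runs
WHERE run_id = '<run_id>';
```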
Serverless cannot access storage
Use Unity Catalog External Locations or workspace network policy. Direct credential setup is usually a classic-cluster pattern.
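One way to check the Unity Catalog side from a SQL editor, as a sketch: the statements below are standard Databricks SQL, and <location_name> is a placeholder for whichever external location covers the source path.

```sql
-- List the external locations visible to the current user.
SHOW EXTERNAL LOCATIONS;

-- Inspect the URL and storage credential behind one location.
DESCRIBE EXTERNAL LOCATION <location_name>;

-- See which principals hold privileges on it.
SHOW GRANTS ON EXTERNAL LOCATION <location_name>;
```

If the location exists but the identity running the job has no grant on it, the fix is a grant rather than a credential change in the notebook.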
Operator checklist
- Identify run_id and final status.
- Read the short error_message in ctrl_ingestion_runs.
- Open the full redacted stack trace in ctrl_ingestion_errors.
- Check source metadata and runtime type.
- Check quality and schema control tables.
- Only then rerun, repair or change the contract.
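The first three checklist steps can be combined into one query. This is a sketch that joins the two tables on run_id and uses only columns already named on this page:

```sql
-- One row per recorded error, newest first.
-- A run with no errors still returns its status through the LEFT JOIN.
SELECT
  r.run_id,
  r.status,
  r.error_message,
  e.error_ts_utc,
  e.stack_trace
FROM ops.ctrl_ingestion_runs AS r
LEFT JOIN ops.ctrl_ingestion_errors AS e
  ON e.run_id = r.run_id
WHERE r.run_id = '<run_id>'
ORDER BY e.error_ts_utc DESC;
```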