Troubleshooting
Start with the run id, then move through control tables. Avoid debugging from notebook output alone; the durable evidence is in Delta.
First queries
SELECT *
FROM ops.ctrl_ingestion_runs
WHERE run_id = '<run_id>';

SELECT stack_trace
FROM ops.ctrl_ingestion_errors
WHERE run_id = '<run_id>'
ORDER BY error_ts_utc DESC;
Common failures
Credentials or network unavailable
Check secret scope, runtime type, external location, network policy and whether the connector requires driver-side or Spark-side access.
Connector returned HTTP 401 or 403
Treat this as deterministic. Check the provider credential, role, site permission, API scope, Snowflake network policy or object-storage grant. ContractForge does not retry most 4xx responses because retrying cannot fix authorization.
Connector returned HTTP 429, 5xx or timeout
These can be transient for Python-driver connectors. Review source.limits.retry_attempts, source.limits.retry_backoff_seconds and source.limits.timeout_seconds, then check provider rate limits and network stability.
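The split between deterministic and transient failures can be sketched in Python. This is an illustration only, not ContractForge's actual retry code: the names is_transient and call_with_retry are hypothetical, the parameters merely mirror source.limits.retry_attempts and source.limits.retry_backoff_seconds, and treating retry_attempts as the total number of attempts with a fixed backoff between them is an assumption.

```python
import time


def is_transient(status_code: int) -> bool:
    """Hypothetical rule: 429 and 5xx may succeed on retry; other 4xx will not."""
    return status_code == 429 or 500 <= status_code <= 599


def call_with_retry(fetch, retry_attempts=3, retry_backoff_seconds=2.0, sleep=time.sleep):
    """Call fetch() until it returns a non-transient status or attempts run out.

    fetch is any zero-argument callable returning (status_code, body).
    A TimeoutError raised by fetch is treated as transient.
    Assumption: retry_attempts counts total attempts, with a fixed backoff.
    """
    last = None
    for attempt in range(1, retry_attempts + 1):
        try:
            status, body = fetch()
        except TimeoutError as exc:
            last = ("timeout", str(exc))
        else:
            if not is_transient(status):
                # Success or deterministic failure (for example 401/403): stop here.
                return status, body
            last = (status, body)
        if attempt < retry_attempts:
            sleep(retry_backoff_seconds)
    # Every attempt was transient; return the last observed outcome.
    return last
```

Under these assumptions a 403 returns after a single call, while a 429 is retried until the attempts are exhausted. That is why raising retry_attempts helps with rate limits but never with authorization errors.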
Driver-download connector exceeded limits
Inspect source_limits_json and source_metrics_json. If the source regularly exceeds max_bytes, max_files or max_records, land the feed in object storage and ingest it with file connectors or Auto Loader.
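A quick way to see which limit was hit is to compare the two JSON columns side by side. The sketch below assumes both columns hold flat JSON objects and that max_bytes, max_files and max_records pair with bytes_read, files_read and records_read; that pairing and the helper name exceeded_limits are assumptions, so adjust them to what the columns actually contain.

```python
import json

# Assumed pairing of limit names to metric names; the real JSON layout may differ.
LIMIT_TO_METRIC = {
    "max_bytes": "bytes_read",
    "max_files": "files_read",
    "max_records": "records_read",
}


def exceeded_limits(source_limits_json, source_metrics_json):
    """Return {limit_name: (observed, limit)} for every limit that was exceeded."""
    limits = json.loads(source_limits_json or "{}")
    metrics = json.loads(source_metrics_json or "{}")
    exceeded = {}
    for limit_name, metric_name in LIMIT_TO_METRIC.items():
        limit = limits.get(limit_name)
        observed = metrics.get(metric_name)
        # Skip pairs where the limit is unset or the metric was not recorded.
        if limit is None or observed is None:
            continue
        if observed > limit:
            exceeded[limit_name] = (observed, limit)
    return exceeded
```

For example, exceeded_limits('{"max_bytes": 1000, "max_records": 50}', '{"bytes_read": 2048, "records_read": 10}') reports only max_bytes. If the same limit shows up across many runs, that points to moving the feed to object storage rather than raising the limit.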
Schema policy blocked the run
Inspect ctrl_ingestion_schema_changes and decide whether the change is expected. Do not loosen schema policy blindly.
Quality failed
Check ctrl_ingestion_quality and ctrl_ingestion_quarantine. Abort-only rules cannot be quarantined.
Watermark did not advance
Skipped and failed runs do not update state, so the watermark stays at its previous value. Empty batches also preserve the previous state.
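The rule can be written down as a small decision function. This is a sketch of the behavior described here, not the actual state-handling code: the function name next_watermark, the literal status strings and the use of None for an empty batch are all assumptions.

```python
from typing import Optional


def next_watermark(status: str, previous: Optional[str], batch_max: Optional[str]) -> Optional[str]:
    """Return the watermark to store after a run.

    status    -- final run status (the strings used here are assumed values)
    previous  -- watermark stored before the run, or None if there was none
    batch_max -- highest watermark value seen in the batch, or None if it was empty
    """
    # Skipped and failed runs never update state.
    if status in ("skipped", "failed"):
        return previous
    # An empty batch has nothing to advance to, so the previous state is kept.
    if batch_max is None:
        return previous
    return batch_max
```

When a watermark looks stuck, check the run status first and the batch row count second. Either one can explain an unchanged value without any defect in the contract.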
Serverless cannot access storage
Use Unity Catalog External Locations or workspace network policy. Direct credential setup is usually a classic-cluster pattern.
Connector troubleshooting fields
Connector metadata is persisted with secrets redacted. Start with these columns before changing a contract:
SELECT
  run_id,
  source_connector,
  source_type,
  source_format,
  source_path,
  source_options_json,
  source_read_json,
  source_auth_json,
  source_limits_json,
  source_metrics_json,
  error_message
FROM ops.ctrl_ingestion_runs
WHERE run_id = '<run_id>';
Use source_metrics_json to distinguish connector behavior from Spark write behavior. Fields such as runtime_strategy, read_strategy, bytes_read, records_read, files_read, pages_read and materialization_strategy are present only when the connector can compute them without an extra expensive scan.
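Because these fields are optional, a missing field should be read as "not computed" rather than as zero. The sketch below separates the two cases; it assumes source_metrics_json is a flat JSON object, and the helper name summarize_metrics and the sample value "paged" are hypothetical.

```python
import json

# Field names taken from the paragraph above; any of them may be absent.
METRIC_FIELDS = (
    "runtime_strategy",
    "read_strategy",
    "bytes_read",
    "records_read",
    "files_read",
    "pages_read",
    "materialization_strategy",
)


def summarize_metrics(source_metrics_json):
    """Split the known metric fields into (present values, missing names)."""
    metrics = json.loads(source_metrics_json) if source_metrics_json else {}
    present = {name: metrics[name] for name in METRIC_FIELDS if name in metrics}
    missing = [name for name in METRIC_FIELDS if name not in metrics]
    return present, missing


# Usage with a made-up payload: only two fields were recorded.
present, missing = summarize_metrics('{"read_strategy": "paged", "pages_read": 4}')
print(present)
print(missing)
```

If records_read is present and matches expectations while the run still failed, the problem more likely sits on the Spark write side than in the connector.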
Operator checklist
- Identify run_id and final status.
- Read the short error_message in ctrl_ingestion_runs.
- Open the full redacted stack trace in ctrl_ingestion_errors.
- Check source metadata and runtime type.
- Check quality and schema control tables.
- Only then rerun, repair or change the contract.