Core behavior
Watermarks and backfill.
Watermarks persist incremental progress. Backfill and catchup windows let users process historical ranges without manually editing state.
Simple and composite watermarks
watermark_columns: [updated_at]
dedup_order_expr: "updated_at DESC NULLS LAST"result = ingest(
...,
watermark_columns=["updated_at"],
dedup_order_expr="updated_at DESC NULLS LAST",
)watermark_columns: [event_date, event_sequence]
dedup_order_expr: "event_date DESC NULLS LAST, event_sequence DESC NULLS LAST"result = ingest(
...,
watermark_columns=["event_date", "event_sequence"],
dedup_order_expr="event_date DESC NULLS LAST, event_sequence DESC NULLS LAST",
)State behavior
- Previous state is loaded from
ctrl_ingestion_state. - The watermark filter is applied before write.
- The current watermark is computed after preparation and quality.
- State is not advanced for failed or skipped executions.
- Watermark values are encoded with type information to preserve comparisons.
Backfill windows
execution:
window:
column: updated_at
start: "2026-01-01T00:00:00Z"
end: "2026-02-01T00:00:00Z"
interval: 1 day
label_format: "%Y-%m-%d"result = ingest(
...,
execution={
"window": {
"column": "updated_at",
"start": "2026-01-01T00:00:00Z",
"end": "2026-02-01T00:00:00Z",
"interval": "1 day",
"label_format": "%Y-%m-%d",
}
},
)Catchup from state
Catchup uses the saved watermark as the start point when no explicit start is supplied. Each generated window becomes a child execution with its own idempotency suffix.
execution:
catchup:
enabled: true
column: updated_at
end: "2026-05-01T00:00:00Z"
interval: 1 daySnapshot modes are not incremental
Do not combine snapshot_soft_delete with watermarks or filters. A complete snapshot is required to know which keys disappeared.