Governed Execution for Data Systems

CFA v2

Agentic systems jump from prompt to action. CFA puts governance, validation, and state between those two points -- formalizing intent into a typed contract before any execution happens.

Python 3.11+ | 203 tests | MIT license

Architectural Reference

Whitepaper

Full specification: invariants, components, fault model, execution flow, and formal scope for CFA v2.

Open English whitepaper ->

Problem

3 gaps in agents and skills

Gap 1

Silent ambiguity

The model misinterprets and executes confidently. CFA formalizes intent into a StateSignature before acting.

Gap 2

No governance

Skills run without checking PII, cost, or schema. The PolicyEngine evaluates declarative rules before execution.

Gap 3

No state model

Nobody knows what state the data ended up in. The ContextRegistry projects and persists state after each execution.

Modular Architecture

Use only what you need

Each module works independently. The full pipeline orchestrates all three together, but it is not required to get started.

cfa.governance

Governance

Validates operations against 7 declarative rules. No LLM, no Spark, no infrastructure needed. Pluggable into Airflow, Dagster, or any script.
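As a rough illustration of what declarative rules evaluated without an LLM or infrastructure can look like, the sketch below checks an operation against three rules covering PII, cost, and schema. The rule names, the `Operation` fields, the PII column list, and the 50.0 budget are all assumptions made for this example; they are not the seven rules shipped in `cfa.governance`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Operation:
    """Hypothetical description of a requested operation."""
    target: str
    columns: tuple[str, ...]
    estimated_cost_usd: float
    expected_schema: dict[str, str]
    actual_schema: dict[str, str]


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[Operation], bool]  # True means the operation complies


PII_COLUMNS = {"email", "ssn", "phone"}  # illustrative list, not from CFA

RULES = [
    Rule("no_pii_columns", lambda op: not PII_COLUMNS & set(op.columns)),
    Rule("cost_under_budget", lambda op: op.estimated_cost_usd <= 50.0),
    Rule("schema_matches", lambda op: op.expected_schema == op.actual_schema),
]


def evaluate(op: Operation, rules: list[Rule] = RULES) -> list[str]:
    """Return the names of violated rules; an empty list means compliant."""
    return [rule.name for rule in rules if not rule.check(op)]
```

Because the rules are plain data plus pure functions, a check like `evaluate(op)` could run as a pre-step in an Airflow or Dagster task, or in any script, before the operation itself is executed.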

cfa.resolution

Semantic Resolution

Turns natural language into a typed contract (StateSignature). Escalates to human approval when risk is high.
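The shape of such a typed contract might look like the sketch below. The real StateSignature fields are defined in the whitepaper and may differ; `IntentSignature`, `Risk`, and `needs_human_approval` are hypothetical names used only to show how a risk level can gate escalation.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class IntentSignature:
    """Hypothetical typed contract derived from a natural-language request."""
    action: str                 # e.g. "aggregate" or "delete"
    dataset: str
    filters: tuple[str, ...]
    risk: Risk


def needs_human_approval(sig: IntentSignature, threshold: Risk = Risk.HIGH) -> bool:
    # Escalate when the signature's risk reaches the configured threshold.
    return sig.risk.value >= threshold.value
```

A request such as "delete last year's orders" would resolve to something like `IntentSignature("delete", "orders", ("year = 2023",), Risk.HIGH)`, which this sketch routes to a human before anything runs.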

cfa.lifecycle

Lifecycle

Monitors the health of recurring pipelines with 4 quantitative indices. Promotes, demotes, or retires each pipeline based on that evidence.

Full Pipeline

13 governed stages

intent -> normalization -> confirmation -> policy -> planning -> codegen -> static validation -> sandbox -> runtime validation -> partial execution -> state projection -> audit -> lifecycle
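One way to picture the stage sequence above is as an ordered list that a small runner walks through, stopping as soon as a stage halts the run. This is only a sketch: the stage names come from the list above, but `run_pipeline`, the handler signature, and the `halted_by` key are assumptions, not the CFA orchestrator.

```python
from typing import Callable

# Stage names taken from the pipeline description; order matters.
STAGES = [
    "intent", "normalization", "confirmation", "policy", "planning",
    "codegen", "static validation", "sandbox", "runtime validation",
    "partial execution", "state projection", "audit", "lifecycle",
]

Handler = Callable[[dict], dict]  # takes the context, returns the updated context


def run_pipeline(context: dict, handlers: dict[str, Handler]) -> dict:
    """Run stages in order; stop after any stage that sets 'halted_by'."""
    ctx = {**context, "trace": []}
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is not None:  # stages without a handler are pass-through
            ctx = handler(ctx)
        ctx["trace"].append(stage)
        if ctx.get("halted_by"):
            break
    return ctx
```

In this sketch, a policy handler that returns `{**ctx, "halted_by": "policy"}` ends the run at the fourth stage, so planning, code generation, and execution never start.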

Before execution

Contract + Policy

Intent becomes a typed signature, is confirmed according to its risk level, and is evaluated against policy rules. The plan and code are then generated and statically validated.

After execution

Validation + State

The sandbox collects metrics, runtime validation checks limits, partial failures are handled by an explicit policy, and the result is projected into the environment state.

Lifecycle

4 health indices

Each pipeline accumulates quantitative evidence. Promotion and demotion are automatic decisions based on these indices.

IFo

Operational fluidity: latency, cost, success rate

IFs

Semantic fidelity: schema match, drift absence, fault-free rate

IFg

Governance: binary -- 1 if compliant, 0 on any violation

IDI

Intent drift: proportion of replans in a 30-day window

Promotion gate: IFo >= 0.75 AND IFs >= 0.90 AND IFg = 1
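The promotion gate translates directly into a boolean check. The sketch below uses the thresholds stated above; the `HealthIndices` container and function names are hypothetical, and computing IDI as replans divided by total runs in the window is an assumption about what "proportion of replans" means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthIndices:
    ifo: float  # operational fluidity, 0..1
    ifs: float  # semantic fidelity, 0..1
    ifg: int    # governance: 1 if compliant, 0 on any violation
    idi: float  # intent drift, 0..1


def intent_drift(replans: int, runs: int) -> float:
    """Assumed definition: share of runs in the 30-day window that were replans."""
    return replans / runs if runs else 0.0


def passes_promotion_gate(h: HealthIndices) -> bool:
    # IFo >= 0.75 AND IFs >= 0.90 AND IFg = 1, as stated in the gate above.
    return h.ifo >= 0.75 and h.ifs >= 0.90 and h.ifg == 1
```

Because IFg is binary, a single governance violation blocks promotion regardless of how strong the other two indices are.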