Framework-agnostic, low-latency reliability infrastructure for autonomous AI agents. Real-time cost governance, loop detection, and probabilistic replay. The architecture runs in-process, with no proxies and no vendor lock-in.
Traditional observability fails because AI is non-deterministic.
One agent stuck in a semantic infinite loop exhausted a $47K monthly LLM budget in 11 days, undetected until the invoice arrived.
The agent returns HTTP 200 OK, but the business outcome is zero or wrong. Non-determinism makes single-trace debugging meaningless. Engineers spend 40-60% of their time debugging with no systematic solution.
54% of orgs use 11+ observability tools. Proxy-based approaches add latency and introduce new failure modes (CVE-2025-66405: SSRF in proxy-based tracing).
67% of organizations report gains from AI agent pilots, but only 10% successfully scale to production. The primary reason: inability to monitor and debug agent behavior at scale.
Built on a shared open-source core: kazenai-core
Both products integrate in minutes and share data with zero reconfiguration.
The only pre-emptive cost circuit-breaker that pauses agents with full state preservation instead of killing them. Integrates in 3 lines of code.
Current: $1.23 / Projected: $8.40 / Budget: $5.00
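The pause-instead-of-kill behaviour can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `BudgetBreaker`, its fields, and the in-memory snapshot store are hypothetical names, not the Agent FinOps API. The idea shown is only that the breaker trips on the projected cost, not the current spend, and keeps the agent state so the run can resume.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class BudgetBreaker:
    """Hypothetical pre-emptive breaker: trips on projected cost, not current spend."""
    budget_usd: float
    spent_usd: float = 0.0
    paused: bool = False
    snapshots: dict = field(default_factory=dict)  # resume token -> serialized state

    def record(self, step_cost_usd, projected_total_usd, agent_state):
        """Add one step's cost. If the projection breaches the budget, pause the
        agent, snapshot its state, and return a resume token; otherwise return None."""
        self.spent_usd += step_cost_usd
        if projected_total_usd > self.budget_usd and not self.paused:
            self.paused = True
            token = uuid.uuid4().hex
            self.snapshots[token] = json.dumps(agent_state)
            return token
        return None

    def resume(self, token, new_budget_usd):
        """Raise the budget and hand back the preserved state for the given token."""
        self.budget_usd = new_budget_usd
        self.paused = False
        return json.loads(self.snapshots.pop(token))


# Mirrors the figures above: $1.23 spent, $8.40 projected, $5.00 budget.
breaker = BudgetBreaker(budget_usd=5.00)
token = breaker.record(1.23, 8.40, {"memory": ["..."], "tool_history": ["search"]})
```

With these numbers the breaker pauses while actual spend is still well under budget, which is the point of a pre-emptive check. A production version would persist the snapshot to a durable backend instead of a dict.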
Debugging an AI agent is a statistical problem, not a software one. AgentLens is the only platform that gives you the distribution of outcomes rather than a single trace.
Every architectural decision was made to solve a specific failure mode that existing tools cannot, not to add features to what already exists.
Traces every dollar of spend back to the originating user request across orchestrator and sub-agent trees. Produces true per-request P&L, enabling pricing decisions and identifying the most expensive query patterns.
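As a rough illustration of per-request attribution, the sketch below walks each span up its parent chain to the root request and sums cost there. The `Span` record and its field names are assumptions for this example; the real trace schema is not described in this document.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical trace record: one LLM or tool call inside an agent tree."""
    span_id: str
    parent_id: Optional[str]  # None marks the root, i.e. the user request
    cost_usd: float


def cost_per_request(spans: List[Span]) -> Dict[str, float]:
    """Attribute every span's cost to the root request it descends from."""
    by_id = {s.span_id: s for s in spans}
    totals = defaultdict(float)
    for s in spans:
        root = s
        while root.parent_id is not None:  # climb orchestrator / sub-agent chain
            root = by_id[root.parent_id]
        totals[root.span_id] += s.cost_usd
    return dict(totals)


spans = [
    Span("req-1", None, 0.00),
    Span("orchestrator", "req-1", 0.10),
    Span("sub-agent", "orchestrator", 0.25),
    Span("req-2", None, 0.05),
]
totals = cost_per_request(spans)
```

Subtracting each total from the revenue attributed to the same request would give the per-request margin.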
kazenai-core uses monkey-patching on framework callback systems: in-process, same thread, no network hop. Portkey's SSRF vulnerability (CVE-2025-66405) proves why proxy-based approaches are a liability, not a feature.
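The in-process interception pattern can be sketched in a few lines. This is a generic illustration of monkey-patching, not kazenai-core's code: `FakeLLMClient` is a stand-in class rather than a real framework, and `instrument` is a hypothetical helper.

```python
import functools
import time


class FakeLLMClient:
    """Stand-in for a framework or SDK class; not a real library."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def instrument(cls, method_name: str, sink: list) -> None:
    """Replace cls.method_name with a wrapper that records each call.

    The wrapper runs in the caller's thread and process, so no network hop
    is added between the agent and the model client.
    """
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return original(self, *args, **kwargs)
        finally:
            # Recorded even if the wrapped call raises.
            sink.append({"method": method_name,
                         "seconds": time.perf_counter() - start})

    setattr(cls, method_name, wrapper)


events = []
instrument(FakeLLMClient, "complete", events)
reply = FakeLLMClient().complete("hello")
```

Because the patch wraps the original method and returns its result unchanged, calling code does not need to be modified.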
An adapted Deep Isolation Forest algorithm processes cost time-series in real time. A loop produces a distinctive signature: cost keeps rising while the cosine similarity between consecutive tool outputs approaches 1.0. The detector fires independently of the budget threshold.
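The loop signature itself can be shown with a much simpler heuristic than the detector described above. The sketch below is not a Deep Isolation Forest: it swaps in a plain threshold rule over bag-of-words cosine similarity, and the function names, window size, and threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between two texts using word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def looks_like_loop(outputs, step_costs, window=3, sim_threshold=0.95):
    """True if the last `window` consecutive output pairs are near-identical
    while every one of those steps still added cost."""
    if len(outputs) < window + 1 or len(step_costs) < window + 1:
        return False
    recent = outputs[-(window + 1):]
    similar = all(cosine(x, y) >= sim_threshold
                  for x, y in zip(recent, recent[1:]))
    still_spending = all(c > 0 for c in step_costs[-window:])
    return similar and still_spending


stuck = looks_like_loop(["no results found for query"] * 4, [0.02] * 4)
healthy = looks_like_loop(["found doc A", "parsed table", "wrote summary", "done"],
                          [0.02] * 4)
```

Note that the check never looks at a budget figure, which matches the claim that loop detection is independent of the budget threshold. A real detector would likely compare embeddings instead of word counts.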
Re-runs any production trace N times to produce tool selection distributions, outcome variance scores, failure rates, cost distributions (p95/p99), and step-level entropy. As of March 2026, no competitor has shipped this.
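To make the replay statistics concrete, here is a small sketch that summarises a batch of replays. The replay record format (`tools`, `ok`, `cost`) and the function names are assumptions for this example, and the percentile uses the nearest-rank method, which may differ from what AgentLens computes.

```python
import math
from collections import Counter


def step_entropy(tool_choices):
    """Shannon entropy in bits of the tools chosen at one step across replays."""
    counts = Counter(tool_choices)
    n = len(tool_choices)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def summarize(replays):
    """replays: list of dicts with 'tools' (list of str), 'ok' (bool), 'cost' (float)."""
    n = len(replays)
    depth = max(len(r["tools"]) for r in replays)
    entropies = []
    for i in range(depth):
        # Replays that ended before step i are counted under a "<none>" bucket.
        choices = [r["tools"][i] if i < len(r["tools"]) else "<none>"
                   for r in replays]
        entropies.append(step_entropy(choices))
    costs = sorted(r["cost"] for r in replays)
    p95 = costs[min(n - 1, math.ceil(0.95 * n) - 1)]  # nearest-rank percentile
    return {
        "failure_rate": sum(not r["ok"] for r in replays) / n,
        "step_entropy_bits": entropies,
        "cost_p95": p95,
    }


replays = [
    {"tools": ["search", "answer"], "ok": True, "cost": 0.10},
    {"tools": ["search", "answer"], "ok": True, "cost": 0.20},
    {"tools": ["search", "lookup"], "ok": True, "cost": 0.30},
    {"tools": ["search", "lookup"], "ok": False, "cost": 0.40},
]
summary = summarize(replays)
```

A step with entropy near zero means every replay chose the same tool there; high entropy marks the step where the agent's behaviour diverges.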
No data leaves your infrastructure by default. Local SQLite for dev, DynamoDB/Postgres for production. Privacy Vault built on AWS Nitro Enclaves enables healthcare, financial services, and legal AI (SOC 2, HIPAA BAA roadmap).
Both products are built on kazenai-core (Apache 2.0 licensed, open source).
Install Agent FinOps and get AgentLens trace capture free on upgrade.
Future products (AgentGuard, AgentSim) build on the same base.
Every layer is designed for production AI systems, not retrofitted from traditional APM.
| Layer | Component | Technology | Notes |
|---|---|---|---|
| Framework Hooks | kazenai-core interceptor | Python 3.10+, in-process | Monkey-patches framework callbacks. Supports LangGraph, AutoGen, CrewAI, raw OpenAI SDK. |
| Cost Engine | TokenCostEngine | Real-time pricing table | Token counting + cost calculation. Pricing table updated weekly for all major model providers. |
| Trajectory Model | CostTrajectoryPredictor | Exponential smoothing | Recursion-depth estimator detects growing fan-out. Runs O(1) per step. No external API calls. |
| Loop Detector | StreamingIsolationForest | Deep IF (MSc-derived) | Adapted for streaming cost time-series. Flags anomalous patterns in real time, independent of budget threshold. |
| Circuit Breaker | AgentStateSerializer | Pluggable backends | Serialises agent memory + tool history to local / S3 / DynamoDB. Issues resume token. Sends webhook. |
| Drift Monitor | SemanticDriftMonitor | MiniLM-L6, KS-test + KL-divergence | Baseline updates async. Statistical tests for distribution comparison. Catches model updates and prompt injection. |
| Replay Engine | ProbabilisticReplayEngine | Async Python + Ray (optional) | Single-machine mode for small N. Scales to 500 replays. Statistical analysis: NumPy + PyTorch (CPU-only). |
| Storage | TraceStore / FinOpsStore | Parquet (S3) + DynamoDB | Local SQLite for dev. Pluggable: swap backends without code changes. |
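The Trajectory Model row mentions exponential smoothing with O(1) work per step. A minimal sketch of that idea follows, assuming simple (single) exponential smoothing of per-step cost. The class name, the `alpha` default, and the `remaining_steps` input are hypothetical; the table does not say how CostTrajectoryPredictor estimates the remaining steps.

```python
class SmoothedCostProjector:
    """Hypothetical projector: constant time and memory per update."""

    def __init__(self, alpha: float = 0.5):
        self.alpha = alpha   # weight given to the newest observation
        self.level = None    # smoothed per-step cost
        self.spent = 0.0     # actual spend so far

    def update(self, step_cost: float) -> None:
        """Fold one step's cost into the running totals."""
        self.spent += step_cost
        if self.level is None:
            self.level = step_cost
        else:
            self.level = self.alpha * step_cost + (1 - self.alpha) * self.level

    def projected_total(self, remaining_steps: int) -> float:
        """Spend so far plus the smoothed per-step cost times the steps left."""
        return self.spent + (self.level or 0.0) * remaining_steps


projector = SmoothedCostProjector(alpha=0.5)
for cost in (0.10, 0.30):
    projector.update(cost)
projection = projector.projected_total(remaining_steps=3)
```

Only two floats are stored regardless of run length, which is what keeps each update constant-time and free of external calls.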
Supported Frameworks: Framework-Agnostic by Design
Any Python agent can be wrapped with the @trace decorator or context manager API with zero framework dependency.
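The decorator and context-manager pattern can be sketched with the standard library alone. The names below (`trace`, `trace_span`, `TRACE_LOG`) and the record fields are assumptions for illustration and should not be read as the actual kazenai-core signatures.

```python
import functools
import time
from contextlib import contextmanager

TRACE_LOG = []  # hypothetical in-memory sink; a real SDK would write to a store


@contextmanager
def trace_span(name):
    """Record the duration and any error for the enclosed block."""
    start = time.perf_counter()
    record = {"name": name, "error": None}
    try:
        yield record
    except Exception as exc:
        record["error"] = repr(exc)
        raise  # never swallow the agent's own exceptions
    finally:
        record["seconds"] = time.perf_counter() - start
        TRACE_LOG.append(record)


def trace(fn):
    """Decorator form: wraps any callable in a span named after the function."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with trace_span(fn.__name__):
            return fn(*args, **kwargs)
    return wrapper


@trace
def plan(step: int) -> int:
    return step + 1


result = plan(1)
with trace_span("manual-block") as span:
    pass
```

Neither form imports an agent framework, which is what makes the approach usable with any Python callable.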
Open-source SDK is always free. Upgrade to hosted dashboard, team features, and enterprise support as your agents reach production.
For individual developers and open-source projects.
For small teams and AI startups shipping to production.
For engineering teams with multiple agents in production.
For regulated industries requiring full data sovereignty.
Agent FinOps Pro/Team users receive a 40% discount on the equivalent AgentLens tier. Cost data captured by Agent FinOps imports automatically into AgentLens with zero reconfiguration.
Stop building circuit-breakers manually. Stop debugging single traces. Start shipping agents you can trust at any scale.
Free tier includes unlimited SDK · Apache 2.0 licensed · No credit card required