System Architecture

This page explains Agenta's system architecture: what each component does and how they connect.

System Overview

Agenta uses a microservices architecture deployed as Docker containers. The diagram below shows how the main layers connect.

                  ┌─────────────────────────────────────┐
                  │                Users                │
                  │     (Developers, AI Engineers)      │
                  └──────────────────┬──────────────────┘
                  ┌──────────────────▼──────────────────┐
                  │        Load Balancer / Proxy        │
                  │         (Traefik or Nginx)          │
                  │       Handles SSL and routing       │
                  └──────────────────┬──────────────────┘
         ┌───────────────────────────┼───────────────────────────┐
         │                           │                           │
         ▼                           ▼                           ▼
┌─────────────────┐         ┌─────────────────┐         ┌─────────────────┐
│    Frontend     │         │   API Backend   │         │  Services API   │
│    (Web UI)     │◄────────►    (FastAPI)    │◄────────►    (FastAPI)    │
│                 │         │                 │         │                 │
│ • Next.js App   │         │ • REST API      │         │ • Completion    │
│ • Playground    │         │ • Core logic    │         │ • Chat          │
│ • Admin UI      │         │ • Persistence   │         │ • LLM adapters  │
└─────────────────┘         └────────┬────────┘         └────────┬────────┘
         │                           │                           │
         │                           ▼                           ▼
         │              ┌─────────────────────────┐     ┌─────────────────┐
         │              │       Worker Pool       │     │  runner :8765   │
         │              │   (background procs)    │     │  (agent runs)   │
         │              │ • worker-streams        │     └────────┬────────┘
         │              │   (records/events/      │              │
         │              │   spans)                │              │
         │              │ • worker-queues         │              │
         │              │   (webhooks/triggers/   │              │
         │              │   interactions/evals)   │              │
         │              │ • cron                  │              │
         │              └────────────┬────────────┘              │
         │                           │                           │
         ▼                           ▼                           ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          Infrastructure Layer                           │
│                                                                         │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────┐ ┌──────────┐ │
│ │    PostgreSQL    │ │      Redis       │ │ SuperTokens  │ │seaweedfs │ │
│ │                  │ │                  │ │              │ │  :8333   │ │
│ │ • Core DB        │ │ • Task queues    │ │ • Auth       │ │(bundled  │ │
│ │ • Tracing DB     │ │ • Streams        │ │ • Sessions   │ │or ext S3)│ │
│ │ • Auth DB        │ │ • Caching        │ │              │ │          │ │
│ └──────────────────┘ └──────────────────┘ └──────────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────────────────┘

Frontend Components

Web UI (Next.js Application)

  • Technology: React, TypeScript, Next.js
  • Port: 3000 (internal)
  • Purpose: Primary user interface for the Agenta platform

Key Responsibilities:

  • User Interface: Provides a web interface for application management
  • Playground: Interactive environment for testing and evaluating LLM applications
  • Evaluation Dashboard: Visualizations and metrics for application performance
  • Application Management: Create, configure, and deploy AI applications
  • User Authentication: Login, registration, and session management

Backend Components

API Service (FastAPI)

  • Technology: Python, FastAPI, SQLAlchemy
  • Port: 8000 (internal)
  • Purpose: Core business logic and API endpoints

Key Responsibilities:

  • REST API: Provides RESTful endpoints for frontend and external integrations
  • Business Logic: Implements core platform functionality
  • Data Management: Handles CRUD operations for applications, evaluations, experiments, etc.
  • Authentication: Integrates with SuperTokens for user authentication
  • Application Orchestration: Manages application lifecycle and deployment
  • Evaluation Management: Coordinates evaluation runs and result collection

Worker Services (TaskIQ + Async Consumers)

  • Technology: Python workers, TaskIQ, asyncio consumers, Redis, PostgreSQL
  • Purpose: Background processing for evaluations, tracing, events, and webhooks

Background work runs in two kinds of container, each configured with a list of loops to run. Each kind hosts a family of loops in a single process, so the default deployment is one container of each kind rather than seven separate ones (three stream loops plus four queue loops). An environment selector chooses which loops a container runs; an empty selector runs the whole family.

  • worker-streams — Redis Streams consumers, selected by AGENTA_WORKER_STREAMS (subset of records, events, spans; empty ⇒ all three):
    • Span Ingestion: consumes the OTLP tracing pipeline (streams:spans).
    • Event Processing: processes internal event streams (streams:events).
    • Session Records: persists agent session records (streams:records).
  • worker-queues — TaskIQ queue consumers, selected by AGENTA_WORKER_QUEUES (subset of webhooks, triggers, interactions, evaluations; empty ⇒ all four):
    • Webhook Delivery: dispatches outbound webhook notifications (queues:webhooks).
    • Trigger Processing: processes trigger events for automated workflows (queues:triggers).
    • Interaction Dispatch: dispatches async session interactions (queues:interactions).
    • Evaluation Execution: runs asynchronous evaluation workloads (queues:evaluations).

Stream and queue names, consumer groups, and message shapes are unchanged by this grouping, so scaling a container out simply shares work across its consumer groups. To scale a single hot loop, run a second instance of the same kind with a disjoint selector (for example, AGENTA_WORKER_STREAMS=spans on its own).
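The selector rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Agenta's implementation: the comma-separated selector format and the select_loops helper are hypothetical. Only the environment variable names and the loop names come from the text above.

```python
import os

# Loop families as listed above.
STREAM_LOOPS = ("records", "events", "spans")
QUEUE_LOOPS = ("webhooks", "triggers", "interactions", "evaluations")


def select_loops(raw, family):
    """Return the loops one container should run for a given family.

    Assumption: the selector is a comma-separated list of loop names.
    An empty or missing selector selects the whole family.
    """
    names = [part.strip() for part in (raw or "").split(",") if part.strip()]
    if not names:
        return list(family)
    unknown = [name for name in names if name not in family]
    if unknown:
        raise ValueError(f"unknown loops: {unknown}")
    # Keep the family's own order and drop duplicates.
    return [name for name in family if name in names]


if __name__ == "__main__":
    streams = select_loops(os.environ.get("AGENTA_WORKER_STREAMS"), STREAM_LOOPS)
    queues = select_loops(os.environ.get("AGENTA_WORKER_QUEUES"), QUEUE_LOOPS)
    print("streams:", streams)
    print("queues:", queues)
```

Under this sketch, a dedicated span consumer would run with AGENTA_WORKER_STREAMS=spans while a second worker-streams container runs with the remaining loops, so the two selectors stay disjoint.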

TaskIQ Integration:

  • Broker: Uses Redis Streams for queueing and task distribution
  • Task Registration: Queue tasks are registered at worker startup
  • Execution: Workers consume Redis-backed jobs and process them asynchronously

Agent Runner

  • Technology: Node.js TypeScript sidecar
  • Port: 8765 (internal)
  • Purpose: Executes agent workflows on behalf of the Services API

The runner receives /run requests from the Services API (routed via AGENTA_RUNNER_INTERNAL_URL) and starts harness processes (Pi, Claude Code, or other supported adapters) in local or remote sandboxes. It mounts durable working directories from the store into each sandbox and relays server-side tools back to the Services API without exposing the full stack environment to the harness.
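As a rough sketch of the routing described above, the calling side might build a request like the one below. Only the /run path, port 8765, and the AGENTA_RUNNER_INTERNAL_URL variable come from the text. The fallback host name, the use of POST with a JSON body, the build_run_request helper, and the payload fields are all hypothetical placeholders, not the runner's actual contract.

```python
import json
import os
import urllib.request


def build_run_request(payload, base_url=None):
    """Build (but do not send) a request to the runner's /run endpoint."""
    # Assumption: the fallback host name "runner" is illustrative only.
    base = base_url or os.environ.get(
        "AGENTA_RUNNER_INTERNAL_URL", "http://runner:8765"
    )
    return urllib.request.Request(
        url=base.rstrip("/") + "/run",
        data=json.dumps(payload).encode("utf-8"),  # assumed JSON body
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed HTTP method
    )


if __name__ == "__main__":
    # The payload fields below are made up for illustration.
    request = build_run_request({"harness": "example", "sandbox": "local"})
    print(request.get_method(), request.full_url)
```

Keeping the runner address in one environment variable means the Services API needs no other knowledge of where the sidecar is deployed.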

Sandbox matrix:

  • local — in-process on the runner host; the default for compose and Kubernetes deployments.
  • daytona — a remote Daytona cloud sandbox; requires SANDBOX_AGENT_PROVIDER=daytona on the runner.

See Deploy the agent runner.

Services Backend

Services API (FastAPI)

  • Technology: Python, FastAPI
  • Port: 8080 (internal)
  • Purpose: LLM-facing endpoints and service-layer APIs exposed under /services/*

Key Responsibilities:

  • LLM Integration: Connects to various LLM providers (OpenAI, Anthropic, etc.)
  • Prompt Processing: Handles prompt templates and variable substitution
  • Response Generation: Manages LLM API calls and response handling
  • Provider Abstraction: Unified interface across different LLM providers
  • Error Handling: Handles failures returned by LLM provider APIs
  • Endpoint Groups: Includes /services/completion/* and /services/chat/*

Infrastructure Services

PostgreSQL (Database)

  • Technology: PostgreSQL 17
  • Port: 5432
  • Purpose: Primary data storage

Databases:

  • Core Database: Application data, datasets, evaluations, users and profiles, etc.
  • Tracing Database: Execution traces and performance metrics
  • SuperTokens Database: Authentication and user management data

Redis (Task Queue, Caching & Sessions)

  • Technology: Redis
  • Ports: 6379 (volatile), 6381 (durable)
  • Purpose: Task queue, caching, pub/sub, streams

Use Cases:

  • Task Queue: TaskIQ broker for background job distribution and processing
  • Application Caching: Frequently accessed data
  • Session Storage: User sessions and temporary data
  • Task Results: TaskIQ task results and status
  • Real-time Data: Live updates and notifications
  • Rate Limiting: API rate limit counters

SuperTokens (Authentication)

  • Technology: SuperTokens
  • Port: 3567
  • Purpose: Authentication and user management

Features:

  • User Authentication: Login/logout, password management
  • Session Management: Secure session handling with JWT
  • OAuth Integration: Google and GitHub
  • User Management: User registration, profile management

Durable Store (SeaweedFS / S3)

  • Technology: SeaweedFS (bundled) or any S3-compatible store (AWS S3, Cloudflare R2, MinIO)
  • Port: 8333 (bundled SeaweedFS)
  • Purpose: S3-compatible object store backing durable agent workspaces

Files written during an agent run are stored here and remounted automatically on the next turn, so agent workspaces survive sandbox teardown.

The store.seaweedfs.enabled Helm toggle controls whether the chart bundles a SeaweedFS StatefulSet or points store.endpointUrl at an external store. This mirrors the postgresql.enabled pattern. The endpoint URL is always explicit; a remote S3-compatible store (AWS, MinIO) must set it.

Per-deployment default:

  • Dev compose: SeaweedFS container bundled.
  • Railway: SeaweedFS service and volume (publicly reachable, no tunnel needed).
  • Kubernetes (gh self-host): no bundled SeaweedFS; supply external S3 credentials via store.* values.
  • Kubernetes (operator choice): enable via store.seaweedfs.enabled=true.
  • Live / private cloud: external AWS S3 (store.seaweedfs.enabled=false).

See the Store configuration reference.

Service Dependencies

Frontend Dependencies

Web UI depends on:
├── API Service (primary backend)
├── Services API (playground and model calls)
└── Authentication (SuperTokens via API)

Backend Dependencies

API Service depends on:
├── PostgreSQL (data persistence)
├── Redis (task queue, caching, sessions)
├── SuperTokens (authentication)
└── Worker pool (async task execution)

Services API depends on:
├── PostgreSQL (agent and service state)
├── LLM providers (model calls)
└── runner sidecar (agent workflow execution via AGENTA_RUNNER_INTERNAL_URL)

Worker Dependencies

Worker pool depends on:
├── Redis (queues and streams)
├── PostgreSQL (state and persistence)
├── API backend (coordination and config)
├── worker-streams (records loop: streams:records → session persistence)
└── Services API / external endpoints (workload-specific processing)
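The dependency listings above can be turned into a small lookup that answers "what is affected if this component goes down?". The Python sketch below is illustrative only: the DEPENDS_ON table is transcribed by hand from the listings, with names simplified (for example, "API backend" is treated as the API Service, and the worker-streams entry is left out because it is part of the worker pool). The helper names are hypothetical.

```python
# Direct dependencies, transcribed from the listings above (names simplified).
DEPENDS_ON = {
    "Web UI": ["API Service", "Services API", "SuperTokens"],
    "API Service": ["PostgreSQL", "Redis", "SuperTokens", "Worker pool"],
    "Services API": ["PostgreSQL", "LLM providers", "runner"],
    "Worker pool": ["Redis", "PostgreSQL", "API Service", "Services API"],
}


def dependents_of(component):
    """Return the services that list `component` as a direct dependency."""
    return sorted(name for name, deps in DEPENDS_ON.items() if component in deps)


def affected_by(component):
    """Return every service that depends on `component`, directly or transitively."""
    affected = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for name in dependents_of(current):
            # The visited set also guards against the API Service / Worker pool cycle.
            if name not in affected:
                affected.add(name)
                frontier.append(name)
    return sorted(affected)


if __name__ == "__main__":
    print("Directly depends on Redis:", dependents_of("Redis"))
    print("Affected by a Redis outage:", affected_by("Redis"))
```

Because the API Service and the worker pool list each other, the traversal tracks visited services rather than assuming the graph is acyclic.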