Architecture Overview

Understand the NFYio system architecture: API Gateway, Storage Service, Embedding Service, Agent Runtime, Auth, Database, and Cache.

NFYio is built as a modular, horizontally scalable platform. This document describes the system architecture, components, data flows, and deployment topology.

System Architecture Diagram

                              ┌─────────────────┐
                              │   Clients       │
                              │ (SDK, CLI, UI)  │
                              └────────┬────────┘
                                       │
                    ┌──────────────────┼──────────────────┐
                    │                  │                  │
                    ▼                  ▼                  ▼
            ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
            │ API Gateway  │  │   Storage    │  │    Agent     │
            │  (port 3000) │  │ Proxy (7007) │  │ Service 7010 │
            │              │  │   (S3 API)   │  │  (RAG, LLM)  │
            └──────┬───────┘  └──────┬───────┘  └──────┬───────┘
                   │                 │                  │
                   │    ┌────────────┼────────────┐     │
                   │    │            │            │     │
                   ▼    ▼            ▼            ▼     ▼
            ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
            │   Keycloak   │  │  PostgreSQL  │  │   SeaweedFS  │
            │   (Auth)     │  │  + pgvector  │  │   (Blobs)    │
            └──────────────┘  └──────┬───────┘  └──────────────┘
                                     │
                                     ▼
                            ┌──────────────┐
                            │    Redis     │
                            │ (Cache/Queue)│
                            └──────────────┘

Components

API Gateway

The main entry point for NFYio. Handles:

  • Authentication — JWT validation, session management, MFA
  • Team & workspace management — RBAC, project hierarchy
  • Billing — Stripe, PayPal, usage metering
  • Orchestration — Routes requests to storage, agent, and database services

Built with Deno + TypeScript. Runs on port 3000 by default.

Storage Service (SeaweedFS)

S3-compatible object storage backed by SeaweedFS. Provides:

  • Object CRUD — Put, get, delete, list objects
  • Bucket management — Create, configure, delete buckets
  • Versioning — Optional object versioning
  • Lifecycle — Expiration and transition policies

The storage proxy translates S3 API calls to SeaweedFS operations. Runs on port 7007.

Embedding Service

Part of the gateway and agent stack. Responsible for:

  • Document ingestion — Parse PDFs, docs, images
  • Embedding generation — OpenAI or Voyage AI models
  • Vector storage — pgvector in PostgreSQL for similarity search

Embeddings are generated asynchronously via job queues (Redis) and stored for fast retrieval.
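The asynchronous ingestion path can be pictured with a small in-memory sketch. It is only an illustration: the queue below stands in for Redis, the `embed` callback stands in for an OpenAI or Voyage AI call, and every name (`EmbeddingJob`, `EmbeddingQueue`, `chunkText`) is hypothetical rather than part of NFYio. Chunking by character count is likewise an assumption made for brevity.

```typescript
// Hypothetical job payload: the object that was uploaded and its extracted text.
interface EmbeddingJob {
  objectKey: string;
  text: string;
}

// Hypothetical row shape for what would end up in a pgvector-backed table.
interface StoredChunk {
  objectKey: string;
  chunkIndex: number;
  text: string;
  vector: number[];
}

// Split text into fixed-size chunks (by character count, for simplicity).
function chunkText(text: string, size: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// In-memory FIFO queue; a stand-in for a Redis-backed job queue.
class EmbeddingQueue {
  private jobs: EmbeddingJob[] = [];

  enqueue(job: EmbeddingJob): void {
    this.jobs.push(job);
  }

  pendingCount(): number {
    return this.jobs.length;
  }

  // Process every pending job in arrival order and return the chunks
  // that a real worker would insert into PostgreSQL.
  drain(embed: (text: string) => number[], chunkSize: number): StoredChunk[] {
    const stored: StoredChunk[] = [];
    while (this.jobs.length > 0) {
      const job = this.jobs.shift() as EmbeddingJob;
      chunkText(job.text, chunkSize).forEach((chunk, chunkIndex) => {
        stored.push({
          objectKey: job.objectKey,
          chunkIndex,
          text: chunk,
          vector: embed(chunk),
        });
      });
    }
    return stored;
  }
}
```

Keeping the producer (`enqueue`) separate from the worker (`drain`) mirrors the point of the queue: uploads return quickly while embedding work happens later.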

Agent Runtime

Multi-step agentic workflows with:

  • RAG — Retrieve relevant chunks from embeddings, augment LLM context
  • LLM — GPT-4o and other models via OpenAI-compatible APIs
  • Workflow — LangChain-based pipelines with policy gateway

Runs on port 7010. Communicates with the gateway for auth and with PostgreSQL for embeddings.

Auth Service (Keycloak)

Identity and access management:

  • User management — Registration, profiles, MFA
  • OAuth2/OIDC — JWT issuance, token refresh
  • Realm configuration — Multi-tenant realms, custom themes

NFYio uses Keycloak as the identity provider. All authenticated requests carry a valid JWT.

Database (PostgreSQL + pgvector)

Central data store for:

  • Metadata — Buckets, objects, workspaces, projects, teams
  • Embeddings — Vector representations for semantic search (pgvector)
  • Audit logs — Access and change history
  • Billing — Usage records, subscriptions

PostgreSQL 15+ with the pgvector extension is required.
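As a rough illustration of how a service might query embeddings, the sketch below builds a parameterized similarity query. The pgvector distance operators (`<=>` for cosine distance, `<->` for Euclidean distance, `<#>` for negative inner product) are real; the table name `chunks`, its columns, and the helper names are assumptions made for this example and do not describe the actual NFYio schema.

```typescript
type DistanceMetric = "cosine" | "l2" | "inner_product";

// pgvector distance operators, keyed by metric.
const DISTANCE_OPERATORS: Record<DistanceMetric, string> = {
  cosine: "<=>",
  l2: "<->",
  inner_product: "<#>",
};

// pgvector accepts vectors as text literals such as '[0.1,0.2,0.3]'.
function toVectorLiteral(vector: number[]): string {
  if (vector.length === 0 || vector.some((v) => !isFinite(v))) {
    throw new Error("vector must be non-empty and contain only finite numbers");
  }
  return `[${vector.join(",")}]`;
}

interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// Build a top-k query scoped to one workspace.
// Table and column names are hypothetical.
function buildSimilarityQuery(
  queryVector: number[],
  workspaceId: string,
  topK: number,
  metric: DistanceMetric = "cosine",
): SqlQuery {
  if (!(topK > 0) || Math.floor(topK) !== topK) {
    throw new Error("topK must be a positive integer");
  }
  const op = DISTANCE_OPERATORS[metric];
  return {
    text:
      `SELECT id, content, embedding ${op} $1::vector AS distance ` +
      `FROM chunks WHERE workspace_id = $2 ` +
      `ORDER BY embedding ${op} $1::vector LIMIT $3`,
    values: [toVectorLiteral(queryVector), workspaceId, topK],
  };
}
```

Passing the vector as a bound parameter rather than interpolating it into the SQL text keeps the query safe to build from user-derived input.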

Cache (Redis)

Used for:

  • Sessions — Encrypted session storage
  • Job queues — Embedding jobs, async tasks
  • Caching — Frequently accessed metadata
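Metadata caching of this kind usually follows a cache-aside pattern. The sketch below shows the idea with an in-memory `Map` standing in for Redis; the class name `MetadataCache`, the TTL policy, and the synchronous `load` callback are all assumptions for illustration, not NFYio's implementation.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // milliseconds since epoch
}

// Cache-aside with a fixed time-to-live. The Map is a stand-in for Redis.
class MetadataCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise call `load`
  // (for example, a PostgreSQL lookup) and cache the result.
  getOrLoad(key: string, load: (key: string) => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = load(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Drop an entry after the underlying record changes.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

Calling `invalidate` on writes keeps readers from seeing stale metadata for longer than necessary; the TTL bounds staleness when an invalidation is missed.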

Data Flow

Object Upload Flow

  1. Client sends PUT to storage proxy (S3 API)
  2. Storage proxy validates JWT (via gateway or direct validation)
  3. Proxy writes object to SeaweedFS volume
  4. Metadata (bucket, key, size, checksum) is written to PostgreSQL
  5. If embedding is enabled, a job is queued in Redis for async processing
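The ordering of these steps can be sketched as a single handler with its collaborators injected. Everything here is hypothetical: the `UploadDeps` interface and its method names are invented for the example, and the calls are synchronous only to keep the sketch short (real JWT validation, SeaweedFS writes, PostgreSQL inserts, and Redis enqueues would be asynchronous).

```typescript
interface UploadRequest {
  bucket: string;
  key: string;
  body: Uint8Array;
  token: string;
}

interface ObjectMetadata {
  bucket: string;
  key: string;
  size: number;
  checksum: string;
}

// Hypothetical collaborators standing in for JWT validation,
// SeaweedFS, PostgreSQL, and Redis respectively.
interface UploadDeps {
  validateToken(token: string): boolean;
  writeBlob(bucket: string, key: string, body: Uint8Array): string; // returns a checksum
  saveMetadata(meta: ObjectMetadata): void;
  embeddingEnabled(bucket: string): boolean;
  enqueueEmbedding(meta: ObjectMetadata): void;
}

function handleUpload(req: UploadRequest, deps: UploadDeps): ObjectMetadata {
  // Step 2: reject unauthenticated requests before touching storage.
  if (!deps.validateToken(req.token)) {
    throw new Error("401 Unauthorized");
  }

  // Step 3: write the object bytes to blob storage.
  const checksum = deps.writeBlob(req.bucket, req.key, req.body);

  // Step 4: record metadata only after the blob write succeeded.
  const meta: ObjectMetadata = {
    bucket: req.bucket,
    key: req.key,
    size: req.body.length,
    checksum,
  };
  deps.saveMetadata(meta);

  // Step 5: queue embedding work only where it is enabled.
  if (deps.embeddingEnabled(req.bucket)) {
    deps.enqueueEmbedding(meta);
  }
  return meta;
}
```

Writing the blob before the metadata means a failed upload never leaves a metadata row pointing at missing data, which is one reasonable reading of the step order above.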

RAG Query Flow

  1. Client sends query to agent service
  2. Agent validates JWT and fetches user/workspace context
  3. Query is embedded using OpenAI/Voyage AI
  4. pgvector similarity search returns top-k chunks
  5. Chunks are passed to LLM as context
  6. LLM generates response; agent returns to client
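Steps 4 and 5 can be illustrated with a minimal in-memory sketch. In NFYio the similarity search runs inside PostgreSQL via pgvector; the code below reimplements the ranking in plain TypeScript only to show what "top-k chunks" means. The `Chunk` shape, the function names, and the prompt wording are assumptions for the example.

```typescript
interface Chunk {
  id: string;
  text: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Rank chunks by similarity to the query embedding and keep the best k.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble the retrieved chunks and the user question into one prompt.
function buildPrompt(question: string, context: Chunk[]): string {
  const contextBlock = context
    .map((c, i) => `[${i + 1}] ${c.text}`)
    .join("\n");
  return (
    "Answer using only the context below.\n\n" +
    `Context:\n${contextBlock}\n\n` +
    `Question: ${question}`
  );
}
```

Numbering the chunks in the prompt makes it easier for the model to cite which passage an answer came from, though whether NFYio does this is not stated in this document.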

Authentication Flow

  1. User logs in via Keycloak (OAuth2/OIDC)
  2. Keycloak issues JWT
  3. Client includes JWT in Authorization header
  4. Gateway and services validate JWT and extract user/role
  5. PostgreSQL row-level security (RLS) and application logic enforce access control
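Step 4 can be sketched as a check on already-decoded claims. This example assumes the token's signature has been verified elsewhere (it does no cryptographic validation itself) and that roles appear under `realm_access.roles`, which is where Keycloak places realm roles in its access tokens. The `authorize` function and `AuthResult` type are hypothetical names.

```typescript
// Subset of the claims in a Keycloak-issued access token.
interface TokenClaims {
  sub: string; // user ID
  exp: number; // expiry, in seconds since the Unix epoch
  realm_access?: { roles: string[] };
}

type AuthResult =
  | { ok: true; userId: string; roles: string[] }
  | { ok: false; status: 401 | 403; reason: string };

// Assumes the signature was already verified; this only checks
// expiry and role membership on the decoded claims.
function authorize(
  claims: TokenClaims,
  requiredRole: string,
  nowSeconds: number,
): AuthResult {
  if (claims.exp <= nowSeconds) {
    return { ok: false, status: 401, reason: "token expired" };
  }
  const roles = claims.realm_access?.roles ?? [];
  if (roles.indexOf(requiredRole) === -1) {
    return { ok: false, status: 403, reason: `missing role: ${requiredRole}` };
  }
  return { ok: true, userId: claims.sub, roles };
}
```

Distinguishing 401 (no valid identity) from 403 (valid identity, insufficient role) lets clients decide whether to refresh the token or stop retrying.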

Deployment Topology

Single-Node (Development)

All services run on one host via Docker Compose. Suitable for development and testing.

Multi-Node (Production)

For production, components can be distributed:

  • Gateway — Multiple replicas behind a load balancer
  • Storage proxy — Multiple replicas; SeaweedFS handles distribution
  • Agent service — Scale horizontally for RAG/LLM load
  • PostgreSQL — Primary + replicas for read scaling
  • Redis — Cluster or Sentinel for HA
  • SeaweedFS — Multiple masters and volumes for HA and capacity
  • Keycloak — Clustered for HA

Kubernetes

Helm charts support:

  • Horizontal Pod Autoscaling for gateway, storage, and agent
  • Persistent volumes for PostgreSQL and SeaweedFS
  • Ingress for TLS termination and routing
  • ConfigMaps and Secrets for configuration

Technology Stack Summary

Component        Technology
Runtime          Deno + TypeScript
Main DB          PostgreSQL + pgvector
Session store    Redis
Blob storage     SeaweedFS
Authentication   Keycloak
AI pipeline      LangChain + OpenAI / Voyage AI
Frontend         Next.js (dashboard) + Astro (marketing)