Core Concepts
Learn NFYio core concepts: buckets, objects, access keys, agents, embeddings, VPCs, workspaces, and roles.
This document introduces the key concepts in NFYio: storage primitives, AI agents, embeddings, networking, and access control.
Buckets
A bucket is a top-level container for objects. It is the S3-compatible equivalent of a folder or namespace.
- Bucket names must be unique across your entire NFYio deployment, not just within a workspace or project
- You can create buckets via the API, AWS CLI, or dashboard
- Buckets support versioning, lifecycle policies, and access control
# Create a bucket
aws --endpoint-url http://localhost:7007 s3 mb s3://my-documents
# List buckets
aws --endpoint-url http://localhost:7007 s3 ls
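Because the API is S3-compatible, the same bucket operations can in principle be driven from an AWS SDK. The sketch below assumes boto3 is installed and that your deployment accepts the standard S3 versioning and lifecycle calls, which this page does not confirm in detail. The function name, the `tmp/` prefix, and the 30-day expiry are illustrative.

```python
def configure_bucket(s3, bucket: str) -> None:
    """Create a bucket, then enable versioning and one lifecycle rule.

    `s3` is expected to be a boto3 S3 client, for example:
        boto3.client("s3", endpoint_url="http://localhost:7007")
    """
    s3.create_bucket(Bucket=bucket)

    # Keep prior versions of overwritten or deleted objects.
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Illustrative rule (assumption): expire objects under tmp/ after 30 days.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-tmp",
                    "Status": "Enabled",
                    "Filter": {"Prefix": "tmp/"},
                    "Expiration": {"Days": 30},
                }
            ]
        },
    )
```

Passing the client in as a parameter keeps the function independent of how credentials and the endpoint are configured.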
Objects
An object is a file stored in a bucket. Each object has:
- Key — Path-like identifier (e.g., documents/report.pdf)
- Size — Size in bytes
- Content-Type — MIME type
- Metadata — Custom key-value pairs
- Version ID — If versioning is enabled
Object data is stored in SeaweedFS; object metadata is stored in PostgreSQL.
# Upload an object
aws --endpoint-url http://localhost:7007 s3 cp report.pdf s3://my-documents/reports/
# Download an object
aws --endpoint-url http://localhost:7007 s3 cp s3://my-documents/reports/report.pdf ./
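To make the attribute list concrete, here is a minimal sketch of an object as a plain record. `ObjectRecord` and `describe` are hypothetical names for illustration only and do not reflect NFYio's actual metadata schema.

```python
import mimetypes
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ObjectRecord:
    """Hypothetical in-memory view of the attributes an object carries."""
    key: str                                       # path-like identifier within the bucket
    size: int                                      # size in bytes
    content_type: str                              # MIME type
    metadata: dict = field(default_factory=dict)   # custom key-value pairs
    version_id: Optional[str] = None               # set only when versioning is enabled


def describe(key: str, body: bytes, **metadata: str) -> ObjectRecord:
    """Build a record for `body` as it would be stored under `key`."""
    guessed, _ = mimetypes.guess_type(key)
    return ObjectRecord(
        key=key,
        size=len(body),
        content_type=guessed or "application/octet-stream",
        metadata=dict(metadata),
    )


record = describe("reports/report.pdf", b"%PDF-1.7 example", department="finance")
print(record)
```

Note that the slashes in a key are part of the name: in S3-style storage there are no real directories, only keys that share a prefix.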
Access Keys
Access keys are credentials for programmatic access to NFYio storage. Each key has:
- Access Key ID — Public identifier
- Secret Access Key — Private secret (store securely)
Use them with AWS SDKs or the aws CLI by configuring the endpoint and credentials. Each key is scoped to a workspace or project, and requests made with it are subject to role-based access control (RBAC).
# Configure AWS CLI for NFYio
aws configure set aws_access_key_id YOUR_ACCESS_KEY
aws configure set aws_secret_access_key YOUR_SECRET_KEY
aws configure set default.region us-east-1
# Use with endpoint
aws --endpoint-url http://localhost:7007 s3 ls
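For SDK use, the same three settings (endpoint, key ID, secret) are passed to the client. The sketch below reads the standard AWS environment variables so the secret stays out of source code. `nfyio_client_kwargs` is a hypothetical helper; the keyword names are the ones `boto3.client` accepts.

```python
import os


def nfyio_client_kwargs(env=os.environ, endpoint_url="http://localhost:7007"):
    """Collect keyword arguments for boto3.client("s3", **kwargs).

    Raises RuntimeError if a credential variable is missing.
    """
    try:
        return {
            "endpoint_url": endpoint_url,
            "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
            "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
            "region_name": env.get("AWS_DEFAULT_REGION", "us-east-1"),
        }
    except KeyError as missing:
        raise RuntimeError(f"missing credential variable: {missing}") from None


# Usage (requires `pip install boto3` and a reachable endpoint):
#   import boto3
#   s3 = boto3.client("s3", **nfyio_client_kwargs())
#   print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```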
Agents
Agents are AI-powered components that process data and respond to queries.
RAG Agent
Retrieval-Augmented Generation — Ingest documents, generate embeddings, and answer questions using your data.
- Documents are chunked and embedded
- Queries are embedded and matched via vector similarity
- Relevant chunks are passed to the LLM as context
- The LLM generates answers grounded in your corpus
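The four steps above can be sketched end to end in a few lines. This is a toy illustration, not NFYio's pipeline: word counts stand in for a real embedding model, the chunk size is measured in words, and the final prompt is returned rather than sent to an LLM. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words, overlapping by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Retrieve the k most similar chunks and place them before the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    context = "\n".join(f"- {c}" for c in ranked[:k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


doc = (
    "Invoices are due within thirty days of receipt. "
    "Late payments incur a two percent monthly fee. "
    "Refunds are processed within five business days."
)
print(build_prompt("What fee applies to late payments?", chunk(doc), k=1))
```

The overlap between consecutive chunks reduces the chance that a sentence split across a chunk boundary is lost to retrieval.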
LLM Agent
Large Language Model — Direct interaction with models (e.g., GPT-4o) for chat, summarization, or generation. Can be combined with RAG for knowledge-grounded responses.
Workflow Agent
Multi-step workflows — Chain multiple steps (retrieve → reason → act) using LangChain. Supports tools, policy checks, and branching logic.
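The retrieve → reason → act pattern can be illustrated without any framework. The sketch below is plain Python rather than LangChain, and every step is a stub: `retrieve` stands in for a vector search, `reason` for an LLM call, and `policy_check` for whatever policy engine a deployment uses. All names are hypothetical.

```python
from typing import Callable

State = dict
Step = Callable[[State], State]


def retrieve(state: State) -> State:
    # Stand-in for a vector search over the workspace's documents.
    corpus = {"refund": "Refunds are processed within five business days."}
    hits = [text for word, text in corpus.items() if word in state["query"].lower()]
    return {**state, "context": hits}


def reason(state: State) -> State:
    # Stand-in for an LLM call: branch on whether retrieval found anything.
    action = "reply" if state["context"] else "escalate"
    return {**state, "action": action}


def policy_check(state: State) -> State:
    # Block any action that is not explicitly allowed.
    if state["action"] not in {"reply", "escalate"}:
        raise PermissionError(f"action not permitted: {state['action']}")
    return state


def act(state: State) -> State:
    if state["action"] == "reply":
        return {**state, "result": state["context"][0]}
    return {**state, "result": "Forwarded to a human reviewer."}


def run(query: str, steps: list[Step]) -> State:
    """Thread a state dict through each step in order."""
    state: State = {"query": query}
    for step in steps:
        state = step(state)
    return state


PIPELINE = [retrieve, reason, policy_check, act]
print(run("How long does a refund take?", PIPELINE)["result"])
```

Placing the policy check between reasoning and acting means a disallowed action is rejected before it has any side effect.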
Embeddings and Vector Search
Embeddings
Embeddings are dense vector representations of text. NFYio uses embedding models from OpenAI or Voyage AI to convert:
- Document chunks → vectors stored in pgvector
- User queries → vectors used for similarity search
Vector Search
Vector search finds the chunks most similar to a query vector, ranking them by cosine similarity or L2 distance in pgvector. This enables semantic search: “find documents about X” without exact keyword matches.
-- Conceptual: find similar chunks (actual API differs)
-- <=> is pgvector's cosine distance operator; use <-> for L2 distance
SELECT id, content, embedding <=> query_embedding AS distance
FROM document_chunks
ORDER BY distance
LIMIT 10;
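The choice of metric can change the ranking. The sketch below reimplements the two distances in plain Python on hand-picked 2-D vectors to show the difference; the vectors and the `top_k` helper are illustrative, and a real deployment would let pgvector do this work.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity pgvector's <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def l2_distance(a, b):
    """Euclidean distance, the quantity pgvector's <-> operator returns."""
    return math.dist(a, b)


def top_k(query, rows, distance, k=2):
    """Return the ids of the k rows closest to `query` under `distance`."""
    ranked = sorted(rows.items(), key=lambda item: distance(query, item[1]))
    return [row_id for row_id, _ in ranked[:k]]


rows = {
    "a": [1.0, 0.0],    # same direction as the query, same length
    "b": [10.0, 0.0],   # same direction, much longer
    "c": [0.8, 0.6],    # slightly different direction, same length
}
query = [1.0, 0.0]

print("cosine:", top_k(query, rows, cosine_distance))
print("l2:    ", top_k(query, rows, l2_distance))
```

Cosine distance ignores vector length, so it places `a` and `b` ahead of `c`; L2 distance penalizes the longer vector, so it places `a` and `c` ahead of `b`. When embeddings are normalized to unit length, the two metrics produce the same ordering.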
VPCs, Subnets, Security Groups
NFYio supports virtual networking for multi-tenant isolation:
VPC (Virtual Private Cloud)
A VPC is an isolated network segment. Resources in a VPC can communicate privately; traffic to/from the internet is controlled.
Subnets
Subnets are subdivisions of a VPC. They define IP ranges and can be public or private.
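How a VPC's address range divides into subnets can be worked out with Python's standard `ipaddress` module. The CIDR blocks and subnet names below are assumptions chosen for illustration, not NFYio defaults.

```python
import ipaddress

# Illustrative address plan (assumption): a /16 VPC split into /24 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")

# subnets() yields the /24 blocks in order; name the first two.
blocks = vpc.subnets(new_prefix=24)
subnets = {
    "public-a": next(blocks),
    "private-a": next(blocks),
}

for name, net in subnets.items():
    print(f"{name}: {net} ({net.num_addresses} addresses)")


def subnet_for(address: str):
    """Return the name of the subnet containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    return next((name for name, net in subnets.items() if ip in net), None)
```

Whether a subnet is public or private is a routing decision, not a property of its address range, so it does not appear in the arithmetic above.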
Security Groups
Security groups act as firewalls. They define which inbound and outbound traffic is allowed for resources (e.g., agent runtimes, storage access).
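A security group can be thought of as a list of allow rules evaluated against each connection. The sketch below assumes default-deny semantics (traffic passes only if a rule matches), which is typical for security groups but is not stated on this page. The `Rule` fields and the example group are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str   # "inbound" or "outbound"
    protocol: str    # e.g. "tcp"
    port: int
    cidr: str        # peer address range the rule applies to


def is_allowed(rules, direction, protocol, port, peer_ip) -> bool:
    """Default deny (assumption): traffic passes only if some rule matches."""
    peer = ipaddress.ip_address(peer_ip)
    return any(
        r.direction == direction
        and r.protocol == protocol
        and r.port == port
        and peer in ipaddress.ip_network(r.cidr)
        for r in rules
    )


# Hypothetical group for an agent runtime.
agent_runtime = [
    Rule("inbound", "tcp", 443, "10.0.0.0/16"),    # HTTPS from inside the VPC
    Rule("outbound", "tcp", 7007, "10.0.1.0/24"),  # storage endpoint subnet
]

print(is_allowed(agent_runtime, "inbound", "tcp", 443, "10.0.3.7"))
```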
Workspaces, Projects, Teams
NFYio organizes resources in a hierarchy:
Workspace
A workspace is the top-level tenant. It typically represents an organization or customer. All projects and teams belong to a workspace.
Project
A project groups related resources (buckets, agents, datasets) within a workspace. Use projects to separate environments (e.g., dev, staging, prod) or product lines.
Team
A team is a group of users with shared access to projects. Teams have roles that determine what members can do (read, write, admin).
Workspace (Acme Corp)
├── Project (Document AI)
│   ├── Bucket: documents
│   ├── RAG Agent: doc-qa
│   └── Team: doc-team (read/write)
└── Project (Analytics)
    ├── Bucket: raw-data
    └── Team: analytics-team (read)
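The same hierarchy can be expressed as nested records, which makes questions such as "which teams can reach this bucket?" easy to answer. The class and method names below are hypothetical and only mirror the example tree; they are not NFYio's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    name: str
    access: str  # e.g. "read" or "read/write"


@dataclass
class Project:
    name: str
    buckets: list = field(default_factory=list)
    agents: list = field(default_factory=list)
    teams: list = field(default_factory=list)


@dataclass
class Workspace:
    name: str
    projects: list = field(default_factory=list)

    def teams_for_bucket(self, bucket: str) -> list:
        """Teams attached to the project that owns `bucket`."""
        for project in self.projects:
            if bucket in project.buckets:
                return project.teams
        return []


acme = Workspace("Acme Corp", [
    Project("Document AI", ["documents"], ["doc-qa"], [Team("doc-team", "read/write")]),
    Project("Analytics", ["raw-data"], [], [Team("analytics-team", "read")]),
])

print([t.name for t in acme.teams_for_bucket("raw-data")])
```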
Roles and Permissions
Roles
Roles define what a user or team can do:
| Role | Typical Permissions |
|---|---|
| viewer | Read buckets, objects, agents; run queries |
| editor | Viewer + upload, delete objects; create agents |
| admin | Editor + manage buckets, teams, settings |
| owner | Full control over workspace/project |
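Each role in the table builds on the one before it, so a permission check reduces to a union over the role and everything below it. The permission names in this sketch are illustrative labels derived from the table, not identifiers NFYio defines.

```python
# Each role inherits everything granted to the roles before it.
ROLE_ORDER = ["viewer", "editor", "admin", "owner"]

# Illustrative permission names (assumption), one set per role.
GRANTS = {
    "viewer": {"read", "query"},
    "editor": {"upload", "delete_object", "create_agent"},
    "admin": {"manage_buckets", "manage_teams", "manage_settings"},
    "owner": {"full_control"},
}


def permissions(role: str) -> set:
    """Union of the grants for `role` and every lower role."""
    rank = ROLE_ORDER.index(role)  # raises ValueError for unknown roles
    allowed = set()
    for name in ROLE_ORDER[: rank + 1]:
        allowed |= GRANTS[name]
    return allowed


def can(role: str, action: str) -> bool:
    """True if `role` may perform `action`; owners may do anything."""
    allowed = permissions(role)
    return "full_control" in allowed or action in allowed


print(can("editor", "read"), can("viewer", "upload"))
```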
Permissions
Permissions are enforced at multiple layers:
- Keycloak — Authentication and user attributes
- NFYio Gateway — JWT validation, workspace/project membership
- Storage Proxy — Bucket and object access checks
- Agent Service — Workspace-scoped agent access
- PostgreSQL RLS — Row-level security for metadata
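Layered enforcement means a request has to pass every layer, and any one of them can reject it. The sketch below models three of the layers as independent checks run in order. The request fields and check functions are hypothetical, and real layers would verify tokens and query policy stores rather than read values from a dict.

```python
from typing import Callable, Optional

Request = dict
Check = Callable[[Request], Optional[str]]  # returns a denial reason, or None


def authenticated(req: Request) -> Optional[str]:
    # Identity layer: is there a verified user?
    return None if req.get("user") else "unauthenticated"


def workspace_member(req: Request) -> Optional[str]:
    # Gateway layer: does the user belong to the target workspace?
    ok = req.get("workspace") in req.get("memberships", ())
    return None if ok else "not a workspace member"


def bucket_access(req: Request) -> Optional[str]:
    # Storage layer: may this user read the target bucket?
    ok = req.get("bucket") in req.get("readable_buckets", ())
    return None if ok else "bucket access denied"


LAYERS: list[Check] = [authenticated, workspace_member, bucket_access]


def authorize(req: Request):
    """Run every layer in order; the first denial wins."""
    for check in LAYERS:
        reason = check(req)
        if reason:
            return False, reason
    return True, None


request = {
    "user": "alice",
    "workspace": "acme",
    "memberships": ["acme"],
    "bucket": "documents",
    "readable_buckets": ["documents"],
}
print(authorize(request))
```

Because each layer decides independently, a bug or misconfiguration in one layer does not by itself grant access.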
Best Practices
- Use the principle of least privilege: assign the minimum role needed
- Prefer team-based access over individual grants
- Rotate access keys periodically
- Enable audit logging for sensitive operations