Deploy with Kubernetes

This guide covers deploying NFYio on Kubernetes using the official Helm chart. You’ll configure values, scale workloads, manage rolling updates, set up Prometheus monitoring, and handle persistent volumes and secrets.

Prerequisites

Kubernetes cluster 1.24+ (EKS, GKE, AKS, or self-managed)
kubectl configured for your cluster
Helm 3.8+
Basic understanding of Kubernetes concepts (Pods, Services, PVCs, Secrets)

Verify your setup:

kubectl cluster-info
helm version
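
The minimum versions listed above can also be checked in a script. The following is a minimal sketch, assuming a `sort` that supports `-V` (GNU coreutils and recent BSD). The function names `version_ge` and `check_min` are illustrative and not part of any NFYio tooling.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort); the smaller version sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# check_min NAME ACTUAL MINIMUM: print a one-line verdict, fail if too old.
check_min() {
  if version_ge "$2" "$3"; then
    echo "$1 $2 OK (>= $3)"
  else
    echo "$1 $2 is older than required $3" >&2
    return 1
  fi
}

# Example (requires helm on PATH):
#   check_min helm "$(helm version --template '{{.Version}}' | tr -d v)" 3.8.0
```

Cluster versions from managed providers often carry a suffix after the patch number, so strip it before comparing if you extend this to the Kubernetes server version.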

Helm Chart Installation

Add the NFYio Helm Repository

helm repo add nfyio https://charts.nfyio.io
helm repo update

Install with Default Values

helm install nfyio nfyio/nfyio --namespace nfyio --create-namespace

Install with Custom Values

helm install nfyio nfyio/nfyio -f values.yaml --namespace nfyio --create-namespace

values.yaml Configuration

Create a values.yaml file to override the chart defaults. The key sections are:

# values.yaml - NFYio Helm Chart Configuration

# ── Global ─────────────────────────────────────────
global:
  imageRegistry: ""  # Use default; set for private registry
  imagePullSecrets: []

# ── Gateway (API, Auth, Dashboard) ───────────────────
gateway:
  replicaCount: 2
  image:
    repository: nfyio/gateway
    tag: "0.9.0"
    pullPolicy: IfNotPresent
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 1000m
      memory: 1Gi
  env:
    - name: SESSION_SECRET
      valueFrom:
        secretKeyRef:
          name: nfyio-secrets
          key: session-secret
    - name: PUBLIC_URL
      value: "https://nfyio.yourdomain.com"
    - name: ALLOWED_ORIGINS
      value: "https://app.yourdomain.com"

# ── Storage Proxy (S3-compatible) ───────────────────
storage:
  replicaCount: 2
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi

# ── Agent Service (RAG, LLM) ───────────────────────
agent:
  replicaCount: 1
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi

# ── PostgreSQL ──────────────────────────────────────
postgresql:
  enabled: true
  auth:
    username: nfyio
    password: ""  # Set via --set or secret
    database: nfyio
  primary:
    persistence:
      enabled: true
      size: 20Gi
      storageClass: ""  # Use default StorageClass

# ── Redis ───────────────────────────────────────────
redis:
  enabled: true
  auth:
    enabled: true
    existingSecret: nfyio-secrets
    existingSecretPasswordKey: redis-password
  master:
    persistence:
      enabled: true
      size: 8Gi

# ── SeaweedFS ───────────────────────────────────────
seaweedfs:
  master:
    persistence:
      enabled: true
      size: 10Gi
  volume:
    replicaCount: 1
    persistence:
      enabled: true
      size: 100Gi  # Adjust for object storage capacity

# ── Ingress ──────────────────────────────────────────
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
  hosts:
    - host: nfyio.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: nfyio-tls
      hosts:
        - nfyio.yourdomain.com

# ── Service Monitor (Prometheus) ───────────────────
serviceMonitor:
  enabled: true
  interval: 30s
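
The values.yaml above contains example hostnames (yourdomain.com) that are easy to leave in by accident. As a rough sketch, a pre-install check could look like this. `check_values` is a hypothetical helper, and the patterns it searches for are simply the placeholders used in this guide.

```shell
# check_values FILE: flag placeholder values that should not reach a cluster.
# Returns 1 when example hostnames remain, 2 when the file is missing.
check_values() {
  local file="${1:-values.yaml}"
  local found=0
  if [ ! -f "$file" ]; then
    echo "values file not found: $file" >&2
    return 2
  fi
  # Example hostnames copied from the guide; grep -n prints the offending lines
  if grep -n 'yourdomain\.com' "$file"; then
    echo "replace the example hostnames listed above" >&2
    found=1
  fi
  # An empty inline password is fine only when an existing secret is supplied
  if grep -nE '^[[:space:]]*password:[[:space:]]*""' "$file" >/dev/null; then
    echo "note: password is empty; supply it via --set or an existing secret" >&2
  fi
  return "$found"
}

# Example:
#   check_values values.yaml && echo "values look ready"
```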

Secrets Management

Create Secrets Manually

kubectl create namespace nfyio

kubectl create secret generic nfyio-secrets \
  --namespace nfyio \
  --from-literal=session-secret="$(openssl rand -hex 64)" \
  --from-literal=postgres-password="$(openssl rand -base64 24)" \
  --from-literal=redis-password="$(openssl rand -base64 24)" \
  --from-literal=keycloak-admin-password="$(openssl rand -base64 24)" \
  --from-literal=openai-api-key=sk-your-key-here
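
Because the chart looks keys up by name, a misspelled key only shows up later as a failing pod. The sketch below compares the keys present in a secret against a list you expect. `missing_keys` is an illustrative helper, and the expected key names are the ones referenced elsewhere in this guide.

```shell
# missing_keys KEY...: read the keys that exist (one per line) on stdin and
# report every expected KEY that is absent. Returns 1 if any are missing.
missing_keys() {
  local present key missing=0
  present="$(cat)"
  for key in "$@"; do
    # -x: whole-line match, -F: treat the key as a fixed string
    if ! printf '%s\n' "$present" | grep -qxF -- "$key"; then
      echo "missing key: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Example (requires kubectl access to the cluster):
#   kubectl get secret nfyio-secrets -n nfyio \
#     -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}' \
#     | missing_keys session-secret postgres-password redis-password
```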

Use External Secrets Operator (Optional)

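If you use the External Secrets Operator with AWS Secrets Manager or Vault, define an ExternalSecret that syncs the keys into nfyio-secrets. Add one data entry for each key the chart reads (redis-password, for example, follows the same pattern):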

# external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: nfyio-secrets
  namespace: nfyio
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: nfyio-secrets
  data:
    - secretKey: session-secret
      remoteRef:
        key: nfyio/production
        property: session_secret
    - secretKey: postgres-password
      remoteRef:
        key: nfyio/production
        property: postgres_password
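
After applying the manifest, it helps to confirm that the operator actually synced the secret before installing the chart. This is a sketch that assumes the External Secrets Operator is installed and reports a Ready condition on the ExternalSecret. `wait_for_external_secret` is a hypothetical wrapper around standard kubectl commands.

```shell
# wait_for_external_secret NAME [NAMESPACE] [TIMEOUT]
# Blocks until the ExternalSecret reports Ready, then confirms that the
# target Secret of the same name exists.
wait_for_external_secret() {
  local name="$1" ns="${2:-nfyio}" timeout="${3:-60s}"
  kubectl wait --for=condition=Ready "externalsecret/$name" \
    -n "$ns" --timeout="$timeout" \
    && kubectl get secret "$name" -n "$ns" -o name
}

# Example:
#   kubectl apply -f external-secret.yaml
#   wait_for_external_secret nfyio-secrets
```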

Install with Secrets

helm install nfyio nfyio/nfyio -f values.yaml \
  --set postgresql.auth.existingSecret=nfyio-secrets \
  --set postgresql.auth.secretKeys.userPasswordKey=postgres-password \
  --namespace nfyio \
  --create-namespace

Scaling

Horizontal Pod Autoscaler (HPA)

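Add an HPA for the gateway and the storage proxy. Utilization targets are computed against each container's resource requests and require a metrics source such as metrics-server in the cluster: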

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nfyio-gateway
  namespace: nfyio
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nfyio-gateway
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nfyio-storage
  namespace: nfyio
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nfyio-storage
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Apply:

kubectl apply -f hpa.yaml
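
To get a feel for how these targets translate into replica counts, the documented HPA rule is desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The sketch below does that arithmetic for a single metric. It deliberately ignores the controller's tolerance band and stabilization window, and `hpa_desired` is an illustrative name.

```shell
# hpa_desired CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Prints the replica count the HPA formula yields for one metric.
hpa_desired() {
  local current="$1" util="$2" target="$3" min="$4" max="$5"
  local desired
  # Integer ceiling of (current * util / target)
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

# Gateway HPA from hpa.yaml: target 70% CPU, 2 to 10 replicas
hpa_desired 2 90 70 2 10    # prints 3
hpa_desired 8 100 70 2 10   # prints 10 (capped at maxReplicas)
```

When several metrics are configured, as for the gateway, the HPA evaluates each one and uses the largest result.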

Manual Scaling

# Note: an active HPA overrides these replica counts on its next sync
kubectl scale deployment nfyio-gateway -n nfyio --replicas=4
kubectl scale deployment nfyio-storage -n nfyio --replicas=3

Rolling Updates

Update Image Version

# Update to new version
helm upgrade nfyio nfyio/nfyio -f values.yaml \
  --set gateway.image.tag=0.10.0 \
  --set storage.image.tag=0.10.0 \
  --set agent.image.tag=0.10.0 \
  --namespace nfyio

Configure Rolling Update Strategy

In values.yaml:

gateway:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
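
The effect of these two settings is easiest to see as numbers. With absolute values (not percentages), a rollout may run at most replicas + maxSurge pods and must keep at least replicas - maxUnavailable pods available. The helper below is only an illustration of that arithmetic.

```shell
# rollout_bounds REPLICAS MAX_SURGE MAX_UNAVAILABLE (absolute values only)
rollout_bounds() {
  local replicas="$1" surge="$2" unavailable="$3"
  echo "max_pods=$((replicas + surge)) min_available=$((replicas - unavailable))"
}

# Gateway settings from this guide: 2 replicas, maxSurge 1, maxUnavailable 0
rollout_bounds 2 1 0   # prints: max_pods=3 min_available=2
```

With maxUnavailable set to 0, capacity never drops below the configured replica count, at the cost of needing room in the cluster for one extra pod during the update.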

Monitor Rollout

kubectl rollout status deployment/nfyio-gateway -n nfyio
kubectl rollout history deployment/nfyio-gateway -n nfyio

# Rollback if needed
kubectl rollout undo deployment/nfyio-gateway -n nfyio
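
The status and undo commands above can be combined so that a stalled rollout is reverted automatically. This is a minimal sketch, and `watch_rollout` is a hypothetical wrapper. Note that `kubectl rollout undo` does not update Helm's release history, so `helm rollback nfyio` may be the better choice when the change came from `helm upgrade`.

```shell
# watch_rollout DEPLOYMENT [NAMESPACE] [TIMEOUT]
# Waits for the rollout to finish; on failure or timeout, reverts to the
# previous ReplicaSet and returns non-zero.
watch_rollout() {
  local deploy="$1" ns="${2:-nfyio}" timeout="${3:-5m}"
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout of $deploy succeeded"
  else
    echo "rollout of $deploy failed; rolling back" >&2
    kubectl rollout undo "deployment/$deploy" -n "$ns"
    return 1
  fi
}

# Example:
#   watch_rollout nfyio-gateway nfyio 5m
```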

Persistent Volumes

Storage Classes

Ensure your cluster has a default StorageClass or specify one:

postgresql:
  primary:
    persistence:
      storageClass: gp3  # AWS EBS gp3
      size: 50Gi

redis:
  master:
    persistence:
      storageClass: gp3
      size: 20Gi

seaweedfs:
  volume:
    persistence:
      storageClass: gp3
      size: 500Gi  # For object storage
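
A StorageClass name that does not exist in the cluster leaves the PVCs in Pending state, so it is worth checking before you install. The following sketch assumes the gp3 class name used above. `require_storage_class` is an illustrative helper.

```shell
# require_storage_class NAME: fail early if the StorageClass is absent.
require_storage_class() {
  local sc="$1"
  if kubectl get storageclass "$sc" -o name >/dev/null 2>&1; then
    echo "StorageClass $sc found"
  else
    echo "StorageClass $sc not found; PVCs that reference it will stay Pending" >&2
    return 1
  fi
}

# Example:
#   require_storage_class gp3 && helm install nfyio nfyio/nfyio -f values.yaml -n nfyio
```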

Backup PVCs with Velero

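# Install Velero (example for AWS)
# --plugins and a credentials option are required; use the plugin version
# that matches your Velero release and your own credentials file path
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket nfyio-backups \
  --secret-file ./credentials-velero \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1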

# Create backup
velero backup create nfyio-backup --include-namespaces nfyio
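
A fixed backup name can only be used once, so repeated backups need unique names. The sketch below adds a UTC timestamp and waits for the backup to finish before printing its details. `backup_nfyio` is a hypothetical helper built on the `velero backup create` and `velero backup describe` commands.

```shell
# backup_nfyio: create a timestamped backup of the nfyio namespace,
# wait for it to complete, then show its status.
backup_nfyio() {
  local name
  name="nfyio-backup-$(date -u +%Y%m%d-%H%M%S)"
  velero backup create "$name" --include-namespaces nfyio --wait \
    && velero backup describe "$name"
}

# Example:
#   backup_nfyio
```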

Monitoring with Prometheus

Enable ServiceMonitor

Set serviceMonitor.enabled: true in values.yaml and make sure the Prometheus Operator is installed, since it provides the ServiceMonitor and PrometheusRule resource types.

PrometheusRule for Alerts

# prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: nfyio-alerts
  namespace: nfyio
  labels:
    prometheus: kube-prometheus  # Must match your Prometheus ruleSelector
spec:
  groups:
    - name: nfyio
      rules:
        - alert: NfyioGatewayDown
          expr: up{job="nfyio-gateway"} == 0
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "NFYio Gateway is down"
        - alert: NfyioStorageHighLatency
          expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{job="nfyio-storage"}[5m])) > 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Storage proxy P95 latency > 1s"
        - alert: NfyioAgentHighMemory
          expr: container_memory_usage_bytes{container="nfyio-agent"} / container_spec_memory_limit_bytes{container="nfyio-agent"} > 0.9
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Agent service memory usage > 90%"

Apply:

kubectl apply -f prometheus-rules.yaml

Grafana Dashboard

Import the NFYio dashboard (JSON) from the chart or create panels for:

Request rate (gateway, storage, agent)
Error rate (4xx, 5xx)
Latency percentiles (P50, P95, P99)
Resource usage (CPU, memory per pod)

Full Installation Example

# 1. Create namespace and secrets
kubectl create namespace nfyio
kubectl create secret generic nfyio-secrets -n nfyio \
  --from-literal=session-secret="$(openssl rand -hex 64)" \
  --from-literal=postgres-password="$(openssl rand -base64 24)" \
  --from-literal=redis-password="$(openssl rand -base64 24)" \
  --from-literal=keycloak-admin-password="$(openssl rand -base64 24)" \
  --from-literal=openai-api-key=sk-your-key

# 2. Install Helm chart (PostgreSQL reads its password from nfyio-secrets)
helm install nfyio nfyio/nfyio -f values.yaml \
  --set postgresql.auth.existingSecret=nfyio-secrets \
  --set postgresql.auth.secretKeys.userPasswordKey=postgres-password \
  --namespace nfyio \
  --wait \
  --timeout 10m

# 3. Verify
kubectl get pods -n nfyio
kubectl get svc -n nfyio

# 4. Run migrations
kubectl exec -it deployment/nfyio-gateway -n nfyio -- deno task migrate
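
The steps above fail on a second run because the namespace and release already exist. A re-runnable variant might look like the sketch below, which uses `helm upgrade --install` so that the same command installs on the first run and upgrades afterwards. `install_nfyio` is an illustrative wrapper, and it assumes the nfyio-secrets secret has already been created.

```shell
# install_nfyio [VALUES_FILE]: idempotent install or upgrade of the release.
install_nfyio() {
  local values="${1:-values.yaml}"
  # Create the namespace only if it is missing, so reruns do not fail
  kubectl get namespace nfyio >/dev/null 2>&1 \
    || kubectl create namespace nfyio \
    || return 1
  # upgrade --install: install if the release is absent, otherwise upgrade
  helm upgrade --install nfyio nfyio/nfyio -f "$values" \
    --set postgresql.auth.existingSecret=nfyio-secrets \
    --set postgresql.auth.secretKeys.userPasswordKey=postgres-password \
    --namespace nfyio --wait --timeout 10m
}

# Example:
#   install_nfyio values.yaml
```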

Troubleshooting

Issue	Solution
ImagePullBackOff	Check image name, registry, and imagePullSecrets
CrashLoopBackOff	Check logs: `kubectl logs -f deployment/nfyio-gateway -n nfyio`
PVC pending	Verify that the StorageClass exists and has a working provisioner
Secret not found	Ensure secret exists and keys match chart expectations
Ingress 502	Check backend service and pod readiness
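
Most of the checks in the table start from the same few commands. As a sketch, they can be collected into one function that prints pod state, recent events, logs from a crashed container, and PVC status. `diagnose` is a hypothetical helper, and its default names are the ones used in this guide.

```shell
# diagnose [NAMESPACE] [DEPLOYMENT]: print a quick snapshot for triage.
diagnose() {
  local ns="${1:-nfyio}" deploy="${2:-nfyio-gateway}"
  echo "== pods =="
  kubectl get pods -n "$ns" -o wide
  echo "== recent events =="
  kubectl get events -n "$ns" --sort-by=.lastTimestamp | tail -n 20
  echo "== logs from the previous container (CrashLoopBackOff) =="
  kubectl logs "deployment/$deploy" -n "$ns" --previous --tail=50 \
    || echo "no previous container logs available"
  echo "== persistent volume claims =="
  kubectl get pvc -n "$ns"
}

# Example:
#   diagnose nfyio nfyio-gateway
```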

What’s Next

Monitoring & Observability — Grafana dashboards and alerting
Set Up VPC Networking — Network isolation and security groups
Deploy with Docker — Simpler single-node deployment