
Performance Tuning nfyio: PostgreSQL, Redis, and SeaweedFS

Optimize every layer of the nfyio stack — PostgreSQL query performance, Redis memory management, SeaweedFS throughput, and embedding pipeline speed.


nfyio Team

Talya Smart & Technoplatz JV

Performance Tuning nfyio

A fresh nfyio deployment works well out of the box. At scale — millions of objects, thousands of embeddings per hour, concurrent RAG queries — you need to tune. This guide covers the high-impact optimizations for each layer of the stack.

PostgreSQL Tuning

PostgreSQL handles metadata, RLS policies, and pgvector embeddings. It’s the most critical component to tune.

Connection Pooling

PostgreSQL's default max_connections = 100 is too low for production, but raising it directly adds per-connection memory overhead. Put PgBouncer in front instead:

# pgbouncer.ini
[databases]
nfyio = host=localhost port=5432 dbname=nfyio

[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
reserve_pool_size = 10

Connect your nfyio gateway to PgBouncer:

DATABASE_URL=postgresql://nfyio:password@localhost:6432/nfyio

Memory Configuration

For an 8 GB RAM server dedicated to PostgreSQL:

# postgresql.conf
shared_buffers = 2GB
effective_cache_size = 6GB
work_mem = 64MB
maintenance_work_mem = 512MB
wal_buffers = 64MB

For a 32 GB RAM server:

shared_buffers = 8GB
effective_cache_size = 24GB
work_mem = 256MB
maintenance_work_mem = 2GB
wal_buffers = 128MB
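
Both examples follow the same ratios: shared_buffers at roughly 25% of RAM, effective_cache_size at roughly 75%. A minimal sketch of that arithmetic, using the rules of thumb implied by the two configs above (not nfyio-specific values):

```python
def pg_memory_settings(ram_gb: int) -> dict:
    """Rule-of-thumb postgresql.conf memory settings for a dedicated server.

    Ratios mirror the examples above: shared_buffers ~25% of RAM,
    effective_cache_size ~75%, work_mem and maintenance_work_mem
    scaled proportionally.
    """
    return {
        "shared_buffers": f"{ram_gb // 4}GB",
        "effective_cache_size": f"{ram_gb * 3 // 4}GB",
        "work_mem": f"{ram_gb * 8}MB",
        "maintenance_work_mem": f"{ram_gb * 64}MB",
    }

print(pg_memory_settings(8))   # matches the 8 GB example: 2GB / 6GB / 64MB / 512MB
print(pg_memory_settings(32))  # matches the 32 GB example: 8GB / 24GB / 256MB / 2048MB
```

Treat the output as a starting point; work_mem in particular should be sized against your actual concurrent query count, since each sort or hash node can allocate that much.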

pgvector Index Tuning

Embedding search is the most expensive query. Tune the HNSW index:

-- Check current index
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'embeddings';

-- Drop and recreate with optimized parameters
DROP INDEX IF EXISTS embeddings_vector_idx;

CREATE INDEX embeddings_vector_idx
ON embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);

-- Set search-time parameters
SET hnsw.ef_search = 100;

| Parameter | Default | Tuned | Effect |
|---|---|---|---|
| m | 16 | 24 | More connections per node, better recall |
| ef_construction | 64 | 200 | Better index quality, slower build |
| ef_search | 40 | 100 | Better search recall, slightly slower queries |

Benchmark before and after:

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, 1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
FROM embeddings
WHERE bucket_id = 'bucket_abc123'
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

Vacuum and Analyze

-- Check bloat
SELECT schemaname, tablename,
  pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size,
  n_dead_tup,
  last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Aggressive vacuum on embeddings table
VACUUM (VERBOSE, ANALYZE) embeddings;

Autovacuum tuning for write-heavy tables:

ALTER TABLE embeddings SET (
  autovacuum_vacuum_scale_factor = 0.02,
  autovacuum_analyze_scale_factor = 0.01,
  autovacuum_vacuum_cost_delay = 10
);

Redis Tuning

Redis handles job queues (embedding pipeline, agent tasks) and caching.

Memory Policy

# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru

Pipeline Optimization

For batch embedding jobs, use Redis pipelines to reduce network round trips. First, check whether the queues are keeping up:

# Check queue lengths
redis-cli LLEN nfyio:embedding:queue
redis-cli LLEN nfyio:agent:queue

# Monitor commands per second
redis-cli INFO stats | grep instantaneous_ops_per_sec
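
The payoff of pipelining is round-trip elimination: with redis-py you would batch commands into `pipe = r.pipeline(); pipe.rpush("nfyio:embedding:queue", job); ...; pipe.execute()` instead of one `RPUSH` per job. A minimal sketch of the arithmetic (the queue name comes from the commands above; the function is illustrative, not part of nfyio):

```python
def round_trips(n_jobs: int, pipeline_batch: int = 1) -> int:
    """Network round trips needed to enqueue n_jobs,
    sending pipeline_batch commands per trip (ceiling division)."""
    return -(-n_jobs // pipeline_batch)

print(round_trips(10_000))        # unpipelined: 10000 round trips
print(round_trips(10_000, 100))   # pipelined in batches of 100: 100 round trips
```

On a network with 1 ms round-trip latency, that is the difference between ~10 s and ~0.1 s of pure network wait for the same enqueue work.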

Persistence Tuning

For job queues where data loss is tolerable:

# Disable AOF, use infrequent RDB snapshots
appendonly no
save 900 1
save 300 10

For critical queues:

appendonly yes
appendfsync everysec
no-appendfsync-on-rewrite yes

Connection Limits

maxclients 10000
tcp-backlog 511
timeout 300
tcp-keepalive 60

SeaweedFS Tuning

SeaweedFS handles the actual object storage. Tuning focuses on throughput and replication.

Volume Size

Volumes default to 30 GB. For large deployments, raise the limit:

weed master -volumeSizeLimitMB=100000

Concurrent Uploads

Increase filer concurrency:

weed filer -maxMB=256 -concurrentUploadLimitMB=512

Compaction

SeaweedFS leaves gaps when objects are deleted. Schedule compaction:

# Check volume status
curl -s http://localhost:9333/vol/status | jq '.Volumes[] | select(.DeleteCount > 100)'

# Compact a volume
curl "http://localhost:9333/vol/vacuum?garbageThreshold=0.3"

Read Cache

Enable filer read cache for frequently accessed objects:

weed filer -cacheDir=/tmp/seaweedfs-cache -cacheSizeMB=4096

Embedding Pipeline Tuning

Batch Size

Process embeddings in batches instead of one-by-one:

# Check current setting
curl -s http://localhost:7010/config | jq '.embedding'

# Update batch size
curl -X PATCH http://localhost:7010/config \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{"embedding": {"batch_size": 100, "max_concurrent": 10}}'
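
The effect of `batch_size` is simply that pending chunks are grouped before each embedding API call. A minimal sketch of that grouping, assuming the batch_size value from the config above (the helper and sample data are illustrative):

```python
def batched(items: list, batch_size: int = 100):
    """Yield successive batches, mirroring the batch_size setting above."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# 250 pending chunks -> 3 API calls (100 + 100 + 50) instead of 250
chunks = [f"chunk-{i}" for i in range(250)]
batches = list(batched(chunks))
print(len(batches))       # 3
print(len(batches[-1]))   # 50
```

Combined with `max_concurrent`, this bounds in-flight work: at most max_concurrent batches of batch_size chunks are awaiting the embedding provider at once.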

Model Selection

| Model | Dimensions | Speed | Cost | Quality |
|---|---|---|---|---|
| text-embedding-3-small | 1536 | Fast | $0.02/1M tokens | Good |
| text-embedding-3-large | 3072 | Medium | $0.13/1M tokens | Best |
| text-embedding-ada-002 | 1536 | Fast | $0.10/1M tokens | Legacy |

For most use cases, text-embedding-3-small gives the best speed/quality tradeoff.
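
The cost gap is easy to quantify. A quick sketch using the per-token prices from the table above (corpus size is a hypothetical example):

```python
PRICE_PER_M_TOKENS = {  # USD per 1M tokens, from the table above
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

def embedding_cost(model: str, tokens: int) -> float:
    """Estimated embedding cost in USD for a given token count."""
    return PRICE_PER_M_TOKENS[model] * tokens / 1_000_000

# Embedding a 50M-token corpus:
print(embedding_cost("text-embedding-3-small", 50_000_000))  # ~ $1.00
print(embedding_cost("text-embedding-3-large", 50_000_000))  # ~ $6.50
```

A 6.5x price difference (plus doubled storage for 3072-dim vectors) is why small is the sensible default unless retrieval quality measurably improves with large on your data.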

Chunk Size

Optimal chunk size depends on content type:

| Content Type | Chunk Size | Overlap |
|---|---|---|
| Technical docs | 512 tokens | 50 tokens |
| Blog posts | 800 tokens | 100 tokens |
| Legal documents | 1024 tokens | 200 tokens |
| Code files | 256 tokens | 25 tokens |
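
Overlap means each chunk repeats the tail of the previous one, so a sentence that straddles a boundary still lands whole in at least one chunk. A minimal sketch using the technical-docs numbers above (integers stand in for tokens; real chunking would operate on tokenizer output):

```python
def chunk_tokens(tokens: list, size: int = 512, overlap: int = 50) -> list:
    """Split a token sequence into overlapping chunks.

    Each chunk starts (size - overlap) tokens after the previous one,
    so adjacent chunks share `overlap` tokens of context.
    """
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

doc = list(range(1200))          # stand-in for a 1200-token document
chunks = chunk_tokens(doc)
print(len(chunks))               # 3 chunks
print(len(chunks[0]))            # 512 tokens
print(chunks[1][0])              # second chunk starts at token 462 (512 - 50)
```

Larger overlap improves recall at chunk boundaries but inflates storage and embedding cost, which is why dense, cross-referential content (legal documents) gets more overlap than code.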

Benchmarking

Run a load test against your tuned nfyio instance:

# Upload throughput
for i in $(seq 1 100); do
  curl -s -X PUT http://localhost:7007/bucket/test-file-$i.txt \
    -H "Authorization: Bearer $JWT" \
    -d "test content $i" &
done
wait
echo "100 uploads completed"

# Search latency
time curl -s -X POST http://localhost:3000/api/v1/search \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how to deploy kubernetes",
    "bucket": "production-data",
    "limit": 10
  }' | jq '.results | length'
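
A single `time curl` is noisy; repeat the search N times and summarize as percentiles instead. A minimal nearest-rank sketch (the latency samples are hypothetical; in practice you would collect them from the loop above):

```python
def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile of a list of latency samples (ms)."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# Hypothetical search latencies in ms from 10 benchmark runs
latencies = [12, 15, 14, 90, 13, 16, 14, 15, 13, 200]

print(percentile(latencies, 50))  # p50: 14 ms
print(percentile(latencies, 95))  # p95: 200 ms
```

Note how the median hides the slow tail entirely; p95/p99 are what your users feel, so compare those before and after each tuning change.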

Quick Reference

| Component | Key Setting | Default | Recommended |
|---|---|---|---|
| PostgreSQL | shared_buffers | 128 MB | 25% of RAM |
| PostgreSQL | work_mem | 4 MB | 64-256 MB |
| PostgreSQL | hnsw.ef_search | 40 | 100 |
| Redis | maxmemory | No limit | 2-4 GB |
| Redis | maxmemory-policy | noeviction | allkeys-lru |
| SeaweedFS | Volume size | 30 GB | 100 GB |
| SeaweedFS | Cache size | 0 | 4 GB |
| Embeddings | Batch size | 10 | 100 |

Key Takeaways

  • PgBouncer in front of PostgreSQL handles connection multiplexing far better than raising max_connections
  • HNSW index parameters (m, ef_construction, ef_search) have the biggest impact on semantic search performance
  • Redis memory policy should be allkeys-lru for cache workloads — never let it hit OOM
  • SeaweedFS volume compaction reclaims disk space after deletions
  • Batch embedding processing (100 items at a time) reduces API round trips and improves throughput
  • Always benchmark before and after tuning — measure, don’t guess

For monitoring your tuned setup, see the Prometheus & Grafana guide.
