Semantic Search Over Your Files with nfyio and OpenAI

Index your stored objects with OpenAI embeddings and query them by meaning using nfyio's built-in semantic search API powered by pgvector.

nfyio Team

Talya Smart & Technoplatz JV

Semantic Search with nfyio

nfyio ships with a built-in embedding pipeline. When you upload a document, nfyio can automatically:

  1. Extract the text content
  2. Chunk it into segments
  3. Send each chunk to OpenAI or Voyage AI for embedding
  4. Store the vectors in pgvector
  5. Let you query by semantic meaning via REST API
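The chunking step (2) can be sketched as a sliding window over the extracted text. nfyio's chunker is internal, so the character-based window, its size, and the overlap below are illustrative assumptions rather than the actual implementation:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows.

    chunk_size and overlap are illustrative; nfyio's internal chunker
    may well use token-based segmentation with different defaults.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by less than a full window so adjacent chunks
        # share context at their boundaries.
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded (step 3) and its vector stored in pgvector alongside metadata such as the page and chunk index.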

Prerequisites

1. Configure the Embedding Provider

In your .env file:

# Enable embeddings
EMBEDDINGS_ENABLED=true
EMBEDDING_PROVIDER=openai

# OpenAI config
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

Or use Voyage AI:

EMBEDDING_PROVIDER=voyage
VOYAGE_API_KEY=pa-...
VOYAGE_EMBEDDING_MODEL=voyage-large-2-instruct
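The provider switch amounts to reading one env key and picking the matching credentials. A client-side sketch of that selection logic in Python (the variable names mirror the .env keys above; the defaults are assumptions):

```python
import os

def embedding_config() -> dict:
    """Resolve embedding settings from the environment.

    Mirrors the .env keys shown above; default model names are
    illustrative, not guaranteed nfyio defaults.
    """
    provider = os.environ.get("EMBEDDING_PROVIDER", "openai")
    if provider == "openai":
        return {
            "provider": "openai",
            "api_key": os.environ.get("OPENAI_API_KEY", ""),
            "model": os.environ.get("OPENAI_EMBEDDING_MODEL",
                                    "text-embedding-3-small"),
        }
    if provider == "voyage":
        return {
            "provider": "voyage",
            "api_key": os.environ.get("VOYAGE_API_KEY", ""),
            "model": os.environ.get("VOYAGE_EMBEDDING_MODEL",
                                    "voyage-large-2-instruct"),
        }
    raise ValueError(f"unknown EMBEDDING_PROVIDER: {provider}")
```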

2. Upload a Document with Embedding

# Upload a file with embedding enabled
curl -X PUT "http://localhost:7007/your-bucket/report-q1-2026.pdf" \
  -H "x-nfyio-embed: true" \
  -H "Authorization: Bearer $YOUR_ACCESS_KEY" \
  --data-binary @report-q1-2026.pdf

The response includes an embedding job ID:

{
  "key": "report-q1-2026.pdf",
  "size": 82340,
  "embeddingJobId": "emb_7XkV9mNpQR",
  "status": "queued"
}

3. Check Embedding Status

curl http://localhost:3000/api/jobs/emb_7XkV9mNpQR \
  -H "Authorization: Bearer $YOUR_JWT"
{
  "id": "emb_7XkV9mNpQR",
  "status": "completed",
  "chunks": 24,
  "model": "text-embedding-3-small",
  "dimensions": 1536,
  "completedAt": "2026-02-22T10:15:32Z"
}
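Since embedding runs asynchronously, a client typically polls the job endpoint until it reaches a terminal state. A minimal polling sketch, with the status fetch injected as a callable so it stays testable without a running server (the `"completed"`/`"failed"` states follow the response above; `"failed"` is an assumption):

```python
import time

def wait_for_embedding(job_id: str, fetch_status,
                       interval: float = 2.0, timeout: float = 300.0) -> dict:
    """Poll until the embedding job finishes or the timeout expires.

    fetch_status is any callable returning the job JSON as a dict,
    e.g. a wrapper around GET /api/jobs/{job_id}.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status(job_id)
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"embedding job {job_id} did not finish in {timeout}s")
```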

4. Query by Meaning

curl -X POST http://localhost:3000/api/search \
  -H "Authorization: Bearer $YOUR_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What were the revenue highlights in Q1 2026?",
    "bucketId": "your-bucket-id",
    "limit": 5,
    "threshold": 0.78
  }'

Response:

{
  "results": [
    {
      "chunkId": "chunk_001",
      "objectKey": "report-q1-2026.pdf",
      "score": 0.934,
      "content": "Revenue grew 42% year-over-year, reaching $4.2M ARR...",
      "metadata": {
        "page": 3,
        "chunkIndex": 2
      }
    }
  ],
  "totalResults": 5,
  "queryTime": 12
}

5. Use the Agentic RAG Runtime

For multi-step retrieval with GPT-4o:

curl -X POST http://localhost:7010/api/agents/run \
  -H "Authorization: Bearer $YOUR_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "rag-qa",
    "input": "Summarize the key risks mentioned in the Q1 2026 report",
    "context": {
      "bucketId": "your-bucket-id"
    }
  }'

The agent will:

  1. Retrieve relevant chunks via semantic search
  2. Pass them as context to GPT-4o
  3. Generate a grounded answer
  4. Log the full execution trace
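The four steps above can be sketched as a single function with the retrieval and generation calls injected. Both callables are stubs standing in for `/api/search` and the GPT-4o completion; this is an illustration of the flow, not nfyio's actual agent runtime:

```python
def rag_answer(question: str, search, generate) -> dict:
    """Sketch of the agent's retrieve-then-generate loop.

    search: callable(question) -> list of chunk dicts (stands in for /api/search)
    generate: callable(question, context) -> str (stands in for GPT-4o)
    """
    chunks = search(question)                     # 1. retrieve relevant chunks
    context = "\n\n".join(c["content"] for c in chunks)
    answer = generate(question, context)          # 2-3. grounded generation
    return {                                      # 4. execution trace
        "question": question,
        "retrieved": [c["chunkId"] for c in chunks],
        "answer": answer,
    }
```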

Summary

With nfyio, semantic search over your stored files is a 3-step operation:

  1. Upload with x-nfyio-embed: true
  2. Wait for the embedding job to complete
  3. Query via /api/search

No vector database configuration. No LangChain setup. Just self-hosted infrastructure that does it all.

Written by

nfyio Team

Talya Smart & Technoplatz JV
