RAG Implementation with Supabase 🔍
Overview
This template provides an intelligent document search and query system using Retrieval Augmented Generation (RAG) with Supabase and OpenAI. Think of it as giving your documents a smart brain that can understand and answer questions about their content!
Main Features:
- Smart Document Processing: Automatically chunks and processes your documents for optimal retrieval
- AI-Powered Search: Combines keyword and semantic search for superior results
- Intelligent Responses: Generates contextual answers using GPT models
- Scalable Storage: Leverages Supabase for efficient document and vector storage
- Hybrid Search: Merges full-text and vector similarity rankings using Reciprocal Rank Fusion (RRF)
This template contains the following two workflows:
- Add Document Chunks: Handles document upload, chunking, and embedding generation using OpenAI.
- RAG using Supabase: Processes user queries using Supabase hybrid search and generates contextual responses using the OpenAI Text Generator.
✅ Prerequisites
⚙️ Setup
Step 1: Database Configuration
-- Enable vector support
CREATE EXTENSION IF NOT EXISTS vector;
-- Create files table
CREATE TABLE files (
  "id" UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  "createdAt" TIMESTAMP WITH TIME ZONE DEFAULT timezone('utc'::text, now()),
  "size" NUMERIC,
  "mimeType" TEXT,
  "encoding" TEXT,
  "originalName" TEXT,
  "downloadUrl" TEXT
);
-- Create chunks table
CREATE TABLE chunks (
  "id" TEXT PRIMARY KEY,
  "fileId" TEXT,
  "extractedText" TEXT,
  "embedding" vector(1536),
  "fts" TSVECTOR GENERATED ALWAYS AS (to_tsvector('english', "extractedText")) STORED
);
-- Index for full-text search
CREATE INDEX ON chunks USING GIN ("fts");
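As a quick sanity check on the schema above, a chunk record can be sketched in Python before insertion (the 1536 dimensions match OpenAI's `text-embedding-ada-002` output size; the `fts` column is generated by Postgres, so it is never supplied by the client; the id and text values here are placeholders):

```python
import uuid

EMBEDDING_DIM = 1536  # dimension of text-embedding-ada-002 vectors

def make_chunk_row(file_id: str, text: str, embedding: list) -> dict:
    """Build a record shaped like the chunks table; "fts" is computed by Postgres."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dim embedding, got {len(embedding)}")
    return {
        "id": str(uuid.uuid4()),
        "fileId": file_id,
        "extractedText": text,
        "embedding": embedding,
    }

row = make_chunk_row("file-123", "Paris is the capital of France.", [0.0] * EMBEDDING_DIM)
```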
Step 2: Add API Keys
1. Add your Supabase API integration key and API URL to the Supabase node
Click on the key icon > Add Key
2. Add your OpenAI API secret key, Supabase API integration key, and Supabase API URL to the Store Vector Embeddings node.
Click on the key icon > Add Key
Step 3: Deploy Search Function
Replace `chunks` with the name of the table you are working with.
-- Add the hybrid search function to your Supabase database
CREATE OR REPLACE FUNCTION hybrid_search(
  query_text TEXT,
  query_embedding VECTOR(1536),
  match_count INT,
  full_text_weight FLOAT = 1,
  semantic_weight FLOAT = 1,
  rrf_k INT = 50
)
RETURNS SETOF chunks
LANGUAGE SQL
AS $$
WITH full_text AS (
  SELECT "id",
         ROW_NUMBER() OVER (ORDER BY ts_rank_cd("fts", websearch_to_tsquery('english', query_text)) DESC) AS rank_ix
  FROM chunks
  WHERE "fts" @@ websearch_to_tsquery('english', query_text)
  ORDER BY rank_ix
  LIMIT LEAST(match_count, 30) * 2
),
semantic AS (
  SELECT "id",
         ROW_NUMBER() OVER (ORDER BY "embedding" <#> query_embedding) AS rank_ix
  FROM chunks
  ORDER BY rank_ix
  LIMIT LEAST(match_count, 30) * 2
)
SELECT chunks.*
FROM full_text
FULL OUTER JOIN semantic ON full_text."id" = semantic."id"
JOIN chunks ON COALESCE(full_text."id", semantic."id") = chunks."id"
ORDER BY
  COALESCE(1.0 / (rrf_k + full_text.rank_ix), 0.0) * full_text_weight +
  COALESCE(1.0 / (rrf_k + semantic.rank_ix), 0.0) * semantic_weight
  DESC
LIMIT LEAST(match_count, 30);
$$;
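The scoring expression in the ORDER BY above is Reciprocal Rank Fusion (RRF): each result contributes 1/(rrf_k + rank) per search, weighted and summed. A minimal Python mirror of that logic, using the function's default weights and rrf_k:

```python
def rrf_score(ft_rank, sem_rank, full_text_weight=1.0, semantic_weight=1.0, rrf_k=50):
    """Combine full-text and semantic ranks the same way the SQL ORDER BY does.
    A None rank means the row was absent from that result set (COALESCE -> 0)."""
    ft = 1.0 / (rrf_k + ft_rank) if ft_rank is not None else 0.0
    sem = 1.0 / (rrf_k + sem_rank) if sem_rank is not None else 0.0
    return ft * full_text_weight + sem * semantic_weight

# A row ranked #1 by both searches outscores a row found only by one of them.
both_first = rrf_score(1, 1)
semantic_only = rrf_score(None, 1)
```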
🧪 Testing
1. Add Document Chunks Workflow
Send a POST request to the document processing endpoint:
curl -X POST https://your-buildship-url/add-document-chunks \
-H "Content-Type: multipart/form-data" \
-F "file=@sample.pdf"
The workflow will:
- Upload the document
- Split it into chunks
- Generate embeddings
- Store everything in Supabase
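The embedding step in the list above boils down to one OpenAI API call per chunk. A sketch of the request body sent to `POST https://api.openai.com/v1/embeddings` (the chunk text here is illustrative; authentication headers are omitted):

```python
import json

def embedding_request(chunk_text: str, model: str = "text-embedding-ada-002") -> dict:
    """Request body for OpenAI's embeddings endpoint: the model name and input text."""
    return {"model": model, "input": chunk_text}

body = json.dumps(embedding_request("First chunk of the uploaded document."))
```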
2. RAG using Supabase Workflow
- Click the "Test" button at the top of the RAG workflow
- Enter a test query:
{
  "userQuery": "What are the main topics discussed in the document?"
}
The workflow will:
- Generate embeddings for the query
- Perform hybrid search to find relevant chunks
- Generate a contextual response using GPT
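The hybrid search step is a call to the `hybrid_search` function deployed in Step 3. Supabase exposes Postgres functions over PostgREST at `/rest/v1/rpc/<function>`; a sketch of the argument payload for that call (the keys must match the SQL function's parameter names; the query text and zero embedding are placeholders):

```python
def hybrid_search_payload(query_text, query_embedding, match_count=10,
                          full_text_weight=1.0, semantic_weight=1.0, rrf_k=50):
    """Arguments for POST {SUPABASE_URL}/rest/v1/rpc/hybrid_search.
    Keys mirror the hybrid_search SQL function signature exactly."""
    return {
        "query_text": query_text,
        "query_embedding": query_embedding,
        "match_count": match_count,
        "full_text_weight": full_text_weight,
        "semantic_weight": semantic_weight,
        "rrf_k": rrf_k,
    }

payload = hybrid_search_payload("main topics", [0.0] * 1536)
```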
🎨 Customization
1. Search Parameters
Adjust these in the RAG workflow:
{
  "fullTextWeight": 1.0,  // Importance of keyword matching
  "semanticWeight": 1.0,  // Importance of semantic matching
  "matchCount": 10,       // Number of chunks to retrieve
  "rrfK": 50              // Reciprocal rank fusion constant
}
2. Document Processing
Configure in the document processing workflow:
{
  "chunkSize": 1000,                  // Characters per chunk
  "overlapSize": 100,                 // Overlap between chunks
  "model": "text-embedding-ada-002"   // Embedding model
}
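The `chunkSize` and `overlapSize` settings above can be sketched as a simple sliding window over characters (the template's actual splitter may differ, e.g. by respecting sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap_size: int = 100) -> list:
    """Split text into chunks of up to chunk_size characters, each starting
    overlap_size characters before the previous chunk ends."""
    if overlap_size >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - overlap_size
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap_size, 1), step)]

# 2500 characters with the default settings -> chunks of 1000, 1000, and 700.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample, chunk_size=1000, overlap_size=100)
```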
📙 Example:
Paris Travel Guide PDF 🌍
Use this sample PDF: Paris Travel Guide
This example demonstrates how RAG can enhance the travel experience. Once the guide is uploaded and chunked into Supabase, travelers can ask questions like:
- "What are the top attractions in Paris?"
- "How does tipping work in Parisian restaurants?"
- "What is the best time of year to visit Paris?"
With RAG, these queries are processed, and the system retrieves and generates detailed answers based on the Paris Travel Guide content.
💡 Use Cases
- Document Q&A Systems
{
"userQuery": "What are the key findings in section 3?"
}
- Research Analysis
{
"userQuery": "Summarize the methodology used in this paper"
}
- Knowledge Base Search
{
"userQuery": "Find all references to security protocols"
}
🔗 Resources