RAG Implementation with Supabase 🔍
Overview
This template provides an intelligent document search and query system using Retrieval Augmented Generation (RAG) with Supabase and OpenAI. Think of it as giving your documents a smart brain that can understand and answer questions about their content!
Main Features:
- Smart Document Processing: Automatically chunks and processes your documents for optimal retrieval
- AI-Powered Search: Combines keyword and semantic search for superior results
- Intelligent Responses: Generates contextual answers using GPT models
- Scalable Storage: Leverages Supabase for efficient document and vector storage
- Hybrid Search: Merges full-text and vector similarity rankings using Reciprocal Rank Fusion (RRF)
This template contains the following two workflows:
- Add Document Chunks: Handles document upload, chunking, and embedding generation using OpenAI.
- RAG using Supabase: Processes user queries using Supabase hybrid search and generates contextual responses using the OpenAI Text Generator.
✅ Prerequisites
⚙️ Setup
Step 1: Database Configuration
-- Enable vector support
CREATE EXTENSION IF NOT EXISTS vector;
-- Create files table
CREATE TABLE files (
  "id" UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  "createdAt" TIMESTAMP WITH TIME ZONE DEFAULT timezone('utc'::text, now()),
  "size" NUMERIC,
  "mimeType" TEXT,
  "encoding" TEXT,
  "originalName" TEXT,
  "downloadUrl" TEXT
);
-- Create chunks table
CREATE TABLE chunks (
  "id" TEXT PRIMARY KEY,
  "fileId" TEXT,
  "extractedText" TEXT,
  "embedding" vector(1536),
  "fts" TSVECTOR GENERATED ALWAYS AS (to_tsvector('english', "extractedText")) STORED
);
-- Index for full-text search
CREATE INDEX ON chunks USING GIN ("fts");
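As a quick sanity check on the schema above, a chunk record can be sketched in Python before insertion (the 1536 dimensions match OpenAI's `text-embedding-ada-002` output size; the `fts` column is generated by Postgres, so it is never supplied by the client; the id and text values here are placeholders):

```python
import uuid

EMBEDDING_DIM = 1536  # dimension of text-embedding-ada-002 vectors

def make_chunk_row(file_id: str, text: str, embedding: list) -> dict:
    """Build a record shaped like the chunks table; "fts" is computed by Postgres."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dim embedding, got {len(embedding)}")
    return {
        "id": str(uuid.uuid4()),
        "fileId": file_id,
        "extractedText": text,
        "embedding": embedding,
    }

row = make_chunk_row("file-123", "Paris is the capital of France.", [0.0] * EMBEDDING_DIM)
```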
Step 2: Add API Keys
1. Add your Supabase API integration key and API URL to the Supabase node
Click on the key icon > Add Key
2. Add your OpenAI API secret key, Supabase API integration key, and Supabase API URL to the Store Vector Embeddings node.
Click on the key icon > Add Key
Step 3: Deploy Search Function
Replace `chunks` with the name of the table you are working with.
-- Add the hybrid search function to your Supabase database
CREATE OR REPLACE FUNCTION hybrid_search(
  query_text TEXT,
  query_embedding VECTOR(1536),
  match_count INT,
  full_text_weight FLOAT = 1,
  semantic_weight FLOAT = 1,
  rrf_k INT = 50
)
RETURNS SETOF chunks
LANGUAGE SQL
AS $$
WITH full_text AS (
  SELECT "id",
         ROW_NUMBER() OVER (ORDER BY ts_rank_cd("fts", websearch_to_tsquery('english', query_text)) DESC) AS rank_ix
  FROM chunks
  WHERE "fts" @@ websearch_to_tsquery('english', query_text)
  ORDER BY rank_ix
  LIMIT LEAST(match_count, 30) * 2
),
semantic AS (
  SELECT "id",
         ROW_NUMBER() OVER (ORDER BY "embedding" <#> query_embedding) AS rank_ix
  FROM chunks
  ORDER BY rank_ix
  LIMIT LEAST(match_count, 30) * 2
)
SELECT chunks.*
FROM full_text
FULL OUTER JOIN semantic ON full_text."id" = semantic."id"
JOIN chunks ON COALESCE(full_text."id", semantic."id") = chunks."id"
ORDER BY
  COALESCE(1.0 / (rrf_k + full_text.rank_ix), 0.0) * full_text_weight +
  COALESCE(1.0 / (rrf_k + semantic.rank_ix), 0.0) * semantic_weight
  DESC
LIMIT LEAST(match_count, 30);
$$;
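The scoring expression in the ORDER BY above is Reciprocal Rank Fusion (RRF): each result contributes 1/(rrf_k + rank) per search, weighted and summed. A minimal Python mirror of that logic, using the function's default weights and rrf_k:

```python
def rrf_score(ft_rank, sem_rank, full_text_weight=1.0, semantic_weight=1.0, rrf_k=50):
    """Combine full-text and semantic ranks the same way the SQL ORDER BY does.
    A None rank means the row was absent from that result set (COALESCE -> 0)."""
    ft = 1.0 / (rrf_k + ft_rank) if ft_rank is not None else 0.0
    sem = 1.0 / (rrf_k + sem_rank) if sem_rank is not None else 0.0
    return ft * full_text_weight + sem * semantic_weight

# A row ranked #1 by both searches outscores a row found only by one of them.
both_first = rrf_score(1, 1)
semantic_only = rrf_score(None, 1)
```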
🧪 Testing
1. Add Document Chunks Workflow
Send a POST request to the document processing endpoint:
curl -X POST https://your-buildship-url/add-document-chunks \
-H "Content-Type: multipart/form-data" \
-F "file=@sample.pdf"
The workflow will:
- Upload the document
- Split it into chunks
- Generate embeddings
- Store everything in Supabase
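The embedding step in the list above boils down to one OpenAI API call per chunk. A sketch of the request body sent to `POST https://api.openai.com/v1/embeddings` (the chunk text here is illustrative; authentication headers are omitted):

```python
import json

def embedding_request(chunk_text: str, model: str = "text-embedding-ada-002") -> dict:
    """Request body for OpenAI's embeddings endpoint: the model name and input text."""
    return {"model": model, "input": chunk_text}

body = json.dumps(embedding_request("First chunk of the uploaded document."))
```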
2. RAG using Supabase Workflow
- Click the "Test" button at the top of the RAG workflow
- Enter a test query:
{
  "userQuery": "What are the main topics discussed in the document?"
}
The workflow will:
- Generate embeddings for the query
- Perform hybrid search to find relevant chunks
- Generate a contextual response using GPT
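The hybrid search step is a call to the `hybrid_search` function deployed in Step 3. Supabase exposes Postgres functions over PostgREST at `/rest/v1/rpc/<function>`; a sketch of the argument payload for that call (the keys must match the SQL function's parameter names; the query text and zero embedding are placeholders):

```python
def hybrid_search_payload(query_text, query_embedding, match_count=10,
                          full_text_weight=1.0, semantic_weight=1.0, rrf_k=50):
    """Arguments for POST {SUPABASE_URL}/rest/v1/rpc/hybrid_search.
    Keys mirror the hybrid_search SQL function signature exactly."""
    return {
        "query_text": query_text,
        "query_embedding": query_embedding,
        "match_count": match_count,
        "full_text_weight": full_text_weight,
        "semantic_weight": semantic_weight,
        "rrf_k": rrf_k,
    }

payload = hybrid_search_payload("main topics", [0.0] * 1536)
```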
🎨 Customization
1. Search Parameters
Adjust these in the RAG workflow:
{
  "fullTextWeight": 1.0,  // Importance of keyword matching
  "semanticWeight": 1.0,  // Importance of semantic matching
  "matchCount": 10,       // Number of chunks to retrieve
  "rrfK": 50              // Reciprocal rank fusion constant
}
2. Document Processing
Configure in the document processing workflow:
{
  "chunkSize": 1000,                  // Characters per chunk
  "overlapSize": 100,                 // Overlap between chunks
  "model": "text-embedding-ada-002"   // Embedding model
}
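The `chunkSize` and `overlapSize` settings above can be sketched as a simple sliding window over characters (the template's actual splitter may differ, e.g. by respecting sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap_size: int = 100) -> list:
    """Split text into chunks of up to chunk_size characters, each starting
    overlap_size characters before the previous chunk ends."""
    if overlap_size >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - overlap_size
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap_size, 1), step)]

# 2500 characters with the default settings -> chunks of 1000, 1000, and 700.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample, chunk_size=1000, overlap_size=100)
```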
📙 Example:
Paris Travel Guide PDF 🌍
Use this sample PDF: Paris Travel Guide
This example demonstrates how RAG can enhance the travel experience. Once the guide is uploaded and chunked into Supabase, travelers can ask questions like:
- "What are the top attractions in Paris?"
- "How does tipping work in Parisian restaurants?"
- "What is the best time of year to visit Paris?"
With RAG, these queries are processed, and the system retrieves and generates detailed answers based on the Paris Travel Guide content.
💡 Use Cases
- Document Q&A Systems
{
"userQuery": "What are the key findings in section 3?"
}
- Research Analysis
{
"userQuery": "Summarize the methodology used in this paper"
}
- Knowledge Base Search
{
"userQuery": "Find all references to security protocols"
}
🔗 Resources