Getting Started
The Knowledge Base
To get started with RAG, you first need a knowledge base.
I visited the Uganda Legal Information Institute (ULII), which is a free legal information service provided by the Law Reporting Unit of the Uganda Judiciary and a member of the global Free Access to Law community.
Through ULII, I've gathered the following key legal documents to serve as the knowledge base for my demo:
- The Constitution of the Republic of Uganda (1995, as amended)
- The Landlord and Tenant Act (2022)
The Storage Journey
Once I had embeddings working, I needed to store them. My first instinct: keep everything in memory.
const lawEmbeddings: LawEmbedding[] = [];
// Generate once
const embeddings = await generateLawEmbeddings();
lawEmbeddings.push(...embeddings);
// Use forever (or until the process restarts)
This worked! But then I thought: "What happens when I restart my application?"
Storage Evolution
Stage 1: In-Memory Array
const lawEmbeddings = [];
// Fast and simple, but gone when the process restarts
Stage 2: JSON File
// Save embeddings
fs.writeFileSync("embeddings.json", JSON.stringify(lawEmbeddings));
// Load embeddings
const lawEmbeddings = JSON.parse(fs.readFileSync("embeddings.json"));
Now my embeddings persist! But I noticed something: for 5 laws with 384-dimensional embeddings, my JSON file was already 30KB, roughly 6KB per embedding once every float is serialized as text. At that rate, 10,000 laws would mean around 60MB of JSON parsed on every startup.
Stage 3: The Precomputation Realization
Here's what I figured out:
Precomputation Phase (do this once):
- Take each law's text → embedding model → 384D vector
- Normalize all vectors to unit length
- Store vectors with their metadata
Query Phase (do this every search):
- User asks question → embedding model → 384D vector
- Compare against stored embeddings
- Return top matches
The expensive part (generating embeddings) happens once. Searching is just math on pre-computed vectors.
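To make that concrete, here's a minimal sketch of both phases in TypeScript. The names are mine: embedText stands in for whatever embedding model call you're using, and the LawEmbedding shape is an assumption for illustration, not a fixed schema.

// Stand-in for the embedding model call (assumed, not a real library function).
declare function embedText(text: string): Promise<number[]>;

interface LawEmbedding {
  lawId: string;
  vector: number[]; // 384D, normalized at precompute time
}

// Precompute: scale a vector to unit length so cosine similarity
// later reduces to a plain dot product.
function normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return vec.map((x) => x / norm);
}

// For unit vectors, cosine similarity is just the dot product.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Query: embed the question, score every stored vector, return the top matches.
async function search(question: string, store: LawEmbedding[], topK = 3) {
  const query = normalize(await embedText(question));
  return store
    .map((law) => ({ lawId: law.lawId, score: dot(query, law.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}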
Storage Decision Framework
I developed a mental model:
Small dataset (< 10K items):
- In-memory arrays work great
- Load from JSON on startup
- Total embedding data: < 50MB
- Search time: milliseconds
Medium dataset (10K - 100K items):
- JSON gets unwieldy
- Consider SQLite with blob storage (sketch after this framework)
- Or simple file-based caching
- Search time: seconds
Large dataset (100K+ items):
- Need proper vector database
- Specialized indexes (HNSW, IVF)
- Distributed storage
- Search time: milliseconds (with indexing)
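For that middle tier, here's a rough sketch of SQLite blob storage using the better-sqlite3 package (my choice for illustration; nothing in the earlier code commits to a particular driver). Packing a 384-dimensional vector as a Float32Array blob costs 1,536 bytes instead of roughly 6KB of JSON text.

import Database from "better-sqlite3";

const db = new Database("embeddings.db");
db.exec("CREATE TABLE IF NOT EXISTS laws (id TEXT PRIMARY KEY, vector BLOB)");

// Store: 384 floats packed as a compact binary blob.
function saveVector(id: string, vector: number[]): void {
  const blob = Buffer.from(new Float32Array(vector).buffer);
  db.prepare("INSERT OR REPLACE INTO laws (id, vector) VALUES (?, ?)").run(id, blob);
}

// Load: reinterpret the blob bytes as floats.
function loadVector(id: string): number[] {
  const row = db.prepare("SELECT vector FROM laws WHERE id = ?").get(id) as { vector: Buffer };
  const floats = new Float32Array(row.vector.buffer, row.vector.byteOffset, row.vector.byteLength / 4);
  return Array.from(floats);
}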
Why Vector Databases Do More
When I finally looked at how vector databases work, it all made sense:
ChromaDB, Pinecone, and Weaviate aren't just storage. They add:
- Optimized storage formats - compressed vectors, efficient serialization
- Specialized indexes - HNSW, IVF for fast similarity search
- Query optimization - caching, pre-filtering, batch processing
- Metadata filtering - search within subsets (e.g., "only laws from 2020")
- Distributed scaling - split data across machines
But the core concept? Still just "store normalized vectors, compare with cosine similarity."
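To see that in practice, here's roughly the same store-and-query flow with ChromaDB's JavaScript client, metadata filter included. This is a sketch of the client API as documented, so check the current chromadb docs before relying on it; lawVector and queryVector stand for the precomputed vectors from earlier.

import { ChromaClient } from "chromadb";

const client = new ChromaClient(); // assumes a Chroma server running locally
const collection = await client.getOrCreateCollection({ name: "laws" });

// Store: vector plus metadata, so queries can filter on it later.
await collection.add({
  ids: ["landlord-tenant-act-2022"],
  embeddings: [lawVector],
  metadatas: [{ year: 2022 }],
  documents: ["The Landlord and Tenant Act (2022)..."],
});

// Query: top 3 matches, restricted to laws from 2022.
const results = await collection.query({
  queryEmbeddings: [queryVector],
  nResults: 3,
  where: { year: 2022 },
});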
I could build demo products with JSON files and in-memory arrays. When I needed scale, I'd know exactly what problem I was solving by adding a vector database.