Vector DB + RAG Maker

Tags: python, langchain, huggingface, chromadb
Introducing a vector database and Retrieval-Augmented Generation (‘RAG’) system for coding in R, designed to provide bleeding-edge responses from curated documentation.
Published: March 7, 2025

Boosting LLM Performance with Vector DB + RAG Systems

If you’re looking to create your own knowledge base from technical documentation, check out my newly released GitHub project. My Vector DB and RAG Maker tool offers a straightforward yet powerful way to process, store, and query documentation using vector embeddings and Retrieval-Augmented Generation (“RAG”). In my example, I apply it to recent R documentation and tutorials, but the same approach suits individuals and enterprises alike.

Benefits of Vector DB + RAG vs Direct LLM Querying

Let’s say you’re working with specialized documentation (e.g., 30 books on the R programming language, applied statistics, and machine learning) and you want your LLM to draw on this curated information when it responds. The benefits of using a vector database with RAG, instead of directly querying an LLM, are substantial:

1. Significantly Lower API Costs

Direct LLM queries with large context windows can get expensive quickly, especially when working with extensive documentation. The vector DB approach sends only the most relevant chunks to the LLM, drastically reducing token usage and API costs.
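To make the cost argument concrete, here is a back-of-the-envelope sketch. Every number in it (the per-token price, the document size, the chunk size, and the number of retrieved chunks) is an illustrative assumption, not a measured value or a quoted Anthropic price, so substitute your own figures.

```python
def input_cost_usd(input_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of the input (prompt) tokens for a single request."""
    return input_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical numbers, chosen only for illustration.
PRICE_PER_MILLION = 3.00   # assumed input price in USD per 1M tokens
FULL_DOC_TOKENS = 150_000  # assumed: pasting a whole book into the prompt
CHUNK_TOKENS = 500         # assumed: size of one retrieved chunk
TOP_K = 4                  # assumed: chunks retrieved per question
QUESTION_TOKENS = 50       # assumed: length of the question itself

# Direct querying sends the whole document with every question.
direct = input_cost_usd(FULL_DOC_TOKENS + QUESTION_TOKENS, PRICE_PER_MILLION)

# RAG sends only the top-k retrieved chunks with every question.
rag = input_cost_usd(TOP_K * CHUNK_TOKENS + QUESTION_TOKENS, PRICE_PER_MILLION)

print(f"direct: ${direct:.4f} per query")
print(f"rag:    ${rag:.4f} per query")
print(f"ratio:  {direct / rag:.0f}x")
```

The sketch only counts input tokens. Output tokens and the one-time cost of embedding the documents are left out, and both would need to be added for a full comparison.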

2. Superior Query Performance

RAG systems deliver faster responses given that:

  • The vector search quickly identifies the most relevant content
  • The LLM only processes small, highly relevant chunks rather than entire documents
  • Document embeddings are computed once and stored, so each new query only needs to embed the question itself
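The retrieval step in the list above can be sketched in a few lines of plain Python. This is a toy illustration, not the tool's implementation: the `embed` function is a bag-of-words stand-in for a real embedding model, the linear scan stands in for the vector database's index, and the sample chunks are made up.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Hypothetical documentation chunks, embedded once at ingest time.
chunks = [
    "dplyr mutate adds new columns to a data frame",
    "ggplot2 geom_point draws a scatter plot",
    "purrr map applies a function to each element of a list",
]
index = [(c, embed(c)) for c in chunks]

# At query time only the question is embedded; the stored vectors are reused.
results = top_k("how do I add new columns with mutate", index, k=1)
print(results)
```

Only the chunks returned by `top_k` would be passed on to the LLM, which is what keeps the prompt small.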

3. Scalability

As your collection grows (imagine adding more specialized packages or books to your library), a vector DB scales gracefully. Search performance stays roughly consistent whether you have 3 books or 300, while direct LLM approaches would require ever-larger context windows.

4. Higher Accuracy for Domain-Specific Questions

By retrieving specific, relevant context from your documentation, RAG systems provide more precise answers tailored to your exact versions and implementations rather than the LLM’s general knowledge.
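As a rough illustration of how retrieved context reaches the model, the sketch below assembles a prompt from a question and a list of retrieved chunks. The `build_prompt` helper, the prompt wording, and the sample chunk are hypothetical. The tool's actual prompt template is not described here and may differ.

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved context and the user's question into one prompt."""
    # Number each chunk so the answer can point back to its source.
    context = "\n\n".join(
        f"[Source {i}]\n{chunk}"
        for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical chunk, standing in for whatever the vector search returned.
prompt = build_prompt(
    "How do I add a column to a data frame?",
    ["mutate() creates new columns that are functions of existing variables."],
)
print(prompt)
```

Because the model is told to answer from the supplied context, its reply reflects the documentation you ingested rather than whatever it happened to learn in training.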

Try It Out!

I’ve designed this system to be as straightforward as possible to set up and use. You can have it running with your own documentation in just a few minutes:

git clone https://github.com/JavOrraca/Vector-DB-and-RAG-Maker.git
cd Vector-DB-and-RAG-Maker
pip install -r requirements.txt

After setting your API key, you’re ready to ingest your R-related files and start querying. Check out the full project on GitHub to see how you can transform your technical documentation into an intelligent knowledge base that provides bleeding-edge responses based on your exact resources.
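For readers curious about what ingestion involves, documents are typically split into overlapping chunks before they are embedded. The sketch below shows one simple way to do that with fixed-size character windows. It is an assumption-laden illustration: the `chunk_text` helper and its default sizes are hypothetical, and the tool itself may split documents differently (for example, with one of LangChain's text splitters).

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 5  # 50 characters of placeholder text
pieces = chunk_text(sample, chunk_size=20, overlap=5)
print(len(pieces), [len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.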

About this Tool

This tool was built in approximately 90 minutes with Anthropic’s Claude 3.7 Sonnet and Claude Code. My personal computer is a 2023 MacBook Air, so nothing fancy. I interacted with Claude Code via the macOS Terminal and incurred $2.28 in API calls to build the entire tool. I consider this first draft a huge success given how little effort it took to develop. Please feel free to reach out to me directly on LinkedIn or file a GitHub Issue if you encounter a bug or have a feature request.

The code for this tool was written in Python and makes use of LangChain, Hugging Face, and Chroma (for the vector database). The CLI tool expects you to supply an Anthropic API key so that it can call Claude 3.7 Sonnet as the external LLM. If you’re not quite set up yet for Anthropic API billing, Anthropic publishes information on API pricing and an FAQ on API billing.

Perhaps fine-tuning an LLM is the better long-term solution. There are similar tools out there in the wild; this one is just particularly easy for me to follow, and hopefully the same is true for other R and Python programmers. Eventually, I’d like to expand this tool into a general-purpose vector DB + RAG maker, but for now, happy Friday, and happy coding!