Skip to main content

Command Palette

Search for a command to run...

🧠 Demystifying RAG : A Theoretical Introduction

Updated
4 min read
🧠 Demystifying RAG : A Theoretical Introduction

So you’ve played around with ChatGPT and other LLMs, maybe even built a tool using OpenAI’s API. But when people throw around the term “RAG” (Retrieval-Augmented Generation), it feels like there’s some next-level thing happening.

Let’s simplify it. Here's the lowdown — dev-to-dev. 🤝


1️⃣ Simple LLM – Great at reasoning, but poor at data

LLMs like GPT-3.5 or GPT-4 are trained on a massive corpus of data till a certain point (e.g., 2024). When you ask them questions, they answer based on patterns and data they remember from training.

❌ But they don’t know anything about:

  • Real-time data

  • Your internal product documentation

  • A CSV you just uploaded

They’re like that friend who remembers everything from college but has zero clue about your current work.


2️⃣ Basic Agents – LLM + Tools, but context is volatile

Now let’s plug in tools — search APIs, file readers, calculators, etc. This becomes an Agent setup. The LLM decides:

  • What tool to use

  • In what sequence

  • How to format the input/output

Sounds cool? It is… until it isn’t.

⚠️ Context breaks fast.
With too many tools and steps, the LLM loses track of what’s been done already. You get hallucinations, redundant steps, or just broken logic. Some times, it even forgets the user's name! Ha-ha!


3️⃣ Let’s Talk About “Context”

Think of context as the shared memory across steps.

// User: What's my next meeting?
-> LLM: checks calendar
-> LLM: "You have a sync at 4PM with design team."

// User: Cancel it.
-> If context is gone: "What are you talking about?"
-> If context is intact: cancels the 4PM meeting

LLMs aren’t great at long-term memory out of the box, so unless you handle this properly (via prompt engineering or memory frameworks), things fall apart.


4️⃣ Enter RAG – Modular, scalable, context-preserving systems

RAG is more than just a technique - it's a complete system. To make it work, you need to understand the specific problem you're trying to solve and adapt it to that problem. This is similar to designing a system, where you need to understand the problem before you can create a good solution. Here’s what makes it different:

✅ System pipeline with multiple components -

  1. Preprocessor

    • Chunk large documents (PDFs, emails, websites)

    • Convert to vector embeddings

  2. Retriever

    • For every user query, find relevant chunks using similarity search (vector DBs like Pinecone, Qdrant, etc.)

    • Enrich query to gather sufficient context information

  3. Prompt Composer

    • Combine the retrieved context + user query into an intelligent prompt
  4. LLM Generator

    • Feed composed prompt to LLM and get smart output
  5. Output Validator

    • Judge, validate, self-correct on low confidence, etc.

🔄 Continuous Improvement

  • RAG setups can be improved incrementally: better chunking, smarter retrieval, context filtering, reranking, etc.

  • It’s like evolving software – push patches, make it better.

  • We can make RAG systems check if their own answer is good or not, and if it's not, they can try again with better context.


🧠 LLM vs Agent vs RAG – TL;DR for Devs

SetupWhat's in itProsCons
LLMJust the modelSimple Q&AStatic knowledge
AgentLLM + toolsMulti-step logicEasily loses context
RAGModular pipeline with retrievalAccurate, scalable, debuggableNeeds infra + effort

🔧 Here is an Analogy

  • LLM: A smart intern who read everything but has no internet access

  • Agent: That intern with Google and a calculator, but no notes

  • RAG: Intern + Google + personal notes + checklist + sanity checker


And that’s the core idea of RAG. Once you get the architecture, you can plug-n-play your own data, swap or add components, and iterate better answers.

Hope you enjoyed reading this! 🚀

11 views

More from this blog

Rushikesh Chaudhari

5 posts

I am an enthusiastic person who loves to learn new things and apply them somewhere❤️. Along with exploring new things, I also loves to share my experiences with others❤️. So I write technical blogs❤️.