Agentic RAG: Revolutionizing Data Retrieval and Analysis

8th July 2024


One of our clients, Codeblu, was founded by doctors and nurses and helps healthcare workers manage stress with innovative products and services. They wanted us to validate their product idea: a chatbot that could intelligently answer questions based on a set of documents they own. The challenge was that these documents contained both unstructured data (summaries, notes, publications, etc.) and structured data (tables, charts, etc.).

After exploring a handful of solutions, we chose to build a Retrieval-Augmented Generation (RAG) application, ultimately enhancing it with Agentic RAG to address specific limitations. This document outlines our journey, the solutions we tried, and the final implementation that met our client’s needs.

What is RAG?

RAG, or Retrieval-Augmented Generation, is a hybrid approach that combines the strengths of retrieval-based and generation-based models. It works by retrieving relevant documents or data points from a database (retrieval) and then generating a coherent response based on this information (generation). Because the model answers from retrieved material rather than from memory alone, the responses tend to be more relevant and contextually accurate, drawing from a rich pool of structured and unstructured data.

How RAG Works

  1. Query Processing: The user inputs a query.
  2. Document Retrieval: The model retrieves relevant documents or data points from the database, most often a vector database.
  3. Response Generation: The retrieved information is then used by the generative model to formulate a response.
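
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the system we built: the documents are made-up placeholders, the "embedding" is a simple bag-of-words count standing in for a real embedding model and vector database, and the final call to a generative model is left out, so the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for the client's documents.
DOCUMENTS = [
    "NEET PG applications are submitted online through the official portal.",
    "The exam is conducted at designated examination centers across India.",
    "Stress management workshops are available for healthcare workers.",
]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 1) -> list[str]:
    """Step 2: rank documents by similarity to the query and keep the top k."""
    query_vec = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 3 input: the retrieved text is placed in front of the question."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


query = "How are NEET PG applications submitted?"  # Step 1: the user query
prompt = build_prompt(query, retrieve(query))
```

In a production setup, `embed` would call an embedding model and `retrieve` would query a vector database, but the flow of query, retrieval, and prompt assembly stays the same.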

While RAG is powerful, it has a notable limitation: it cannot reliably tell whether a query should be answered from structured or unstructured data. This distinction is crucial because structured data (like databases) and unstructured data (like text documents) have different characteristics and retrieval methods.

Structured Data Queries:

  • “What are all the institutes in Pune?”
  • “How many seats are available for Emergency Medicine in BJ Medical College?”
  • “Will I be able to get into the Radiology program for my master's if my rank is around 30000?”

These queries typically require precise lookups or numerical computation over structured data, such as data stored in databases, charts, and tables.
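
To make the contrast concrete, here is a small sketch of how questions like these map to exact database queries rather than text search. Everything in it is an assumption for illustration: the `seats` table, its columns, and the rows ("Institute A", "Institute B", and the seat counts) are invented placeholders, not the client's data.

```python
import sqlite3

# In-memory database with a hypothetical schema and placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seats (institute TEXT, city TEXT, specialty TEXT, seat_count INTEGER)"
)
conn.executemany(
    "INSERT INTO seats VALUES (?, ?, ?, ?)",
    [
        ("Institute A", "Pune", "Emergency Medicine", 4),
        ("Institute A", "Pune", "Radiology", 2),
        ("Institute B", "Mumbai", "Radiology", 3),
    ],
)


def institutes_in(city: str) -> list[str]:
    """Answers questions of the form 'What are all the institutes in <city>?'."""
    rows = conn.execute(
        "SELECT DISTINCT institute FROM seats WHERE city = ? ORDER BY institute",
        (city,),
    ).fetchall()
    return [row[0] for row in rows]


def seats_available(institute: str, specialty: str) -> int:
    """Answers 'How many seats are available for <specialty> in <institute>?'."""
    row = conn.execute(
        "SELECT COALESCE(SUM(seat_count), 0) FROM seats "
        "WHERE institute = ? AND specialty = ?",
        (institute, specialty),
    ).fetchone()
    return row[0]
```

A similarity search over text chunks can return a passage that mentions seats, but only a query like this returns the exact figure, which is why such questions need a different retrieval path.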

Unstructured Data Queries:

  • “What is the application process for NEET PG?”
  • “How many examination centers are available for the NEET PG exam?”

These queries require detailed, context-rich information that is usually found in unstructured data sources like research papers, industry reports, and meeting minutes. If the RAG model only pulled structured data, the responses could lack depth and contextual relevance.

Our Solution: Agentic RAG

To address this limitation, we implemented Agentic RAG. This method enhances the basic RAG approach by incorporating an intelligent agent that can understand the nature of the query and determine the appropriate data source for the response.

Agentic RAG utilizes a combination of natural language processing (NLP) techniques and machine learning algorithms to classify queries. It can dynamically decide whether to pull information from structured databases or unstructured text corpora, ensuring that the responses are not only accurate but also relevant to the context of the query.

Implementing Agentic RAG with OpenAI

  1. OpenAI’s Assistant: We used OpenAI’s Assistant APIs as the main component of our RAG system, ensuring high-quality, human-like responses.
  2. Function Calling: This feature allowed us to integrate various functions that the assistant could call upon to process specific tasks, such as querying databases or retrieving documents.
  3. File Retrieval: We developed a file retrieval system to efficiently access and retrieve documents from our unstructured data sources.
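
As a rough sketch of how function calling fits together, the snippet below declares a tool list in the JSON shape used for OpenAI function tools and shows a local dispatcher that runs whichever function the assistant requests. The function `get_seat_count`, its parameters, and the `REGISTRY` and `dispatch` helpers are hypothetical names for illustration, and `file_search` is assumed to be the name of the built-in file tool in the API version in use. No API call is made here; the arguments are passed as a JSON string because that is how the model supplies them.

```python
import json

# Assumed tool list for an assistant. "get_seat_count" is a hypothetical function.
TOOLS = [
    {"type": "file_search"},  # built-in file tool (name assumed for the API version used)
    {
        "type": "function",
        "function": {
            "name": "get_seat_count",
            "description": "Return the number of seats for a specialty at an institute.",
            "parameters": {
                "type": "object",
                "properties": {
                    "institute": {"type": "string"},
                    "specialty": {"type": "string"},
                },
                "required": ["institute", "specialty"],
            },
        },
    },
]


def get_seat_count(institute: str, specialty: str) -> dict:
    # Placeholder: a real version would query the structured data store.
    return {"institute": institute, "specialty": specialty, "seats": None}


# Maps function names the model may request to local Python callables.
REGISTRY = {"get_seat_count": get_seat_count}


def dispatch(name: str, arguments_json: str) -> str:
    """Run a requested function and return its result as a JSON string."""
    fn = REGISTRY.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        return json.dumps(fn(**json.loads(arguments_json)))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or arguments that do not match the function signature.
        return json.dumps({"error": str(exc)})
```

Returning errors as JSON instead of raising lets the assistant see what went wrong and retry or explain, rather than leaving the run without a tool output.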

How Agentic RAG Works

  1. Query Classification: The agent, in our case the OpenAI Assistant, classifies incoming queries to determine if they require structured or unstructured data.
  2. Data Retrieval: Based on the classification, the system retrieves the necessary information from the appropriate document or calls the relevant function to fetch the data.
  3. Response Generation: OpenAI’s assistant generates a response using the retrieved information, ensuring it is appropriate.
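
The routing flow above can be illustrated with a simplified stand-in. In our system the assistant itself makes the routing decision; the keyword check below is only a placeholder for that decision so the control flow can be run locally. The cue words, `query_database`, and `search_documents` are hypothetical, and both handlers return dummy strings instead of real data.

```python
import re
from typing import Callable

# Hypothetical cue words; a stand-in for the assistant's own judgement.
STRUCTURED_CUES = {"seats", "rank", "institutes"}


def classify(query: str) -> str:
    """Step 1: label the query as 'structured' or 'unstructured'."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return "structured" if words & STRUCTURED_CUES else "unstructured"


def query_database(query: str) -> str:
    # Placeholder for a function call against the structured data.
    return f"[rows from a database function for: {query}]"


def search_documents(query: str) -> str:
    # Placeholder for file retrieval over the unstructured documents.
    return f"[passages from file retrieval for: {query}]"


ROUTES: dict[str, Callable[[str], str]] = {
    "structured": query_database,
    "unstructured": search_documents,
}


def retrieve(query: str) -> tuple[str, str]:
    """Step 2: send the query to the data source chosen by the classifier."""
    route = classify(query)
    return route, ROUTES[route](query)
```

For example, `retrieve("What are all the institutes in Pune?")` takes the structured route, while `retrieve("What is the application process for NEET PG?")` goes to document search. A keyword list like this is brittle, which is the reason for delegating the decision to the assistant.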


Our journey began with the client’s need for a sophisticated chat application. After multiple iterations and extensive testing, we found that the RAG approach, enhanced by intelligent agents, was the best solution. By integrating OpenAI’s Assistants, we created an application that provides accurate and relevant answers from both structured and unstructured data.

RAG vs. Agentic RAG

Better Accuracy: Our system delivered more accurate answers using Agentic RAG. We were able to reduce hallucinations by 95%, which is great for an MVP-stage product.

Faster Responses: We reduced the response time and made the user experience much smoother as the LLM now knows precisely when to access embedded documents and when to call functions.

This project not only helped validate Codeblu’s idea for a chatbot but also showed how powerful combining cutting-edge AI solutions can be. The success of this implementation highlights Codewalla’s ability to tackle challenges and deliver results.
