RAG Application Documentation

Overview

This Retrieval-Augmented Generation (RAG) application integrates Azure OpenAI, Azure AI Search (formerly Azure Cognitive Search), and Azure Blob Storage to enable document ingestion, vector embedding, indexing, and querying of content using natural language.

What This App Is

This is a Retrieval-Augmented Generation (RAG) system using:

  1. Azure OpenAI for embeddings and chat completions
  2. Azure AI Search for vector indexing and retrieval
  3. Azure Blob Storage for source document storage

It allows you to ask questions about your documents and receive GPT-based answers grounded in your content, not just pretraining data.


Why It’s Useful

  1. Contextual Accuracy: answers are grounded in your own documents rather than only the model's pretraining data.
  2. Enterprise Search Capability: vector search over an Azure AI Search index retrieves the most relevant passages across your content.
  3. Private and Secure: documents and the search index remain within your own Azure subscription.
  4. Pluggable and Extensible: ingestion, indexing, and query handling are separate steps, so each can be swapped or extended independently.


Use Cases

File Structure

Prerequisites

Setup Instructions

  1. Install the Python dependencies:

         pip install -r requirements.txt

  2. Ingest documents from Blob Storage into the search index:

         python ingest.py

  3. Start the API server:

         uvicorn app:app --reload
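If you need to create requirements.txt from scratch, a minimal sketch might look like the following. The exact package set is an assumption based on the services this app uses; pin versions as appropriate for your environment.

```text
fastapi
uvicorn
openai
azure-storage-blob
azure-search-documents
azure-identity
```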
    

How It Works

Document Ingestion (ingest.py)

  1. Connects to Azure Blob Storage.
  2. Reads files, extracts content.
  3. Generates vector embeddings using the OpenAI embedding model.
  4. Uploads documents + vectors to Azure AI Search index.
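The ingestion steps above can be sketched as below. This is a hedged sketch, not the actual ingest.py: the environment variable names, index field names (id, content, sourcefile, contentVector), and deployment names are assumptions, and it treats blobs as plain text (a real pipeline would extract text per file type, e.g. from .docx). The SDK imports are deferred so the pure helpers can be used and tested without the Azure packages installed.

```python
import os

def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split extracted text into overlapping chunks so each chunk stays
    within the embedding model's input limits."""
    chunks, start = [], 0
    step = size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        start += step
    return chunks

def to_search_document(blob_name: str, index: int, chunk: str,
                       embedding: list[float]) -> dict:
    """Shape one chunk into a document for the Azure AI Search index.
    Field names are assumptions and must match your index schema;
    document keys may not contain '.' or '/', hence the replacements."""
    return {
        "id": f"{blob_name}-{index}".replace(".", "_").replace("/", "_"),
        "content": chunk,
        "sourcefile": blob_name,
        "contentVector": embedding,
    }

def ingest():
    # Deferred imports: the helpers above stay usable without the SDKs.
    from azure.storage.blob import BlobServiceClient
    from azure.search.documents import SearchClient
    from azure.core.credentials import AzureKeyCredential
    from openai import AzureOpenAI

    # 1. Connect to Azure Blob Storage.
    blob_service = BlobServiceClient.from_connection_string(
        os.environ["AZURE_STORAGE_CONNECTION_STRING"])
    container = blob_service.get_container_client(
        os.environ["AZURE_STORAGE_CONTAINER"])
    search = SearchClient(
        os.environ["AZURE_SEARCH_ENDPOINT"],
        os.environ["AZURE_SEARCH_INDEX"],
        AzureKeyCredential(os.environ["AZURE_SEARCH_KEY"]))
    aoai = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY"],
        api_version="2024-02-01")

    docs = []
    for blob in container.list_blobs():
        # 2. Read each file and extract content (simplified to plain text).
        text = container.download_blob(blob.name).readall() \
                        .decode("utf-8", errors="ignore")
        for i, chunk in enumerate(chunk_text(text)):
            # 3. Generate a vector embedding per chunk.
            emb = aoai.embeddings.create(
                model=os.environ["AZURE_OPENAI_EMBED_DEPLOYMENT"],
                input=chunk,
            ).data[0].embedding
            docs.append(to_search_document(blob.name, i, chunk, emb))
    # 4. Upload documents + vectors to the index.
    search.upload_documents(documents=docs)
```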

Query Handling (app.py)

  1. Accepts a query via POST `/query`.
  2. Embeds the query, runs vector search on the index.
  3. Retrieves the most relevant documents.
  4. Builds a prompt with context and sends to OpenAI chat completion model.
  5. Returns the answer and source document names.
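Steps 2-5 can be sketched as below. This is a hedged sketch, not the actual app.py: the client objects, deployment names, retrieval depth (top 3), field names, and prompt wording are assumptions, and the FastAPI wiring around it is omitted. The SDK import is deferred so the pure helpers can be used and tested standalone.

```python
def build_prompt(question: str, retrieved: list[dict]) -> list[dict]:
    """Build chat messages that ground the model in retrieved chunks.
    Expects documents with 'content' and 'sourcefile' fields."""
    context = "\n\n".join(
        f"[{doc['sourcefile']}]\n{doc['content']}" for doc in retrieved)
    system = ("Answer using ONLY the context below. If the answer is not "
              "in the context, say you don't know.\n\n"
              f"Context:\n{context}")
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

def source_names(retrieved: list[dict]) -> list[str]:
    """Deduplicated source file names, in retrieval order,
    for the response's 'sources' field."""
    seen: list[str] = []
    for doc in retrieved:
        if doc["sourcefile"] not in seen:
            seen.append(doc["sourcefile"])
    return seen

def answer_query(question: str, search_client, aoai_client,
                 embed_deployment: str, chat_deployment: str) -> dict:
    """Embed the query, run vector search, build a grounded prompt,
    and return the answer plus its sources."""
    from azure.search.documents.models import VectorizedQuery  # deferred

    # 2. Embed the query and run vector search on the index.
    emb = aoai_client.embeddings.create(
        model=embed_deployment, input=question).data[0].embedding
    results = search_client.search(
        search_text=None,
        vector_queries=[VectorizedQuery(
            vector=emb, k_nearest_neighbors=3, fields="contentVector")])
    # 3. Retrieve the most relevant documents.
    retrieved = [{"content": r["content"], "sourcefile": r["sourcefile"]}
                 for r in results]
    # 4. Build a prompt with context and call the chat completion model.
    completion = aoai_client.chat.completions.create(
        model=chat_deployment,
        messages=build_prompt(question, retrieved))
    # 5. Return the answer and source document names.
    return {"answer": completion.choices[0].message.content,
            "sources": source_names(retrieved)}
```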

Sample Request

    POST /query
    {
      "question": "What are the limitations of AFFIRM?"
    }

Sample Response

    {
      "answer": "AFFIRM lacks automated compliance checks...",
      "sources": ["AFFIRM_Notes_-_Max.docx", "Affirm_Bullet_Points.docx"]
    }

Additional Notes

  1. Azure Blob Storage requires a container for document storage.
  2. Azure Key Vault is optional for local development but recommended for managing secrets.
  3. Azure App Service is needed for hosting if you are not running locally.