RAG Application Documentation
Overview
This Retrieval-Augmented Generation (RAG) application integrates Azure OpenAI, Azure AI Search (formerly Azure Cognitive Search), and Azure Blob Storage to enable document ingestion, vector embedding, indexing, and querying of content using natural language.
What This App Is
This is a Retrieval-Augmented Generation (RAG) system using:
Azure OpenAI for embeddings and completions
Azure AI Search for vector-based document retrieval
Azure Blob Storage to store source documents
FastAPI backend for HTTP-based querying
It allows you to ask questions about your documents and receive GPT-based answers grounded in your content — not just pretraining data.
Why It’s Useful
Contextual Accuracy
Answers are grounded in your own documents rather than the model's pretraining data, reducing hallucinated responses
Enterprise Search Capability
Uses semantic vector search to match concepts, not just keywords
Fast and scalable, with ranked retrieval
Private and Secure
Documents remain in your own Azure Blob Storage
Hosted in Azure, supports enterprise security and compliance
Pluggable and Extensible
Easily swap GPT models (`gpt-4`, `gpt-35-turbo`, etc.)
Add metadata, analytics, monitoring
Extend with frontend UI or APIs
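Semantic vector search works by embedding text into vectors and ranking documents by similarity to the embedded query. The following toy sketch illustrates the cosine-similarity ranking that the search index performs internally; the 3-dimensional "embeddings" here are made up for illustration and have nothing to do with real embedding model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by similarity to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors: doc_a points in nearly the same direction as the query.
docs = {"doc_a": [1.0, 0.0, 0.0], "doc_b": [0.7, 0.7, 0.0], "doc_c": [0.0, 1.0, 0.0]}
best = rank_documents([1.0, 0.1, 0.0], docs)[0][0]  # doc_a
```

In production this ranking happens inside Azure AI Search over high-dimensional embedding vectors; the app never computes similarity itself.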
Use Cases
Internal knowledge base Q&A
Compliance audit support
Customer support assistants
Product/engineering documentation lookup
Competitive research summarization
File Structure
app.py - FastAPI backend for query handling.
ingest.py - Extracts documents from Blob Storage, creates embeddings, and indexes them into AI Search.
requirements.txt - Python dependencies.
.env - Environment variable configuration (not included here for security).
README.md - Basic readme file.
request.json - Sample input format for queries.
Prerequisites
Python 3.10 or higher
Azure OpenAI deployment
Azure AI Search index (with vector search enabled)
Azure Blob Storage container with documents
Required environment variables:
`AZURE_OPENAI_ENDPOINT`
`AZURE_OPENAI_EMBEDDING_MODEL`
`AZURE_SEARCH_ENDPOINT`
`AZURE_SEARCH_KEY`
`AZURE_SEARCH_INDEX_NAME`
`AZURE_STORAGE_ACCOUNT_URL`
`AZURE_STORAGE_CONTAINER_NAME`
`AZURE_STORAGE_KEY`
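A small helper for failing fast when a required variable is missing can save debugging time. The variable names below match the list above; the `load_settings` function itself is illustrative and not part of the app.

```python
import os

REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_EMBEDDING_MODEL",
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_KEY",
    "AZURE_SEARCH_INDEX_NAME",
    "AZURE_STORAGE_ACCOUNT_URL",
    "AZURE_STORAGE_CONTAINER_NAME",
    "AZURE_STORAGE_KEY",
]

def load_settings(env=os.environ):
    """Return a dict of required settings, raising early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}
```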
Setup Instructions
Install dependencies:
pip install -r requirements.txt
Ingest and index your documents:
python ingest.py
Start the API server:
uvicorn app:app --reload
How It Works
Document Ingestion (ingest.py)
Connects to Azure Blob Storage.
Reads files, extracts content.
Generates vector embeddings using the OpenAI embedding model.
Uploads documents + vectors to Azure AI Search index.
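The chunk-and-embed step can be sketched as follows. This is an illustrative outline rather than the app's actual code: `embed` stands in for a call to the Azure OpenAI embeddings API, the field names are hypothetical, and the chunking here is a naive character-window split.

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into overlapping character windows for embedding."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

def build_index_documents(blob_name, text, embed):
    """Pair each chunk with its embedding, shaped for upload to the search index.

    `embed` is a placeholder for the Azure OpenAI embedding call
    (e.g. text-embedding-ada-002); the document schema is illustrative.
    """
    return [
        {
            "id": f"{blob_name}-{i}",
            "source": blob_name,
            "content": chunk,
            "content_vector": embed(chunk),
        }
        for i, chunk in enumerate(chunk_text(text))
    ]
```

In the real script these documents would be uploaded in batches via the Azure AI Search SDK.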
Query Handling (app.py)
Accepts a query via POST `/query`.
Embeds the query, runs vector search on the index.
Retrieves the most relevant documents.
Builds a prompt with context and sends to OpenAI chat completion model.
Returns the answer and source document names.
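The prompt-assembly step can be sketched like this. The template wording and the `{"source": ..., "content": ...}` shape are illustrative; the actual template in app.py may differ.

```python
def build_prompt(question, retrieved_docs):
    """Assemble a grounded prompt from retrieved chunks.

    `retrieved_docs` is assumed to be a list of {"source": ..., "content": ...}
    dicts, as returned by the vector search step.
    """
    context = "\n\n".join(
        f"[{doc['source']}]\n{doc['content']}" for doc in retrieved_docs
    )
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Instructing the model to refuse when the context lacks an answer is what keeps responses grounded rather than speculative.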
Sample Request
POST /query
{
"question": "What are the limitations of AFFIRM?"
}
Sample Response
{
"answer": "AFFIRM lacks automated compliance checks...",
"sources": ["AFFIRM_Notes_-_Max.docx", "Affirm_Bullet_Points.docx"]
}
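A minimal Python client for the endpoint above, using only the standard library. It assumes the server is running locally on uvicorn's default port; the `ask` helper is illustrative, not part of the app.

```python
import json
import urllib.request

def build_payload(question):
    """Encode the request body in the format the /query endpoint expects."""
    return json.dumps({"question": question}).encode("utf-8")

def ask(question, url="http://127.0.0.1:8000/query"):
    """POST a question to the /query endpoint and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(question),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The returned dict contains the `answer` and `sources` fields shown in the sample response.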
Additional Notes
This app assumes `.docx` files are uploaded to the Blob container.
Token and prompt limits apply when sending context to the OpenAI model.
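Because token and prompt limits apply, retrieved context may need truncation before it is sent to the model. A rough sketch using the common ~4-characters-per-token approximation (a real implementation would use an actual tokenizer such as tiktoken):

```python
def truncate_context(chunks, max_tokens=6000, chars_per_token=4):
    """Keep whole chunks, in ranked order, until the token budget is exhausted."""
    budget = max_tokens * chars_per_token
    kept, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > budget:
            break  # dropping lower-ranked chunks keeps the best context
        kept.append(chunk)
        used += len(chunk)
    return kept
```

Keeping whole chunks (rather than cutting mid-chunk) preserves coherent passages for the model to cite.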
This project requires the following Azure resources:
Azure AI Search with a Search Index and Vector Search Configuration
Azure OpenAI resources:
Text Embedding model (text-embedding-ada-002 used here)
Azure OpenAI chat completion model (gpt-4o used here)
Azure Blob Storage with container for document storage
Azure Key Vault (optional for development, but recommended for secret management)
Azure App Service (if not running locally)