====== RAG Application Documentation ======
===== Overview =====
This Retrieval-Augmented Generation (RAG) application integrates Azure OpenAI, Azure AI Search (formerly Azure Cognitive Search), and Azure Blob Storage to enable document ingestion, vector embedding, indexing, and querying of content using natural language.
===== What This App Is =====
This is a **Retrieval-Augmented Generation (RAG)** system using:
* **Azure OpenAI** for embeddings and completions
* **Azure AI Search** (formerly Azure Cognitive Search) for vector-based document retrieval
* **Azure Blob Storage** to store source documents
* **FastAPI** backend for HTTP-based querying
It allows you to ask questions about your documents and receive GPT-based answers grounded in your content — not just pretraining data.
----
===== Why It’s Useful =====
==== Contextual Accuracy ====
* Augments GPT with your **domain-specific** documents
* Reduces hallucinations by grounding answers in retrieved source passages
==== Enterprise Search Capability ====
* Uses **semantic vector search** to match concepts, not just keywords
* Fast, scalable, and supports ranked retrieval
==== Private and Secure ====
* Documents remain in your **own Azure Blob Storage**
* Hosted in **Azure**, supports enterprise security and compliance
==== Pluggable and Extensible ====
* Easily swap GPT models (`gpt-4`, `gpt-35-turbo`, etc.)
* Add metadata, analytics, monitoring
* Extend with frontend UI or APIs
----
===== Use Cases =====
* Internal knowledge base Q&A
* Compliance audit support
* Customer support assistants
* Product/engineering documentation lookup
* Competitive research summarization
===== File Structure =====
* `app.py` - FastAPI backend for query handling.
* `ingest.py` - Extracts documents from Blob Storage, creates embeddings, and indexes them into Azure AI Search.
* `requirements.txt` - Python dependencies.
* `.env` - Environment variable configuration (not included here for security).
* `README.md` - Basic readme file.
* `request.json` - Sample input format for queries.
===== Prerequisites =====
* Python 3.10 or higher
* Azure OpenAI deployment
* Azure AI Search index (with vector search enabled)
* Azure Blob Storage container with documents
* Required environment variables:
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_EMBEDDING_MODEL`
- `AZURE_SEARCH_ENDPOINT`
- `AZURE_SEARCH_KEY`
- `AZURE_SEARCH_INDEX_NAME`
- `AZURE_STORAGE_ACCOUNT_URL`
- `AZURE_STORAGE_CONTAINER_NAME`
- `AZURE_STORAGE_KEY`
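A `.env` file covering the variables above might look like the following. All values are placeholders; substitute your own resource endpoints, keys, and names:

```
# Azure OpenAI (placeholder values -- replace with your own)
AZURE_OPENAI_ENDPOINT=https://your-openai-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-openai-key>
AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002

# Azure AI Search
AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net
AZURE_SEARCH_KEY=<your-search-admin-key>
AZURE_SEARCH_INDEX_NAME=rag-index

# Azure Blob Storage
AZURE_STORAGE_ACCOUNT_URL=https://yourstorageaccount.blob.core.windows.net
AZURE_STORAGE_CONTAINER_NAME=documents
AZURE_STORAGE_KEY=<your-storage-key>
```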
===== Setup Instructions =====
* Install dependencies:
  pip install -r requirements.txt
* Set environment variables in a `.env` file or export them manually.
* Run the ingest process to index documents:
  python ingest.py
* Launch the app locally:
  uvicorn app:app --reload
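Before running `ingest.py` or the app, it can help to fail fast if configuration is incomplete. A minimal sketch of such a check (the variable list mirrors the prerequisites above; `missing_vars` is an illustrative helper, not a function in the actual codebase):

```python
import os

# Variables this app expects (taken from the prerequisites section).
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_EMBEDDING_MODEL",
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_KEY",
    "AZURE_SEARCH_INDEX_NAME",
    "AZURE_STORAGE_ACCOUNT_URL",
    "AZURE_STORAGE_CONTAINER_NAME",
    "AZURE_STORAGE_KEY",
]

def missing_vars(env) -> list[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# At startup (hedged example usage):
#   missing = missing_vars(os.environ)
#   if missing:
#       raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```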
===== How It Works =====
==== Document Ingestion (ingest.py) ====
- Connects to Azure Blob Storage.
- Reads files, extracts content.
- Generates vector embeddings using the OpenAI embedding model.
- Uploads documents + vectors to Azure AI Search index.
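The ingestion steps above can be sketched roughly as follows. `chunk_text` and `build_search_doc` are illustrative helpers, not the actual functions in `ingest.py`, and the field names (`id`, `content`, `content_vector`, `source`) are assumptions that must match your index schema:

```python
import base64

def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split extracted document text into overlapping chunks for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks

def build_search_doc(blob_name: str, chunk_index: int, chunk: str,
                     vector: list[float]) -> dict:
    """Shape one chunk as an Azure AI Search document (field names assumed)."""
    # Index keys must be URL-safe, so the blob name + index is base64-encoded.
    key = base64.urlsafe_b64encode(f"{blob_name}-{chunk_index}".encode()).decode()
    return {
        "id": key,
        "content": chunk,
        "content_vector": vector,  # dimensionality must match the index definition
        "source": blob_name,
    }

# The surrounding pipeline (hedged -- the real client setup lives in ingest.py):
#   for blob in container_client.list_blobs():
#       text = extract_docx_text(blob)                # e.g. via python-docx
#       for i, chunk in enumerate(chunk_text(text)):
#           vector = openai_client.embeddings.create(
#               model=embedding_model, input=chunk).data[0].embedding
#           search_client.upload_documents([build_search_doc(blob.name, i, chunk, vector)])
```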
==== Query Handling (app.py) ====
- Accepts a query via POST `/query`.
- Embeds the query, runs vector search on the index.
- Retrieves the most relevant documents.
- Builds a prompt with context and sends to OpenAI chat completion model.
- Returns the answer and source document names.
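A minimal sketch of the prompt-assembly step described above (`build_prompt` and the exact message wording are illustrative, not the code in `app.py`; the `content`/`source` fields are assumed to match the index schema used at ingestion time):

```python
def build_prompt(question: str, retrieved: list[dict]) -> list[dict]:
    """Assemble chat messages that ground the model in retrieved chunks."""
    context = "\n\n".join(
        f"[{doc['source']}]\n{doc['content']}" for doc in retrieved
    )
    system = (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# The answer would then come from the chat completion model (hedged):
#   response = openai_client.chat.completions.create(
#       model="gpt-4o", messages=build_prompt(question, results))
#   answer = response.choices[0].message.content
```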
===== Sample Request =====
  POST /query
  {
    "question": "What are the limitations of AFFIRM?"
  }
===== Sample Response =====
  {
    "answer": "AFFIRM lacks automated compliance checks...",
    "sources": ["AFFIRM_Notes_-_Max.docx", "Affirm_Bullet_Points.docx"]
  }
===== Additional Notes =====
* This app assumes `.docx` files are uploaded to the Blob container.
* Token and prompt limits apply when sending context to the OpenAI model.
* Required Azure resources for this project:
  - Azure AI Search with a search index and a vector search configuration
  - Azure OpenAI resource with:
    - a text embedding model (`text-embedding-ada-002` used here)
    - a chat completion model (`gpt-4o` used here)
  - Azure Blob Storage container for document storage
  - Azure Key Vault (optional for dev, but recommended)
  - Azure App Service (if not running locally)
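The token-limit note above can be handled with a simple budget check before the context is sent to the model. This sketch uses a rough characters-per-token heuristic (~4 characters per token for English) rather than a real tokenizer such as `tiktoken`; `fit_to_budget` is an illustrative helper, not part of the actual codebase:

```python
def fit_to_budget(chunks: list[str], max_tokens: int = 6000,
                  chars_per_token: int = 4) -> list[str]:
    """Keep retrieved chunks, in rank order, until the rough token budget is spent."""
    budget = max_tokens * chars_per_token  # approximate character budget
    kept, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > budget:
            break
        kept.append(chunk)
        used += len(chunk)
    return kept
```

Because the most relevant chunks come first in the ranked search results, truncating from the tail preserves the highest-value context.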