User Tools

Site Tools


wiki:ai:ai_agent_production_architecture_with_rag_and_mcp
Draft Newest approved | Approver: @ai-us-principals

This is an old revision of the document!


Phase 1 – Application Setup and Deployment

This phase established the foundation for the AI Platform by deploying three independent web applications into Azure App Service using GitHub Actions. Each app represents a distinct layer of the AI orchestration pipeline (Agent Gateway → MCP Server → RAG API).

Objectives

  • Deploy all backend components as independent, scalable web apps.
  • Enable CI/CD using GitHub Actions for automated deployment.
  • Prepare each service for integration with Azure Key Vault, Azure AI Search, and Azure OpenAI.
  • Ensure consistent configuration and environment parity across apps.

Application Architecture Overview

Component Purpose Typical Endpoint Description
Agent GatewayEntry point / orchestrator /chat, /api/messages, /healthzReceives user messages and orchestrates downstream services (MCP + RAG).
MCP Server Service layer exposing tools /invoke, /healthz Exposes tool-like APIs (vision, safety, sentiment, OpenAI chat, etc.).
RAG Web App Retrieval-Augmented Generation API/api/answer, /healthz Handles document search, retrieval, and LLM synthesis.

Each app runs independently and communicates securely over HTTPS. The architecture is designed for horizontal scalability and composability.

App Service Creation (Azure Portal)

For each app (RAG, MCP, Agent-Gateway):

  1. Navigate: Azure Portal → App Services → “+ Create”.
  2. Basics:
    • Subscription: active Azure subscription.
    • Resource Group: ai-platform-rg (shared RG for this solution).
    • App names:
      • agent-gateway-app
      • mcp-server-app
      • rag-app
    • Runtime stack: Node 20 (LTS)
    • Region: preferred region (e.g., East US)
  3. Deployment: enable GitHub Actions (CI/CD).
  4. Networking: public access for Agent-Gateway only; others locked down later.
  5. Monitoring: enable Application Insights (same region).

Click Create → Review → Deploy.
Result — Three independent web apps ready for deployment.

Repository Structure

/apps

├─ /agent-gateway

│ ├─ src/

│ ├─ package.json

│ ├─ index.js

│ └─ orchestrator.js

├─ /mcp-server

│ ├─ src/

│ ├─ package.json

│ └─ http.js

└─ /rag-app

├─ src/

├─ package.json

└─ rag.py / server.js

.github/

└─ workflows/

├─ deploy-mcp.yml

├─ deploy-agent-gateway.yml

└─ deploy-rag.yml

Each app contains its own build and runtime configuration.
The .github/workflows folder defines independent CI/CD pipelines.

CI/CD Setup Using GitHub Actions

Each web app uses a lightweight ZipDeploy workflow for Azure App Service:

name: Deploy · MCP Server (ZipDeploy)

on:

push:

paths:

- 'apps/mcp-server/**'

workflow_dispatch:

jobs:

zip-and-deploy:

runs-on: ubuntu-latest

steps:

- name: Checkout

uses: actions/checkout@v4

- name: Create zip

run: |

cd apps/mcp-server

zip -r ../../mcp-server.zip . -x “.git/*” “.github/*”

- name: Deploy to Azure Web App

uses: azure/webapps-deploy@v3

with:

app-name: mcp-server-app

publish-profile: $secrets.azure_publish_profile_mcp

package: mcp-server.zip

Each app has its own publish-profile secret (AZURE_PUBLISH_PROFILE_MCP, …_RAG, …_GATEWAY) stored in GitHub → Settings → Secrets and Variables → Actions.
Result — pushing to the respective folder triggers automatic deployment.

Environment Configuration

Before first deployment, each web app’s configuration was set under:
Azure Portal → App Service → Configuration → Application Settings

Example (Agent-Gateway):

PORT=8080

BOT_APP_ID=<Bot registration ID>

BOT_APP_PASSWORD=<Bot registration secret>

KEY_VAULT_URL=https://<your-keyvault>.vault.azure.net

AOAI_ENDPOINT=https://<your-aoai-resource>.openai.azure.com

AOAI_API_VERSION=2024-12-01-preview

Result — All apps share a consistent configuration model with environment variables mirrored locally via .env.example.

Local Testing

# From app folder

npm install

npm run build

npm start

# Health check

curl http://localhost:8080/healthz

Ensured each service worked independently before deployment.

Post-Deployment Verification

After each successful GitHub Action run:

curl -I https://mcp-server-app.azurewebsites.net/healthz

curl -I https://agent-gateway-app.azurewebsites.net/healthz

curl -I https://rag-app.azurewebsites.net/healthz

Expected response:

HTTP/1.1 200 OK

Content-Type: application/json

Result — All three web apps responding successfully.

Phase 2 – Backend Configuration & Environment Setup

In this phase you’ll wire up configuration for all three apps (Agent Gateway, MCP Server, RAG Web App), move secrets to Key Vault, and establish clean service-to-service contracts. Everything below uses generic names so you can reuse it anywhere.

1) Objectives

  • Centralize secrets in Azure Key Vault and consume them via Managed Identity or Key Vault references.
  • Standardize environment variables across apps.
  • Configure CORS, health endpoints, and base URLs to connect Gateway ↔ MCP ↔ RAG.
  • Verify with smoke tests and add guardrails to catch misconfiguration early.

2) Prerequisites

  • Three App Services exist: agent-gateway-app, mcp-server-app, rag-app.
  • One Key Vault: ai-platform-kv.
  • (Optional but recommended) Application Insights instances already linked (Phase 1).

3) Environment Variable Model (What each app needs)

Agent Gateway (entry point / orchestrator)

  • Service endpoints
    • MCP_BASE_URL=https://mcp-server-app.azurewebsites.net
    • RAG_API_URL=https://rag-app.azurewebsites.net/api/answer
  • CORS
    • CORS_ALLOWED_ORIGINS=https://your-frontend.example.com,https://other-site.example.com
  • Azure OpenAI
    • AOAI_ENDPOINT=https://<your-aoai>.openai.azure.com
    • AOAI_API_VERSION=2024-12-01-preview
    • AOAI_CHAT_DEPLOYMENT=<chat-deployment-name>
    • AOAI_THERAPIST_DEPLOYMENT=<therapist-deployment-name>
    • AOAI_APIKEY_SECRET_NAME=aoai-api-key (Key Vault secret name)
  • Vision / Text Analytics / Safety
    • VISION_ENDPOINT=https://<your-vision>.cognitiveservices.azure.com
    • VISION_KEY_SECRET_NAME=vision-key
    • TA_ENDPOINT=https://<your-ta>.cognitiveservices.azure.com
    • TA_KEY_SECRET_NAME=text-analytics-key
    • SAFETY_ENDPOINT=https://<your-safety>.cognitiveservices.azure.com
    • SAFETY_KEY_SECRET_NAME=content-safety-key
    • SAFETY_SELFHARM_THRESHOLD=2 (example)
  • Key Vault URL (if your code pulls at runtime)
    • KEY_VAULT_URL=https://ai-platform-kv.vault.azure.net
  • Telemetry (optional)
    • APPLICATIONINSIGHTS_CONNECTION_STRING=<from App Insights>

MCP Server (tool backend)

  • Port / basics
    • PORT=8080
  • Provider endpoints & policy hints (optional)
    • KEY_VAULT_URL=https://ai-platform-kv.vault.azure.net
    • POLICY_OPENAI=<endpoint|deployment,…>
    • POLICY_VISION=https://<vision>.cognitiveservices.azure.com
    • POLICY_TA=https://<ta>.cognitiveservices.azure.com
    • POLICY_SAFETY=https://<safety>.cognitiveservices.azure.com
    • POLICY_SEARCH=https://<search>.search.windows.net|<index-name>
    • POLICY_STORAGE=https://<storage>.blob.core.windows.net|<container>

RAG Web App

  • Retrieval
    • SEARCH_SERVICE_URL=https://<search>.search.windows.net
    • SEARCH_INDEX_NAME=<index-name>
    • SEARCH_APIKEY_SECRET_NAME=ai-search-api-key (Key Vault secret name)
  • Storage (if used)
    • STORAGE_ACCOUNT_URL=https://<storage>.blob.core.windows.net
    • STORAGE_CONTAINER=<container>
  • OpenAI for synthesis (if applicable)
    • AOAI_ENDPOINT=https://<your-aoai>.openai.azure.com
    • AOAI_API_VERSION=2024-12-01-preview
    • AOAI_CHAT_DEPLOYMENT=<chat-deployment-name>
    • AOAI_APIKEY_SECRET_NAME=aoai-api-key
  • Health
    • PORT=8080 (or framework default)

Note: If your code reads secrets at runtime from Key Vault (SDK), you need KEY_VAULT_URL and Managed Identity.
If you prefer App Settings Key Vault references, you don’t need runtime SDK calls (see Section 5).

4) Configure App Settings (Azure Portal)

For each app:

  1. Azure Portal → App Services → {app} → Configuration → Application settings → + New application setting.
  2. Add the non-secret variables from Section 3.
  3. SaveRestart app.

Use the exact URLs your services expose. Do not include /chat or /healthz in base URLs; those are endpoint paths used by your frontend or health checks.

5) Move Secrets to Key Vault

A) Enable Managed Identity on each App

  • App → Identity → System assigned → On → Save (note the Object ID).

B) Grant the App access to Key Vault

  • Key Vault → Access control (IAM) → Add role assignment:
    • Role: Key Vault Secrets User
    • Assign to: the app’s system-assigned managed identity
    • Save.

C) Create secrets in Key Vault

  • Key Vault → Secrets → + Generate/Import:
    • aoai-api-key
    • vision-key
    • text-analytics-key
    • content-safety-key
    • ai-search-api-key
    • (any other provider keys)

D) Reference secrets from App Settings (no code change)

In each app → Configuration → Application settings, set values like:

AOAI_API_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/aoai-api-key/<version>)

VISION_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/vision-key/<version>)

TA_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/text-analytics-key/<version>)

SAFETY_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/content-safety-key/<version>)

SEARCH_API_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/ai-search-api-key/<version>)

  • You can use either a dedicated *_KEY variable (above) or keep the *_SECRET_NAME pattern if your code looks up secrets at runtime.
  • SaveRestart.

Tip: Prefer Key Vault references when possible (fewer moving parts). Use runtime SDK only when you must fetch secrets dynamically.

6) CORS (Agent Gateway)

  • App → Configuration → Application settings:
    • CORS_ALLOWED_ORIGINS=https://your-frontend.example.com
  • In code (Express CORS):
    • Ensure you read and enforce this list (origin allow-list).
  • If using the portal CORS blade (older UI), keep settings consistent with app settings.

7) Health Endpoints & Readiness

Ensure each app exposes a JSON health endpoint:

  • Agent Gateway: GET /healthz → { “ok”: true }
  • MCP Server: GET /healthz → { “ok”: true }
  • RAG App: GET /healthz → { “ok”: true }

In the Portal: App → Monitoring → Health check → set the path to /healthz and Enable.
This lets App Service remove unhealthy instances from rotation.

8) Wire Up Service Contracts

Agent Gateway → MCP

  • Uses: MCP_BASE_URL
  • Contract (example):
    • POST {MCP_BASE_URL}/invoke with body { method, params }

Agent Gateway → RAG

  • Uses: RAG_API_URL (full path to /api/answer)
  • Contract (example):
    • GET/POST {RAG_API_URL}?q=… or JSON { query, topK, temperature }
    • Response: { answer, sources/sampleRetrieved }

Keep these contracts versioned in code (v1 query shape) so you can evolve independently.

9) Smoke Tests (CLI)

Replace hostnames as appropriate.

# Health

curl -s -w “\n%{http_code}\n” https://agent-gateway-app.azurewebsites.net/healthz

curl -s -w “\n%{http_code}\n” https://mcp-server-app.azurewebsites.net/healthz

curl -s -w “\n%{http_code}\n” https://rag-app.azurewebsites.net/healthz

# Agent Gateway chat (no attachments)

curl -s -X POST https://agent-gateway-app.azurewebsites.net/chat \

-H “content-type: application/json” \

-d '{“message”:“hello from curl”,“locale”:“en”}' | jq .

# Direct RAG query (POST shape)

curl -s -X POST https://rag-app.azurewebsites.net/api/answer \

-H “content-type: application/json” \

-d '{“query”:“what is our refund policy?”,“topK”:3,“temperature”:0.2}' | jq .

Expected:

  • Health endpoints return 200 with a small JSON body.
  • /chat returns { text, citations?, sentiment?, routed }.
  • /api/answer returns { answer, … }.

Phase 3 – Security Hardening (Network & Identity)

This phase locks down the perimeter, enforces private-only service-to-service traffic, and moves all sensitive access to identity-based auth. The guidance below uses generic names so you can reuse it anywhere.

1) Objectives

  • Make MCP Server and RAG Web App private-only (no direct public access).
  • Keep Agent Gateway as the public entry point—but restrict who can hit it.
  • Ensure name resolution works privately via Private DNS Zones.
  • Enforce Managed Identity + Key Vault for secrets (no plaintext keys).
  • Lock down Kudu/SCM, disable legacy endpoints, and enforce modern TLS.

2) Prerequisites

  • App Services: agent-gateway-app, mcp-server-app, rag-app
  • A Virtual Network: ai-platform-vnet with at least two subnets:
    • apps-integration-snet (for VNet Integration)
    • private-endpoints-snet (for Private Endpoints)
  • Private DNS zone: privatelink.azurewebsites.net linked to ai-platform-vnet
  • Key Vault: ai-platform-kv

Keep VNet Integration subnets and Private Endpoint subnets separate.

3) Private Endpoints (Inbound Isolation)

Goal: mcp-server-app and rag-app are reachable only via private IPs from your VNet.

Portal steps (per private app):

  1. App → NetworkingPrivate endpoint connections+ Add
  2. Choose ai-platform-vnet and private-endpoints-snet.
  3. Create → then Approve the connection if pending.

Validate:

  • App → Networking → Private endpoints should show Approved + a 10.x private IP.

4) Private DNS Zone (Name Resolution)

Goal: *.azurewebsites.net for your apps resolves to private IP inside VNet.

Portal steps:

  1. Private DNS zones → open privatelink.azurewebsites.net.
  2. Virtual network links → ensure ai-platform-vnet is linked.
  3. Record sets → confirm A-records exist:
    • mcp-server-app → 10.x.x.x
    • rag-app → 10.x.x.x
    • (Optional) agent-gateway-app if you created a PE for it too

Validate from Agent Gateway console (SSH / Kudu):

nslookup mcp-server-app.azurewebsites.net

curl -I https://mcp-server-app.azurewebsites.net/healthz

nslookup rag-app.azurewebsites.net

curl -I https://rag-app.azurewebsites.net/healthz

Expect resolution to 10.x addresses and HTTP 200/OK (or your expected status).

5) Access Restrictions (Deny Public, Allow VNet)

Goal: Only allowed subnets (e.g., Agent Gateway’s VNet Integration subnet) can call private apps.

Portal steps (for mcp-server-app and rag-app):

  1. App → NetworkingAccess restrictionsConfigure Access Restrictions.
  2. Main site tab:
    • + Add ruleAllow
      • Name: Allow-AppsVNet
      • Type: Virtual Network → choose ai-platform-vnet & apps-integration-snet
      • Priority: 100
    • + Add ruleDeny
      • Name: Deny-Internet
      • Type: Service Tag → Internet
      • Priority: 200
  3. (Optional but recommended) Repeat the same under the SCM site tab to lock down Kudu.

Validate:

  • From your laptop (public) a direct hit to private apps should return 403/Forbidden or time out.
  • From Agent Gateway container: requests should succeed.

6) VNet Integration (Outbound Control)

Goal: Ensure outbound from apps traverses your VNet → you can apply NSGs/UDRs, and reach private services.

Portal steps (each app):

  1. App → NetworkingVNet Integration+ Add VNet
  2. Select ai-platform-vnet and apps-integration-snet.
  3. Wait until status = Connected.

NSG recommendations (attach to apps-integration-snet):

  • Allow outbound to:
    • Key Vault: AzureKeyVault service tag
    • Cognitive Services: CognitiveServicesManagement (and/or region FQDNs)
    • Storage: Storage service tag or specific accounts
    • Azure OpenAI endpoint FQDN (if not covered by service tag)
  • Deny outbound: final Deny All to restrict unexpected egress.

If using Private Endpoints for Key Vault/Cognitive/Search/Storage, allow to their private IPs instead.

7) Agent Gateway Ingress Strategy

Goal: Keep agent-gateway-app public, but limit who can hit it.

Options (choose one):

  • Front Door / WAF (recommended): put Azure Front Door with WAF in front; only allow Front Door’s backend health probe/IPs in Access Restrictions.
  • IP allow-list: If you don’t have WAF yet, allow a short list of source IPs (your office/VPN) and deny the Internet below them.

Portal steps:

  1. Agent Gateway → Networking → Access Restrictions
  2. Allow rules for Front Door service tag or specific IPs
  3. Deny rule for Internet below
  4. Save

8) TLS, SCM & Legacy Protocols

  • HTTPS Only: App → TLS/SSL settingsHTTPS Only: On
  • Minimum TLS: set 1.2 (or 1.3 if available)
  • Disable FTP: App → Deployment Center → FTPS stateFTPS Only or Disabled
  • Lock down Kudu: mirror Access Restrictions in SCM site tab

9) Managed Identity + Key Vault (Identity, Not Keys)

Goal: Apps authenticate to resources using system-assigned managed identities and fetch secrets via Key Vault.

Portal steps (each app):

  1. App → Identity → System assigned → On → Save
  2. Key Vault → Access Control (IAM) → Add role assignment
    • Role: Key Vault Secrets User
    • Assign to: app’s managed identity
  3. App → Configuration → Application settings
    • Use Key Vault references for secrets:
    • AOAI_API_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/aoai-api-key/<version>)
    • SEARCH_API_KEY = @Microsoft.KeyVault(SecretUri=https://ai-platform-kv.vault.azure.net/secrets/ai-search-api-key/<version>)
  4. Remove any plaintext keys from App Settings after references are active.

If your code uses the SDK to fetch secrets dynamically, also set KEY_VAULT_URL and ensure outbound to KV is allowed.

10) CORS (Browser Clients)

Goal: Limit cross-origin browser calls to approved frontends.

Portal steps (Agent Gateway):

  • App → Configuration → Application settings
    • CORS_ALLOWED_ORIGINS=https://frontend.example.com,https://staging-frontend.example.com
  • Ensure Express cors() middleware reads this env and enforces it.
  • Keep Azure Portal CORS blade empty or in sync—avoid conflicts.

11) Validation Checklist

From Agent Gateway container (SSH):

# Private name resolution & reachability

nslookup mcp-server-app.azurewebsites.net

nslookup rag-app.azurewebsites.net

curl -s -w “\n%{http_code}\n” https://mcp-server-app.azurewebsites.net/healthz

curl -s -w “\n%{http_code}\n” https://rag-app.azurewebsites.net/healthz

From your laptop (public):

# Should be allowed only to Agent Gateway

curl -I https://agent-gateway-app.azurewebsites.net/healthz

# Direct hits to private apps should fail

curl -I https://mcp-server-app.azurewebsites.net/healthz # expect 403/timeout

curl -I https://rag-app.azurewebsites.net/healthz # expect 403/timeout

Key Vault access (app logs):

  • Confirm no “permission denied” messages fetching secrets.
  • Confirm your app can read Key Vault references (values resolve at runtime).

Phase 4 – Monitoring, Logging & Alerts

This phase focuses on making the AI Platform production-ready through observability. You’ll connect all three web apps (Agent Gateway, MCP Server, RAG App) to Application Insights, enable structured logging, and add alerting and dashboards for proactive monitoring.

1) Objectives

  • Centralize logs, traces, and performance metrics in Application Insights
  • Enable distributed tracing between services (Gateway → MCP → RAG)
  • Collect key telemetry (latency, errors, throughput)
  • Configure Azure Alerts and Dashboards for real-time visibility
  • Enforce diagnostic settings and retention policies

2) Application Insights Integration

A) Enable per-app Insights

  1. Portal → App Service → Monitoring → Application Insights
  2. Click Turn On Application Insights
  3. Choose:
    • Existing resource: ai-platform-appinsights
    • Sampling: Adaptive (default)
    • Collection level: Recommended (auto instrument logs & dependencies)

Repeat for:

  • agent-gateway-app
  • mcp-server-app
  • rag-app

B) Verify connection

After enabling:

curl -I https://agent-gateway-app.azurewebsites.net/healthz

→ In the Insights blade, you should soon see requests appearing under “Requests”.

3) Enable OpenTelemetry (if using custom logging)

For more control and cross-service correlation, configure Azure Monitor OpenTelemetry Exporter.

Example (Node.js):

import { AzureMonitorTraceExporter } from “@azure/monitor-opentelemetry-exporter”;

import { NodeSDK } from “@opentelemetry/sdk-node”;

import { getNodeAutoInstrumentations } from “@opentelemetry/auto-instrumentations-node”;

const exporter = new AzureMonitorTraceExporter({

connectionString: process.env.APPLICATIONINSIGHTS_CONNECTION_STRING,

});

const sdk = new NodeSDK({

traceExporter: exporter,

instrumentations: [getNodeAutoInstrumentations()],

});

sdk.start();

  • Add this snippet at app startup (before Express routes).
  • Ensure APPLICATIONINSIGHTS_CONNECTION_STRING is defined in environment variables.
  • Repeat in each backend service where applicable.

Result: Distributed traces now flow between Agent Gateway → MCP → RAG automatically.

4) Logging Strategy

A) Use structured logging

Replace raw console.log() with structured JSON:

console.info(JSON.stringify({

level: “info”,

service: “agent-gateway”,

message: “Chat request processed”,

latency_ms: 120,

routed: “rag”,

}));

B) Application Insights automatic collection

  • Requests, Dependencies, Exceptions, and Console logs are auto-collected.
  • Each log automatically includes cloud_RoleName (your App name) for filtering.

C) Enable diagnostic logs to storage

Portal → App Service → Monitoring → Diagnostic settings+ Add diagnostic setting

  • Check: AppServiceConsoleLogs, AppServiceHTTPLogs, AppServiceAppLogs
  • Destination: Log Analytics workspace or Storage account
  • Retention: 30–90 days

This ensures long-term retention beyond Application Insights default (90 days).

5) Custom Metrics

Emit custom metrics (latency, errors, token usage) to Insights:

import appInsights from “applicationinsights”;

const client = appInsights.defaultClient;

client.trackMetric({ name: “chatLatencyMs”, value: latency });

client.trackMetric({ name: “ragHitCount”, value: hits.length });

client.trackException({ exception: err });

Or, if using OpenTelemetry:

const meter = sdk.getMeterProvider().getMeter(“agent-gateway”);

const counter = meter.createCounter(“rag_requests”);

counter.add(1, { route: “chat”, success: true });

6) Alerts

Portal Steps:

  1. Navigate to Application Insights → Alerts → + Create → Alert rule
  2. Scope: select all three App Insights resources
  3. Condition examples:
    • Failed requests > 1 in 5 minutes
    • Average server response time > 2000 ms
    • Availability < 99%
  4. Action group: create one if none exists
    • Notifications: Email, Teams, or PagerDuty webhook
    • Name: ai-platform-alerts
  5. Severity:
    • 0 – Critical (app down)
    • 1 – Error rate > 10 %
    • 2 – Slow response (> 2 s)
  6. Enable alert rule.

You’ll now receive alerts for downtime or performance degradation.

7) Dashboards

Portal → Dashboard → + New → Add Tiles → Application Insights:

Recommended widgets:

  • Availability (from Health checks)
  • Server response time
  • Request count / failure count
  • Dependency calls (shows MCP → RAG hops)
  • Exceptions by service
  • Custom metrics (e.g., latency, token usage)

Save as AI Platform – Observability Dashboard.

8) Availability Tests

  1. Application Insights → Availability → + Add Standard test
  2. URL:
    • https://agent-gateway-app.azurewebsites.net/healthz
  3. Locations: 3+ regions (e.g., East US, West Europe, Central US)
  4. Test frequency: every 5 minutes
  5. Failure threshold: 2 / 3 checks
  6. Action group: ai-platform-alerts

Detects public outages even if the Gateway is otherwise healthy behind a load balancer.

9) Defender for App Service (Optional)

Enable Microsoft Defender for App Service:

  1. Azure Portal → Microsoft Defender for CloudEnvironment settings
  2. Turn On → App Service Plan
  3. Defender scans for:
    • Outdated packages
    • Insecure configuration
    • Abnormal behavior (e.g., crypto-mining)
    • Known exploit signatures

Integrates with Security Center for ongoing posture management.

Phase 5 – Frontend Creation & Integration

This phase delivers a lightweight, test-friendly frontend that can exercise the full pipeline (Agent Gateway ↔ MCP ↔ RAG), provides quick health visibility, and is safe to ship publicly. It assumes Phases 1–4 are complete.

1) Objectives

  • Build a small Vite + React SPA for health checks and unified chat.
  • Wire the UI to Agent Gateway (primary), with optional direct calls to MCP/RAG for diagnostics.
  • Configure CORS, timeouts/retries, and error handling.
  • Ship with zero secrets in the browser.
  • Deploy via Azure Static Web Apps (recommended) or Static Website on Storage.

2) Frontend Repo Layout

/frontend

├─ src/

│ ├─ App.tsx # UI (health cards + chat)

│ ├─ main.tsx # React root

│ └─ index.css

├─ index.html

├─ package.json

├─ tsconfig.json

├─ vite.config.ts

└─ .env.local.example # VITE_* endpoints (no secrets)

If you prefer plain JS, the same structure works—only types differ.

3) Environment Variables (Client-side)

Create /frontend/.env.local (not committed) using public base URLs only—never secrets.

# Points to public entry point (Front Door or direct App Service)

VITE_GATEWAY_BASE=https://agent-gateway-app.azurewebsites.net

# Optional (diagnostics-only; keep disabled in prod):

# VITE_MCP_BASE=https://mcp-server-app.azurewebsites.net

# VITE_RAG_BASE=https://rag-app.azurewebsites.net

Why only the gateway? MCP and RAG are private in Phase 3. Keep direct calls disabled outside of troubleshooting.

4) CORS & Networking

  • Agent Gateway must allow your frontend origin(s):

CORS_ALLOWED_ORIGINS=https://your-frontend.domain

  • If using Azure Front Door in front of the gateway, use the Front Door URL in VITE_GATEWAY_BASE and allow that origin in CORS too.
  • Keep MCP/RAG private; the frontend should not call them directly in production.

5) Minimal UI Capabilities

A) Health Cards

  • Buttons to GET /healthz on Gateway (and optionally MCP/RAG).
  • Shows JSON + HTTP status.

B) Unified Chat

  • POST Gateway /chat with:
  • { “message”: “text”, “locale”: “en”, “attachments”: [{ “base64”: “…”, “contentType”: “image/png” }] }
  • Display text, routed, citations[], and optional sentiment.

C) Error Handling & Timeouts

Fetch wrapper:

async function fetchJson(url: string, init?: RequestInit, timeoutMs = 30000) {

const ctl = new AbortController();

const t = setTimeout1)); }

}

throw err;

}

D) No Secrets in Browser

  • Never put API keys in VITE_*.
  • Use Gateway to access protected services.
  • For auth needs later, integrate AAD (MSAL) against the Gateway—not direct provider keys.

6) Production Build & Local Dev

# local dev

cd frontend

npm install

npm run dev # default: http://localhost:5173

# production build

npm run build # outputs dist/

npm run preview

Smoke tests:

  • Health: curl -I “$VITE_GATEWAY_BASE/healthz”
  • Chat: curl -s -X POST “$VITE_GATEWAY_BASE/chat” -H “content-type: application/json” -d '{“message”:“hello”}'

7) Deployment Options

Option A — Azure Static Web Apps (Recommended)

  • Simple, global distribution, automatic CI/CD.

Portal:

  1. Create resource: Static Web AppFrom GitHub.
  2. App location: /frontend
  3. Build command: npm run build
  4. Output location: dist

GitHub Action (generated) will resemble:

name: Build & Deploy · Frontend

on:

push:

paths: ['frontend/**']

workflow_dispatch:

jobs:

build_and_deploy:

runs-on: ubuntu-latest

steps:

- uses: actions/checkout@v4

- uses: actions/setup-node@v4

with: { node-version: 20 }

- run: |

cd frontend

npm ci

npm run build

- name: Upload artifact

uses: actions/upload-pages-artifact@v3

with:

path: frontend/dist

- name: Deploy to Static Web Apps

uses: Azure/static-web-apps-deploy@v1

with:

azure_static_web_apps_api_token: $secrets.azure_static_web_apps_api_token

action: upload

app_location: frontend

output_location: dist

Add your SWA domain (and custom domain) to CORS_ALLOWED_ORIGINS.

Option B — Static Website on Azure Storage

  • Cheap & simple; fronted by CDN optional.

Portal:

  1. Storage Account → Static websiteEnable → Index: index.html, Error: index.html.
  2. Upload dist/ to $web container.
  3. Set CORS_ALLOWED_ORIGINS to the storage static site domain (and your custom domain if configured).
  4. (Optional) Put Front Door/ CDN in front for HTTPS/custom domain.

8) Frontend Telemetry (Optional)

If you want page load/error metrics:

  • Add Application Insights JS SDK (or Azure Monitor OpenTelemetry Web).
  • Configure sampling and PII redaction.
  • Send only lightweight events (avoid sending full prompts/answers).

Example (App Insights JS):

import { ApplicationInsights } from '@microsoft/applicationinsights-web';

const ai = new ApplicationInsights({ config: {

connectionString: import.meta.env.VITE_APPINSIGHTS_CONNECTION_STRING,

enableAutoRouteTracking: true,

disableExceptionTracking: false,

samplingPercentage: 50,

}});

ai.loadAppInsights();

// Usage

ai.trackEvent({ name: 'chat_submit' });

ai.trackMetric({ name: 'ui_latency_ms', average: 120 });

Add VITE_APPINSIGHTS_CONNECTION_STRING only if you want client telemetry; otherwise omit.

1)
) ⇒ ctl.abort(), timeoutMs); try { const res = await fetch(url, { …init, signal: ctl.signal }); const body = await res.text(); if (!res.ok) throw new Error(`HTTP ${res.status} ${res.statusText} — ${body.slice(0, 600)}`); return body ? JSON.parse(body) : {}; } finally { clearTimeout(t); } } Retry primitive (idempotent GETs): async function withRetry<T>(fn: () ⇒ Promise<T>, attempts = 2) { let err: unknown; for (let i = 0; i ⇐ attempts; i++) { try { return await fn(); } catch (e) { err = e; await new Promise(r ⇒ setTimeout(r, 500*(i+1
wiki/ai/ai_agent_production_architecture_with_rag_and_mcp.1762531983.txt.gz · Last modified: by bgourley