Azure API Monitoring

Monitoring Azure AI APIs is critical for performance, usage tracking, quota management, and troubleshooting. Azure provides multiple built-in and extensible options to monitor its AI services (like Azure OpenAI, Cognitive Services, and Azure Machine Learning). Here's a breakdown of the available monitoring options:


🔍 1. Azure Monitor (Primary Platform Monitoring Tool)

Azure Monitor provides a centralized platform for collecting, analyzing, and acting on telemetry from Azure resources.

Key Features:

  • Metrics: Track request volume, latency, and error rates.
  • Logs (via Log Analytics): Ingest and query detailed activity and diagnostics logs.
  • Alerts: Set up rules to get notified on API failures, high latency, quota breaches, etc.
  • Dashboards: Build visual dashboards to monitor trends.

📊 2. Metrics for Azure AI Services

Each AI service exposes its own set of metrics in Azure Monitor:

Common Metrics:

Metric | Description
Total Calls | Total number of API calls
Successful Calls | Count of successful (HTTP 2xx) responses
Failed Calls | Count of 4xx/5xx errors
Latency | Response time percentiles (P50, P90, P95, etc.)
Throttled Calls | Requests rejected because a rate or quota limit was exceeded

You can find these under:

Azure Portal → Monitor → Metrics → Select your AI resource
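
The same metrics can also be pulled programmatically. The sketch below only builds the request URL for the Azure Monitor metrics REST endpoint; it does not send the request or handle authentication. The resource ID is a placeholder, and the metric IDs (`TotalCalls`, `BlockedCalls`) and API version are assumptions that should be checked against the metric definitions of your own resource.

```python
from urllib.parse import quote, urlencode

ARM_ENDPOINT = "https://management.azure.com"


def build_metrics_url(resource_id, metric_names, timespan,
                      interval="PT5M", aggregation="Total",
                      api_version="2018-01-01"):
    """Build a GET URL for listing Azure Monitor metrics of one resource.

    resource_id  -- full ARM resource ID, starting with '/subscriptions/...'
    metric_names -- list of metric IDs (assumed names, verify per service)
    timespan     -- ISO 8601 'start/end' string
    """
    query = urlencode({
        "api-version": api_version,
        "metricnames": ",".join(metric_names),
        "timespan": timespan,
        "interval": interval,
        "aggregation": aggregation,
    })
    # quote() keeps '/' intact, so the resource path stays readable.
    return (f"{ARM_ENDPOINT}{quote(resource_id)}"
            f"/providers/Microsoft.Insights/metrics?{query}")


if __name__ == "__main__":
    # Placeholder resource ID, for illustration only.
    demo_id = ("/subscriptions/0000/resourceGroups/rg/providers/"
               "Microsoft.CognitiveServices/accounts/demo")
    print(build_metrics_url(demo_id, ["TotalCalls", "BlockedCalls"],
                            "2024-01-01T00:00:00Z/2024-01-01T01:00:00Z"))
```

The resulting URL would be called with a bearer token for the Azure Resource Manager scope; token acquisition is deliberately left out here.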

📜 3. Diagnostic Settings

You can configure Diagnostic Settings on each Azure AI resource to send logs and metrics to:

  • Log Analytics Workspace
  • Event Hubs
  • Azure Storage

Logs may include:

  • Request logs (time, endpoint, status)
  • Quota usage
  • Custom logs depending on the service

Enable via:

Resource → Monitoring → Diagnostic settings
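
Diagnostic settings can also be created through the REST API, an ARM/Bicep template, or the CLI, all of which take a similar properties document. As a rough sketch, the helper below assembles such a document for a Log Analytics destination. The log category names (`Audit`, `RequestResponse`) are assumptions; the categories actually available differ per service and should be listed from the resource before use.

```python
import json


def build_diagnostic_setting(workspace_id,
                             log_categories=("Audit", "RequestResponse"),
                             include_metrics=True):
    """Return a diagnostic-settings body that routes logs to Log Analytics.

    workspace_id   -- ARM resource ID of the Log Analytics workspace
    log_categories -- assumed category names; check what the service offers
    """
    properties = {
        "workspaceId": workspace_id,
        "logs": [{"category": c, "enabled": True} for c in log_categories],
    }
    if include_metrics:
        # 'AllMetrics' forwards platform metrics alongside the logs.
        properties["metrics"] = [{"category": "AllMetrics", "enabled": True}]
    return {"properties": properties}


if __name__ == "__main__":
    # Placeholder workspace ID, for illustration only.
    body = build_diagnostic_setting(
        "/subscriptions/0000/resourceGroups/rg/providers/"
        "Microsoft.OperationalInsights/workspaces/demo-law")
    print(json.dumps(body, indent=2))
```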

📈 4. Application Insights (Optional for Custom Apps)

If you're calling Azure AI APIs from your own application, you can use Application Insights to:

  • Track dependency calls to Azure AI APIs
  • Monitor end-to-end latency and failures
  • View distributed traces and performance bottlenecks
Application Insights integrates well with web apps, Azure Functions, and APIs.

📡 5. Quota and Usage Tracking

For services like Azure OpenAI and Cognitive Services:

  • Azure Quotas API: Track how close you are to usage limits
  • Azure Portal → Usage + Quotas under the AI service

You can set up alerts when usage approaches or exceeds thresholds.
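
To illustrate the threshold idea, here is a minimal sketch that flags quota entries at or above a chosen fraction of their limit. It assumes usage records shaped like `{"name": {"value": ...}, "currentValue": ..., "limit": ...}`; the record names and numbers in the demo are made up, and the 80% default is an arbitrary choice.

```python
def usage_ratio(record):
    """Fraction of the limit consumed; 0.0 when no limit is set."""
    limit = record["limit"]
    return record["currentValue"] / limit if limit else 0.0


def near_quota(usages, threshold=0.8):
    """Return (name, ratio) pairs for entries at or above the threshold."""
    return [
        (r["name"]["value"], round(usage_ratio(r), 3))
        for r in usages
        if usage_ratio(r) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical usage records, for illustration only.
    sample = [
        {"name": {"value": "tokens-per-minute"}, "currentValue": 90, "limit": 100},
        {"name": {"value": "requests-per-minute"}, "currentValue": 10, "limit": 100},
    ]
    for name, ratio in near_quota(sample):
        print(f"{name} is at {ratio:.0%} of its limit")
```

In practice the flagged entries would feed an alert or a dashboard rather than a print statement.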


⚙️ 6. Azure Machine Learning (if used)

If you're deploying models via Azure ML:

  • Azure ML Studio has its own monitoring for:
    • Endpoint request count
    • Response times
    • Status codes
    • Resource utilization (CPU/Memory)
Studio → Endpoints → Monitoring tab

🧪 7. Custom Monitoring via API Wrappers

You can build wrappers or proxies around API calls to:

  • Log request/response time
  • Capture payloads for analysis
  • Push logs to Azure Log Analytics or external tools (e.g., Splunk)
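
One way to sketch such a wrapper is a decorator that times each call and emits a structured log line. The example uses only the Python standard library; the logger name, the `monitored` decorator, and the `fake_completion` function are hypothetical, and forwarding the records to Log Analytics or Splunk would need a suitable logging handler that is not shown.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("ai_api_monitor")  # hypothetical logger name


def monitored(operation):
    """Decorator that logs duration and outcome of the wrapped call as JSON."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Record the exception type, then let the error propagate.
                status = type(exc).__name__
                raise
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logger.info(json.dumps({
                    "operation": operation,
                    "status": status,
                    "duration_ms": round(elapsed_ms, 2),
                }))
        return wrapper
    return decorator


@monitored("fake_completion")
def fake_completion(prompt):
    """Stand-in for a real Azure AI API call."""
    return prompt.upper()


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    fake_completion("hello")
```

The wrapper logs timing and status only. Capturing payloads, as mentioned above, should be weighed against privacy and data-retention requirements before it is added.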

🔐 8. Security and Compliance Monitoring

Use:

  • Microsoft Defender for Cloud: To monitor security posture
  • Azure Policy: To enforce tagging, logging, and retention policies
  • Sentinel (SIEM): For advanced threat detection and analytics

✅ Best Practices

  • Enable Diagnostic Logs and send to Log Analytics
  • Configure Alerts for latency, error rate, and quota limits
  • Use Workbooks for visual monitoring dashboards
  • Use Application Insights for client-side monitoring of applications that call the APIs
wiki/ai/azure-api-monitoring.1748450553.txt.gz · Last modified: by ddehamer