Azure GitHub Actions Pipeline
Demo Objective
Create an automated CI/CD pipeline using GitHub Actions that:
Trains and registers a machine learning model
Deploys it to a managed Azure endpoint
Enables full observability: logging, alerts, and diagnostics
Key Infrastructure Components
Provision these with Bicep or Terraform:
Azure ML Workspace
Azure Key Vault (for secrets like storage keys)
Azure Storage Account (for data input/output)
Azure Container Registry (optional: custom container inference)
Azure Application Insights (for logs and metrics)
Azure Monitor Alert Rules (trigger on failed jobs or degraded endpoints)
Compute cluster (for training, e.g., cpu-cluster)
Azure ML Online Endpoint (for model deployment)
Repo Structure
.github/workflows/
└── train-deploy.yml # GitHub Actions workflow
infra/
└── main.bicep # Infrastructure as code
ml/
├── train.py # Model training script
├── score.py # Inference entry point
├── environment.yml # Conda environment for training/deployment
├── register_model.py # Registers trained model
└── pipeline_job.yml # Azure ML pipeline definition (optional)
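To make the role of score.py concrete, here is a minimal sketch of an inference entry point. It follows the init()/run() convention that Azure ML online deployments use for scoring scripts, and it reads the AZUREML_MODEL_DIR environment variable that Azure ML sets inside the container. Everything else is an assumption for illustration: the _ThresholdModel class stands in for a real trained model, the model.json file name is hypothetical, and the {"data": ...} / {"predictions": ...} payload shapes are one possible contract, not something this page defines.

```python
import json
import os

model = None  # populated once by init()


class _ThresholdModel:
    """Hypothetical stand-in for a real trained model (illustration only)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        # Predict 1 when the sum of a row's features exceeds the threshold.
        return [1 if sum(row) > self.threshold else 0 for row in rows]


def init():
    """Called once when the deployment container starts."""
    global model
    # Azure ML sets AZUREML_MODEL_DIR to the folder holding the model files.
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    config_path = os.path.join(model_dir, "model.json")  # hypothetical file name
    threshold = 1.0  # fallback used when no model file is present
    if os.path.exists(config_path):
        with open(config_path) as f:
            threshold = float(json.load(f)["threshold"])
    model = _ThresholdModel(threshold)


def run(raw_data):
    """Called per request; raw_data is the request body as a JSON string."""
    try:
        rows = json.loads(raw_data)["data"]
        return {"predictions": model.predict(rows)}
    except (KeyError, TypeError, ValueError) as exc:
        # Return the problem to the caller instead of crashing the worker.
        return {"error": str(exc)}
```

A real score.py would load the registered model artifact (for example with joblib or a framework-specific loader) inside init() instead of building the stand-in class.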
CI/CD Flow (via GitHub Actions)
Trigger: Push to main or model-update branch
Checkout & Install Dependencies
Login to Azure (azure/login GitHub Action)
Set up Azure ML CLI or Python SDK
Run Training Script (optionally via pipeline YAML)
Register Model to Azure ML Registry
Deploy Model to Online Endpoint
Post-deployment Tests
Publish Logs to Application Insights
Trigger Alerts if any step fails (via Azure Monitor alert rules)
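The post-deployment test step in the flow above could be as small as one scoring request plus a few checks on the reply. The sketch below uses only the Python standard library. The function names, the {"data": ...} request body, and the {"predictions": [...]} response shape are assumptions for illustration; the Bearer header matches how key-authenticated Azure ML online endpoints expect the key to be sent.

```python
import json
import urllib.request


def build_request(scoring_url, api_key, rows):
    """Build a POST request carrying the rows as a JSON body."""
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        scoring_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def check_response(status, body_text, expected_count):
    """Return a list of problems; an empty list means the check passed."""
    if status != 200:
        return [f"unexpected HTTP status {status}"]
    try:
        payload = json.loads(body_text)
    except ValueError:
        return ["response body is not valid JSON"]
    # Assumed response shape: {"predictions": [...]}, one entry per input row.
    predictions = payload.get("predictions") if isinstance(payload, dict) else None
    if not isinstance(predictions, list):
        return ["missing 'predictions' list"]
    if len(predictions) != expected_count:
        return [f"expected {expected_count} predictions, got {len(predictions)}"]
    return []


def smoke_test(scoring_url, api_key, rows, timeout=30):
    """Send one scoring request and validate the reply."""
    request = build_request(scoring_url, api_key, rows)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body_text = response.read().decode("utf-8")
        return check_response(response.status, body_text, len(rows))
```

In a workflow step, a non-empty result from smoke_test (or an uncaught HTTPError) would be turned into a non-zero exit code so that the job fails and the alerting step can react.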
Logging, Monitoring & Alerts
Demo Enhancements
Dashboards: Include an Azure Dashboard that surfaces training job status, endpoint performance, recent alerts.
Web Frontend (Optional): Simple app to send inference requests, visualize logs.
Cost Control: Auto-shutdown training compute after use.
Example Scenario
Business Case: Retrain a churn prediction model every week using new customer data.
GitHub Actions scheduled trigger runs the workflow weekly
Pipeline logs the retraining results
Pipeline deploys the new model only if its metrics pass (e.g., accuracy exceeds the previous version's)
Pipeline sends alerts if model accuracy drops by more than 10% or the job fails
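The deploy and alert rules in this scenario can be written as two small decision functions that the workflow calls after training. This is a sketch under stated assumptions: the function names are hypothetical, accuracies are fractions between 0 and 1, a first run with no previous version is allowed to deploy, and the 10% drop is read as a relative drop against the previous version (the scenario does not say whether it is relative or absolute).

```python
def should_deploy(new_accuracy, previous_accuracy):
    """Deploy only when the new model beats the previous version."""
    if previous_accuracy is None:
        # Assumption: with no earlier version to compare against, deploy.
        return True
    return new_accuracy > previous_accuracy


def should_alert(new_accuracy, previous_accuracy, job_failed, max_relative_drop=0.10):
    """Alert on a failed job or an accuracy drop larger than the threshold."""
    if job_failed:
        return True
    if previous_accuracy is None or previous_accuracy <= 0:
        # Nothing meaningful to compare against.
        return False
    # Assumption: the drop is measured relative to the previous accuracy.
    relative_drop = (previous_accuracy - new_accuracy) / previous_accuracy
    return relative_drop > max_relative_drop
```

For example, going from 0.90 to 0.80 accuracy is a relative drop of about 11%, so should_alert returns True, while going from 0.90 to 0.85 (about 5.6%) does not alert but also does not deploy.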
Security Considerations
Use GitHub Secrets for Azure credentials
Leverage Workload Identity Federation for GitHub-Azure auth
RBAC for least-privilege access to ML and monitoring resources
Success Criteria
CI/CD pipeline runs end-to-end on commit
Azure infrastructure deployed from code
Model available at a public or private endpoint
Logs visible in App Insights
Alerts trigger on defined failure conditions