Deploy Model Context Protocol servers to Azure using Server-Sent Events (SSE) transport, configure Azure Container Apps hosting, set up authentication with Entra ID, implement health monitoring, and establish a CI/CD pipeline.
In this lab you will deploy MCP servers to Azure using SSE (Server-Sent Events) transport. You will containerize your server with Docker, host it on Azure Container Apps, secure it with Entra ID authentication and managed identities, wire up health monitoring and auto-scaling, and establish a CI/CD pipeline for continuous delivery.
A security team needs their MCP servers to be production-hosted and accessible by remote AI clients across the entire organization, not just running locally on individual analyst laptops via stdio.
By deploying to Azure with SSE transport, the team gets a centrally managed, highly available MCP infrastructure that any authorized user or agent can connect to securely from anywhere.
Moving from local stdio to cloud-hosted SSE is the critical step that makes MCP servers production-ready and enterprise-accessible. This lab bridges the gap between development prototypes and real-world deployments, giving your organization a scalable, secure, and observable MCP infrastructure that supports remote AI clients across teams and regions.
The MCP specification supports two transport types: stdio (standard input/output, for local process communication) and SSE (Server-Sent Events over HTTP, for remote hosting). SSE enables your MCP server to run in the cloud and serve multiple AI clients concurrently.
The SSE connection flow: the client opens GET /sse on your server, the server sends a connection_id event to the client over the SSE stream, and the client then submits tool requests via POST /messages?connectionId=xxx.

# stdio (local) - for development and single-user scenarios
#   - Client spawns server process as a child process
#   - Communication via stdin/stdout pipes (JSON-RPC messages)
#   - Single client per server instance (1:1 relationship)
#   - Zero network overhead - fastest possible transport
#   - Best for: local development, VS Code integration, testing
# SSE (remote / cloud) - for production and team access
#   - Server runs as an HTTP service (Starlette/FastAPI)
#   - Client connects via HTTP GET (SSE stream) + POST (requests)
#   - Multiple concurrent clients supported (many:1 relationship)
#   - Requires network, authentication, TLS termination
#   - Best for: production, team-wide access, multi-region deployment

Update your MCP server to support SSE transport using Starlette as the HTTP framework. The server should support both stdio and SSE based on an environment variable.
# Install HTTP server dependencies for SSE transport
# uvicorn - ASGI server to host the Starlette app
# starlette - lightweight web framework for routing SSE/POST endpoints
# sse-starlette - Server-Sent Events support for streaming responses
pip install uvicorn starlette sse-starlette

import os
import uvicorn
# MCP SDK: SseServerTransport handles the SSE protocol for remote clients
from mcp.server.sse import SseServerTransport
# Starlette: lightweight ASGI web framework for HTTP routing
from starlette.applications import Starlette
from starlette.routing import Route, Mount
from starlette.responses import JSONResponse
# Import your existing MCP server (same tools work with both transports)
from server import server
# Create SSE transport instance
# "/messages" is the endpoint path where clients POST tool requests
sse_transport = SseServerTransport("/messages")
async def handle_sse(request):
    """Handle SSE connection from MCP clients.

    This endpoint keeps a long-lived HTTP connection open.
    The client receives tool responses as Server-Sent Events.
    Flow: Client GET /sse -> receives connection_id -> sends POST /messages
    """
    async with sse_transport.connect_sse(
        request.scope, request.receive, request._send
    ) as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options()
        )
async def handle_messages(request):
    """Handle POST messages from MCP clients.

    Tool invocation requests arrive here as JSON-RPC payloads.
    The SSE transport routes them to the MCP server for processing.
    """
    await sse_transport.handle_post_message(
        request.scope, request.receive, request._send
    )
async def health_check(request):
    """Health check endpoint for load balancers and Container Apps probes.

    Returns server status, name, transport type, and version.
    Container Apps uses this to determine if the container is healthy.
    """
    return JSONResponse({
        "status": "healthy",
        "server": "sentinel-mcp-server",
        "transport": "sse",
        "version": "1.0.0"
    })
# Starlette app with three routes:
# /health - health check for load balancers (no auth required)
# /sse - SSE stream endpoint (clients connect here first)
# /messages - POST endpoint for tool invocation requests
app = Starlette(
    debug=os.getenv("DEBUG", "false").lower() == "true",
    routes=[
        Route("/health", endpoint=health_check),
        Route("/sse", endpoint=handle_sse),
        Route("/messages", endpoint=handle_messages, methods=["POST"]),
    ]
)
if __name__ == "__main__":
    # Start the HTTP server on all interfaces (0.0.0.0) for container access
    port = int(os.getenv("PORT", 8000))
    uvicorn.run(app, host="0.0.0.0", port=port)

import os
import asyncio
# Select transport based on environment variable
# MCP_TRANSPORT=sse for cloud deployment, stdio for local development
# This lets the same codebase run in both environments
TRANSPORT = os.getenv("MCP_TRANSPORT", "stdio") # stdio or sse
if TRANSPORT == "sse":
    # SSE mode: run as a persistent HTTP server
    # Accepts multiple concurrent MCP client connections
    import uvicorn
    from sse_server import app

    uvicorn.run(app, host="0.0.0.0", port=int(os.getenv("PORT", 8000)))
else:
    # stdio mode: run as a child process of the MCP client
    # Single client, communicates via stdin/stdout pipes
    from server import server
    from mcp.server.stdio import stdio_server

    async def main():
        async with stdio_server() as (read_stream, write_stream):
            await server.run(
                read_stream, write_stream,
                server.create_initialization_options()
            )

    asyncio.run(main())

Package your MCP server as a Docker container using multi-stage builds to minimise image size and attack surface.
# ─── Stage 1: Builder ──────────────────────────────────────
FROM python:3.12-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir --target=/build/deps -r requirements.txt
# ─── Stage 2: Runtime ──────────────────────────────────────
FROM python:3.12-slim
# Security: run as non-root user
RUN useradd --create-home mcpuser
USER mcpuser
WORKDIR /app
# Copy pre-installed dependencies from the builder stage
COPY --from=builder /build/deps /home/mcpuser/.local/lib/python3.12/site-packages
ENV PATH="/home/mcpuser/.local/bin:${PATH}"
ENV PYTHONPATH="/home/mcpuser/.local/lib/python3.12/site-packages"
# Copy source code (owned by mcpuser for security)
COPY --chown=mcpuser:mcpuser src/ ./src/
# Configure for SSE transport by default in cloud deployments
ENV MCP_TRANSPORT=sse
ENV PORT=8000
EXPOSE 8000
# Container health check - used by Docker and orchestrators
# Verifies the /health endpoint returns 200 every 30 seconds
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
# Start the MCP server (transport selected by MCP_TRANSPORT env var)
CMD ["python", "src/main.py"]

The container runs as a non-root user (mcpuser). This limits the blast radius if the container is compromised. A typical MCP server image should be under 200 MB with multi-stage builds.

Build and run the Docker container locally to verify SSE transport works before deploying to Azure.
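When testing with curl it helps to know what the SSE stream looks like on the wire. The sketch below parses raw SSE frames into events; `parse_sse_events` is a hypothetical helper, and the `endpoint` event name is illustrative of what an MCP SSE server emits on connect.

```python
def parse_sse_events(stream: str) -> list[dict]:
    """Parse raw SSE text into a list of {event, data} dicts.

    SSE frames are separated by a blank line; each frame carries an
    optional 'event:' field and one or more 'data:' lines.
    """
    events = []
    for frame in stream.split("\n\n"):
        event, data_lines = "message", []  # "message" is the SSE default event type
        for line in frame.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            events.append({"event": event, "data": "\n".join(data_lines)})
    return events

raw = "event: endpoint\ndata: /messages?connectionId=abc123\n\n"
print(parse_sse_events(raw))
# [{'event': 'endpoint', 'data': '/messages?connectionId=abc123'}]
```

If `curl -N` shows frames that this parser cannot split, the server is likely not terminating events with the required blank line.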
# Build the Docker image locally
docker build -t sentinel-mcp-server:latest .
# Run the container with .env file and SSE transport enabled
# -d = detached mode, -p = map host port 8000 to container port 8000
docker run -d \
--name mcp-test \
-p 8000:8000 \
--env-file .env \
-e MCP_TRANSPORT=sse \
sentinel-mcp-server:latest
# Verify health endpoint responds with server status
curl http://localhost:8000/health
# Expected: {"status":"healthy","server":"sentinel-mcp-server",...}
# Test SSE connection - should open a persistent stream
# The -N flag disables buffering so you see events in real time
curl -N http://localhost:8000/sse
# Expected: SSE stream opens and sends connection_id event
# View container logs to debug any startup issues
docker logs -f mcp-test
# Clean up the test container when done
docker stop mcp-test && docker rm mcp-test

# Point the MCP Inspector at your SSE endpoint (not stdio)
# This tests the full network path: HTTP -> SSE -> MCP protocol
npx @modelcontextprotocol/inspector --sse http://localhost:8000/sse
# In the Inspector UI:
# 1. Verify all tools appear in the Tools tab (same as stdio)
# 2. Test run_kql_query with a simple query
# 3. Test list_sentinel_tables to confirm Azure connectivity
# 4. Verify error handling with an invalid query

Create an Azure Container Apps environment and deploy your MCP server. Container Apps handles TLS termination, ingress routing, and auto-scaling automatically.
# Create a resource group to hold all MCP server infrastructure
az group create \
--name rg-mcp-servers \
--location eastus \
--tags Project=MCP Environment=Production
# Create Azure Container Registry (ACR) to store Docker images
# ACR provides private, geo-replicated container image storage
az acr create \
--name mcpserversacr \
--resource-group rg-mcp-servers \
--sku Basic \
--admin-enabled true
# Build the Docker image in ACR (cloud build - no local Docker needed)
# This uploads your source code and builds the image in Azure
az acr build \
--registry mcpserversacr \
--image sentinel-mcp-server:v1.0 \
--file Dockerfile .
# Create a Container Apps environment (shared infrastructure layer)
# The environment provides networking, logging, and scaling config
az containerapp env create \
--name mcp-environment \
--resource-group rg-mcp-servers \
--location eastus
# Deploy the MCP server as a Container App
# Container Apps handles TLS, ingress routing, and auto-scaling
# --ingress external = publicly accessible via HTTPS
# --min-replicas 1 = always-on (no cold start delays)
az containerapp create \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--environment mcp-environment \
--image mcpserversacr.azurecr.io/sentinel-mcp-server:v1.0 \
--registry-server mcpserversacr.azurecr.io \
--target-port 8000 \
--ingress external \
--min-replicas 1 \
--max-replicas 5 \
--cpu 0.5 \
--memory 1Gi \
--env-vars \
MCP_TRANSPORT=sse \
AZURE_TENANT_ID=secretref:azure-tenant-id \
AZURE_CLIENT_ID=secretref:azure-client-id
# Get the deployed FQDN (your SSE endpoint URL)
# Output: e.g., sentinel-mcp-server.azurecontainerapps.io
az containerapp show \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--query properties.configuration.ingress.fqdn \
  --output tsv

# Store sensitive values as Container App secrets (encrypted at rest)
# Use secretref: prefix in env-vars to reference these by name
# Never pass secrets as plain-text environment variables!
az containerapp secret set \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--secrets \
azure-tenant-id="your-tenant-id" \
azure-client-id="your-client-id" \
azure-client-secret="your-client-secret" \
  sentinel-workspace-id="your-workspace-id"

Always store secrets using secretref: references or Azure Key Vault integration. This ensures secrets are encrypted at rest and not visible in configuration logs.

Protect your MCP server with Entra ID authentication so only authorised AI clients can invoke your tools.
import jwt
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
# Entra ID configuration for token validation
TENANT_ID = os.environ["AZURE_TENANT_ID"]
AUDIENCE = os.environ.get("MCP_SERVER_APP_ID", "api://sentinel-mcp-server")
# JWKS endpoint provides the public keys for verifying token signatures
JWKS_URL = f"https://login.microsoftonline.com/{TENANT_ID}/discovery/v2.0/keys"
class EntraAuthMiddleware(BaseHTTPMiddleware):
    """Validate Entra ID bearer tokens on all requests.

    Ensures only authorized AI clients can invoke MCP tools.
    Exempt paths (health check) allow load balancers to probe without auth.
    """

    # Paths that don't require authentication (health probes, root)
    EXEMPT_PATHS = {"/health", "/"}

    async def dispatch(self, request, call_next):
        if request.url.path in self.EXEMPT_PATHS:
            return await call_next(request)

        auth_header = request.headers.get("Authorization", "")
        if not auth_header.startswith("Bearer "):
            return JSONResponse(
                {"error": "Missing Bearer token"},
                status_code=401
            )

        token = auth_header.split(" ", 1)[1]
        try:
            # Validate the token (simplified; use msal in production)
            payload = jwt.decode(
                token,
                options={"verify_signature": False},  # Use JWKS in production
                audience=AUDIENCE
            )
            request.state.user = payload
            return await call_next(request)
        except jwt.InvalidTokenError as e:
            return JSONResponse(
                {"error": f"Invalid token: {e}"},
                status_code=401
            )

# Add middleware to the Starlette app
app = Starlette(
    routes=[...],
    middleware=[Middleware(EntraAuthMiddleware)]
)

In production, verify token signatures against the tenant's JWKS keys using the msal or PyJWT library with JWKS key rotation. The simplified example above is for demonstration only.

Use system-assigned managed identity to eliminate client secret management entirely. The Container App authenticates to Sentinel and Graph API using its Azure identity.
# Enable system-assigned managed identity on the Container App
# This gives the container its own Azure identity - no secrets needed!
az containerapp identity assign \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--system-assigned
# Get the managed identity's principal ID for role assignments
PRINCIPAL_ID=$(az containerapp identity show \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--query principalId --output tsv)
# Grant the managed identity read access to the Sentinel workspace
# Same role as the app registration, but no secret to manage or rotate
az role assignment create \
--assignee "$PRINCIPAL_ID" \
--role "Log Analytics Reader" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
# For Graph API permissions (e.g., SecurityIncident.ReadWrite.All),
# grant app roles via the Entra portal - managed identity uses the same flow

# In your auth.py, update to use managed identity in Azure:
# DefaultAzureCredential automatically detects the environment:
# - In Azure Container Apps: uses the system-assigned managed identity
# - Locally: falls back to az login, VS Code credentials, etc.
import os

from azure.identity import ClientSecretCredential, DefaultAzureCredential
def get_credential():
    """Use managed identity in Azure, DefaultAzureCredential locally.

    Managed identity eliminates secret management entirely - no expiring
    client secrets, no rotation schedules, no accidental leaks.
    """
    if os.getenv("AZURE_CLIENT_SECRET"):
        # Local development with explicit client secret
        return ClientSecretCredential(
            tenant_id=os.environ['AZURE_TENANT_ID'],
            client_id=os.environ['AZURE_CLIENT_ID'],
            client_secret=os.environ['AZURE_CLIENT_SECRET']
        )
    # In Azure: automatically uses the container's managed identity
    # No secrets needed - identity is provided by the Azure platform
    return DefaultAzureCredential()

Add comprehensive health checks and configure Azure Monitor to collect metrics and diagnostics.
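The health endpoint's status decision is easy to get wrong, so it is worth isolating as a pure function before wiring it into Starlette. A sketch mirroring the payload shape used in this lab (`build_health_payload` is a hypothetical helper name):

```python
def build_health_payload(sentinel_ok: bool, uptime_seconds: float,
                         version: str = "1.0.0") -> tuple[dict, int]:
    """Build the /health response body and HTTP status code.

    200 when all dependencies are reachable; 503 when degraded, so
    load balancers and Container Apps probes stop routing traffic.
    """
    status = "healthy" if sentinel_ok else "degraded"
    body = {
        "status": status,
        "uptime_seconds": round(uptime_seconds),
        "server": "sentinel-mcp-server",
        "version": version,
        "dependencies": {
            "sentinel": "connected" if sentinel_ok else "disconnected"
        },
    }
    return body, 200 if sentinel_ok else 503

print(build_health_payload(True, 12.6))   # (..., 200)
print(build_health_payload(False, 0.0))   # (..., 503)
```

The endpoint handler then only gathers inputs (uptime, a Sentinel probe result) and serialises the tuple into a JSONResponse.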
from datetime import datetime, timedelta, timezone
_start_time = datetime.now(timezone.utc)
async def health_check(request):
    """Detailed health check with dependency status.

    Returns: server uptime, version, and connectivity to Sentinel.
    Status 200 = healthy, Status 503 = degraded (dependency down).
    Container Apps uses this for liveness/readiness probes.
    """
    uptime = (datetime.now(timezone.utc) - _start_time).total_seconds()

    # Probe Sentinel connectivity with a lightweight test query
    sentinel_ok = True
    try:
        logs_client.query_workspace(
            workspace_id=WORKSPACE_ID,
            query="SecurityIncident | take 1",
            timespan=timedelta(minutes=5)
        )
    except Exception:
        sentinel_ok = False

    status = "healthy" if sentinel_ok else "degraded"
    return JSONResponse({
        "status": status,
        "uptime_seconds": round(uptime),
        "server": "sentinel-mcp-server",
        "version": "1.0.0",
        "dependencies": {
            "sentinel": "connected" if sentinel_ok else "disconnected"
        }
    }, status_code=200 if sentinel_ok else 503)

# Configure health probes in Container Apps
az containerapp update \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--set-env-vars "HEALTH_CHECK_ENABLED=true" \
--yaml health-probes.yaml
# health-probes.yaml
# probes:
#   - type: liveness
#     httpGet:
#       path: /health
#       port: 8000
#     periodSeconds: 30
#     failureThreshold: 3
#   - type: readiness
#     httpGet:
#       path: /health
#       port: 8000
#     periodSeconds: 10
#     initialDelaySeconds: 5

SSE connections can drop due to network issues, container restarts, or scaling events. Implement reconnection logic and graceful error handling.
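On the client side, reconnection should back off exponentially so a flapping server is not hammered, with a cap and optional jitter so many clients don't all reconnect at the same instant. A sketch (`backoff_delays` is a hypothetical helper; the base/cap values are illustrative defaults):

```python
import random

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0,
                   jitter: float = 0.0) -> list[float]:
    """Exponential backoff schedule for SSE reconnection attempts.

    The delay doubles each attempt (base, 2*base, 4*base, ...) and is
    capped so a long outage doesn't push retries unreasonably far apart.
    Jitter spreads reconnects out to avoid a thundering herd.
    """
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            delay += random.uniform(0, jitter)  # randomise to desynchronise clients
        delays.append(delay)
    return delays

print(backoff_delays(6))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

A reconnecting client would iterate this schedule between connection attempts and reset it after a successful SSE handshake.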
import asyncio
from contextlib import asynccontextmanager
class ConnectionManager:
    """Track and manage active SSE connections."""

    def __init__(self):
        self.active_connections: dict[str, dict] = {}
        self.max_connections = int(os.getenv("MAX_CONNECTIONS", 100))

    @asynccontextmanager
    async def connect(self, connection_id: str):
        if len(self.active_connections) >= self.max_connections:
            raise ConnectionError(
                f"Maximum connections ({self.max_connections}) reached"
            )
        self.active_connections[connection_id] = {
            "connected_at": datetime.now(timezone.utc).isoformat(),
            "last_activity": datetime.now(timezone.utc).isoformat()
        }
        try:
            yield connection_id
        finally:
            self.active_connections.pop(connection_id, None)

    @property
    def connection_count(self):
        return len(self.active_connections)

connection_manager = ConnectionManager()

To test resilience, force a container restart with az containerapp revision restart, or simulate degraded networks locally with tc qdisc or Docker network settings.

Configure horizontal auto-scaling to handle varying workloads. Scale based on HTTP concurrent requests and CPU utilisation.
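The behaviour of the HTTP scale rule configured below can be estimated with simple arithmetic: replicas are added so each handles at most the configured concurrency target, clamped between min and max replicas. A sketch (`expected_replicas` is a hypothetical helper; the defaults mirror this lab's 50-concurrency, 1-to-10 replica configuration):

```python
import math

def expected_replicas(concurrent_requests: int, concurrency_target: int = 50,
                      min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Estimate how many replicas an HTTP concurrency scale rule will run.

    The scaler targets at most `concurrency_target` concurrent requests
    per replica, clamped to the [min_replicas, max_replicas] range.
    """
    needed = math.ceil(concurrent_requests / concurrency_target)
    return max(min_replicas, min(max_replicas, needed))

print(expected_replicas(0))    # 1  (min-replicas floor keeps one warm instance)
print(expected_replicas(120))  # 3  (ceil(120 / 50))
print(expected_replicas(900))  # 10 (capped at max-replicas)
```

Running this against your expected peak load is a quick sanity check that `--max-replicas` is high enough before clients hit throttling.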
# Configure auto-scaling: scale out based on concurrent HTTP connections
# Container Apps automatically adds/removes replicas to handle load
az containerapp update \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
--min-replicas 1 \
--max-replicas 10 \
--scale-rule-name http-scaling \
--scale-rule-type http \
--scale-rule-http-concurrency 50
# View active replicas and revision history
# Useful for monitoring scaling behavior and deployment status
az containerapp revision list \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
  --output table

Keep min-replicas=1 for always-on availability. Setting it to 0 saves costs but causes cold-start latency (5–15 seconds) when the first request arrives. For security tooling, always-on is recommended.

Automate testing, building, and deployment using GitHub Actions. The pipeline runs tests on every PR and deploys to Azure on merge to main.
name: Deploy MCP Server

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: mcpserversacr.azurecr.io
  IMAGE_NAME: sentinel-mcp-server
  RESOURCE_GROUP: rg-mcp-servers
  CONTAINER_APP: sentinel-mcp-server

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install -r requirements.txt
      - run: pip install pytest pytest-asyncio
      - run: pytest tests/ -v

  build-and-deploy:
    needs: test
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Build and push to ACR
        run: |
          az acr build \
            --registry mcpserversacr \
            --image ${{ env.IMAGE_NAME }}:${{ github.sha }} \
            --image ${{ env.IMAGE_NAME }}:latest .
      - name: Deploy to Container Apps
        run: |
          az containerapp update \
            --name ${{ env.CONTAINER_APP }} \
            --resource-group ${{ env.RESOURCE_GROUP }} \
            --image ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}

Implement structured logging with correlation IDs and send logs to Azure Log Analytics for centralised analysis.
# Install OpenTelemetry for distributed tracing and metrics
# Traces provide visibility into tool execution time and failures
pip install opentelemetry-api opentelemetry-sdk \
  opentelemetry-exporter-otlp azure-monitor-opentelemetry-exporter

import os

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
# Configure OpenTelemetry tracing pipeline
# Traces flow: MCP server -> OTLP exporter -> Azure Monitor / App Insights
provider = TracerProvider()
# Export traces to Application Insights for centralized monitoring
exporter = AzureMonitorTraceExporter(
connection_string=os.environ.get("APPLICATIONINSIGHTS_CONNECTION_STRING")
)
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
# Create a tracer for this MCP server
tracer = trace.get_tracer("sentinel-mcp-server")
# Use spans in tool handlers to track execution time and metadata
async def handle_kql_query(arguments: dict):
    # Create a trace span for each tool invocation
    with tracer.start_as_current_span("tool.run_kql_query") as span:
        # Add MCP-specific attributes for filtering in App Insights
        span.set_attribute("mcp.tool", "run_kql_query")
        span.set_attribute("kql.query_length", len(arguments.get("query", "")))

        result = await execute_query(arguments)

        # Record result metrics on the span
        span.set_attribute("kql.row_count", result.get("row_count", 0))
        return result

Apply security best practices to your production deployment.
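One of the practices this section recommends is pinning base images to exact versions. That rule can be enforced in CI with a small check; this is an illustrative sketch (`is_pinned_image` is a hypothetical helper, and it deliberately ignores edge cases like registry hostnames with ports):

```python
import re

def is_pinned_image(image_ref: str) -> bool:
    """True if a Docker image reference pins a reproducible version.

    Pinned: a digest (@sha256:...) or a full x.y.z version tag.
    Not pinned: :latest, a missing tag, or a floating major.minor tag.
    """
    if "@sha256:" in image_ref:
        return True
    name, sep, tag = image_ref.rpartition(":")
    if not sep or not name or tag == "latest":
        return False
    # Require a full three-part version at the start of the tag
    return re.match(r"^\d+\.\d+\.\d+", tag) is not None

print(is_pinned_image("python:3.12.3-slim"))  # True
print(is_pinned_image("python:3.12-slim"))    # False (floating minor tag)
print(is_pinned_image("python:latest"))       # False
```

Running a check like this over FROM lines in each Dockerfile during the test job fails the build before an unpinned image reaches ACR.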
# Scan the container image for CVEs using Trivy (open-source scanner)
# Alternatively, enable Defender for Containers for continuous scanning
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy:latest image sentinel-mcp-server:latest
# Fix vulnerabilities by pinning to a specific base image version
# Avoid floating tags like :latest - use exact versions for reproducibility
# In Dockerfile, use specific version tags:
# FROM python:3.12.3-slim (not python:3.12-slim)

Create a deployment runbook and verify your production configuration.
# Delete all lab resources when done (removes entire resource group)
# WARNING: This is irreversible - all Container Apps, ACR images, etc.
az group delete --name rg-mcp-servers --yes --no-wait
# Or just scale to zero (preserves config, stops billing for compute)
# You can scale back up later without redeploying
az containerapp update \
--name sentinel-mcp-server \
--resource-group rg-mcp-servers \
  --min-replicas 0 --max-replicas 0

| Resource | Description |
|---|---|
| MCP Transports | SSE and stdio transport architecture and implementation |
| Azure Container Apps overview | Serverless container hosting for MCP servers |
| Azure App Service overview | Web app hosting platform for MCP endpoints |
| Azure Container Registry | Store and manage Docker container images |
| Ingress in Azure Container Apps | Configure HTTP ingress for SSE endpoints |
| OAuth 2.0 authorization code flow | Secure your MCP server endpoints |