
Auto scaling App Service in Azure - Commands & Configuration

Introduction
When a web app receives more visitors than its instances can handle, it can slow down or crash. Auto scaling adds or removes instances automatically as traffic changes, with no manual intervention.
Use auto scaling in these situations:
When your website traffic varies widely during the day and you want the app to stay fast.
When you want to save money by not running more servers than needed.
When you expect sudden spikes in users, such as during a sale or event.
When you want your app to stay available even if one server fails.
When you want to avoid manually changing the number of servers as demand changes.
Config File - autoscale-settings.json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/autoscale.json#",
  "name": "autoscaleAppService",
  "location": "eastus",
  "properties": {
    "profiles": [
      {
        "name": "AutoScaleProfile",
        "capacity": {
          "minimum": "1",
          "maximum": "3",
          "default": "1"
        },
        "rules": [
          {
            "metricTrigger": {
              "metricName": "Percentage CPU",
              "metricNamespace": "",
              "metricResourceUri": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/Microsoft.Web/sites/example-app",
              "timeGrain": "PT1M",
              "statistic": "Average",
              "timeWindow": "PT5M",
              "timeAggregation": "Average",
              "operator": "GreaterThan",
              "threshold": 70
            },
            "scaleAction": {
              "direction": "Increase",
              "type": "ChangeCount",
              "value": "1",
              "cooldown": "PT5M"
            }
          },
          {
            "metricTrigger": {
              "metricName": "Percentage CPU",
              "metricNamespace": "",
              "metricResourceUri": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/Microsoft.Web/sites/example-app",
              "timeGrain": "PT1M",
              "statistic": "Average",
              "timeWindow": "PT5M",
              "timeAggregation": "Average",
              "operator": "LessThan",
              "threshold": 30
            },
            "scaleAction": {
              "direction": "Decrease",
              "type": "ChangeCount",
              "value": "1",
              "cooldown": "PT5M"
            }
          }
        ]
      }
    ],
    "enabled": true,
    "targetResourceUri": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/Microsoft.Web/sites/example-app"
  }
}

This JSON file sets up auto scaling for an Azure App Service named example-app in the example-rg resource group.

The profiles section defines two rules. If average CPU usage over a 5-minute window rises above 70%, one instance is added; if it falls below 30%, one instance is removed. Each action is followed by a 5-minute cooldown before the next one can run. The app always runs at least 1 and at most 3 instances.

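The targetResourceUri points to the resource to scale. In Azure, App Service instances are scaled at the App Service plan level, so in practice targetResourceUri and metricResourceUri usually reference a Microsoft.Web/serverfarms resource, whose CPU metric is named CpuPercentage. The same applies to --resource and --resource-type in the commands below.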
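The decision described by the two rules can be sketched as a small shell function. This is only an illustration of the logic, not how Azure evaluates rules internally: the function name and argument order are hypothetical, CPU is treated as a whole number, and the thresholds (70 and 30) and the step of 1 are taken from the config above.

```shell
#!/bin/sh
# Sketch of the scale-out / scale-in decision from the profile above.
# next_instance_count is a hypothetical helper, not part of the Azure CLI.

next_instance_count() {
  avg_cpu=$1   # average CPU percentage over the 5-minute window (integer)
  current=$2   # instances currently running
  min=$3       # capacity.minimum
  max=$4       # capacity.maximum

  if [ "$avg_cpu" -gt 70 ] && [ "$current" -lt "$max" ]; then
    echo $((current + 1))   # scale out by one, capped at the maximum
  elif [ "$avg_cpu" -lt 30 ] && [ "$current" -gt "$min" ]; then
    echo $((current - 1))   # scale in by one, floored at the minimum
  else
    echo "$current"         # between thresholds, or already at a limit
  fi
}

next_instance_count 85 1 1 3   # prints 2
next_instance_count 85 3 1 3   # prints 3 (already at the maximum)
next_instance_count 20 2 1 3   # prints 1
```

The sketch ignores the cooldown period, which in the real service delays the next action after each change.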

Commands
This command creates an auto scale setting for the app service with minimum 1, maximum 3 instances, and default 1 instance.
Terminal
az monitor autoscale create --resource-group example-rg --resource example-app --resource-type Microsoft.Web/sites --name autoscaleAppService --min-count 1 --max-count 3 --count 1
Expected Output
{
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/microsoft.insights/autoscalesettings/autoscaleAppService",
  "name": "autoscaleAppService",
  "type": "Microsoft.Insights/autoscalesettings",
  "location": "global",
  "tags": {},
  "properties": {
    "enabled": true,
    "targetResourceUri": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/Microsoft.Web/sites/example-app",
    "profiles": [],
    "notifications": []
  }
}
--min-count - Sets the minimum number of instances
--max-count - Sets the maximum number of instances
--count - Sets the default number of instances
This command adds a rule to increase instances by 1 when average CPU usage is above 70% for 5 minutes.
Terminal
az monitor autoscale rule create --resource-group example-rg --autoscale-name autoscaleAppService --condition "Percentage CPU > 70 avg 5m" --scale out 1 --cooldown 5
Expected Output
{
  "name": "autoscaleAppService",
  "resourceGroup": "example-rg",
  "rules": [
    {
      "metricTrigger": {
        "metricName": "Percentage CPU",
        "timeAggregation": "Average",
        "operator": "GreaterThan",
        "threshold": 70,
        "timeGrain": "PT1M",
        "timeWindow": "PT5M"
      },
      "scaleAction": {
        "direction": "Increase",
        "type": "ChangeCount",
        "value": "1",
        "cooldown": "PT5M"
      }
    }
  ]
}
--condition - Defines the metric and threshold for scaling
--scale - Defines scaling direction and amount
--cooldown - Number of minutes to wait before the next scaling action
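The --condition string follows the pattern metric, operator, threshold, aggregation, window. As a rough sketch of how its parts relate to the JSON fields in the output, the two helpers below assemble a condition string and translate the comparison symbol into the operator name that appears in the rule. Both function names are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of the Azure CLI.

# Assemble a --condition string from its five parts.
build_condition() {
  metric=$1; operator=$2; threshold=$3; aggregation=$4; window=$5
  printf '%s %s %s %s %s\n' "$metric" "$operator" "$threshold" "$aggregation" "$window"
}

# Translate a comparison symbol into the "operator" value used in the rule JSON.
operator_name() {
  case "$1" in
    ">")  echo "GreaterThan" ;;
    ">=") echo "GreaterThanOrEqual" ;;
    "<")  echo "LessThan" ;;
    "<=") echo "LessThanOrEqual" ;;
    "==") echo "Equals" ;;
    "!=") echo "NotEquals" ;;
    *)    echo "unknown operator: $1" >&2; return 1 ;;
  esac
}

build_condition "Percentage CPU" ">" 70 avg 5m   # prints: Percentage CPU > 70 avg 5m
operator_name ">"                                # prints: GreaterThan
```

Building the string in one place makes it easier to keep the scale-out and scale-in rules consistent when thresholds change.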
This command adds a rule to decrease instances by 1 when average CPU usage is below 30% for 5 minutes.
Terminal
az monitor autoscale rule create --resource-group example-rg --autoscale-name autoscaleAppService --condition "Percentage CPU < 30 avg 5m" --scale in 1 --cooldown 5
Expected Output
{
  "name": "autoscaleAppService",
  "resourceGroup": "example-rg",
  "rules": [
    {
      "metricTrigger": {
        "metricName": "Percentage CPU",
        "timeAggregation": "Average",
        "operator": "LessThan",
        "threshold": 30,
        "timeGrain": "PT1M",
        "timeWindow": "PT5M"
      },
      "scaleAction": {
        "direction": "Decrease",
        "type": "ChangeCount",
        "value": "1",
        "cooldown": "PT5M"
      }
    }
  ]
}
--condition - Defines the metric and threshold for scaling
--scale - Defines scaling direction and amount
--cooldown - Number of minutes to wait before the next scaling action
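The command line expresses the window as 5m, while the JSON output shows the same duration in ISO 8601 form as PT5M. A minimal sketch of that conversion, assuming whole minutes only, could look like this; minutes_to_iso8601 is a hypothetical name used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert "5m" or "5" into the ISO 8601 duration "PT5M".
# Only whole minutes are handled in this sketch.

minutes_to_iso8601() {
  value=${1%m}   # drop a trailing "m" if present
  case "$value" in
    ''|*[!0-9]*)
      echo "not a whole number of minutes: $1" >&2
      return 1
      ;;
  esac
  echo "PT${value}M"
}

minutes_to_iso8601 5m   # prints PT5M
minutes_to_iso8601 10   # prints PT10M
```

This is handy when comparing values typed on the command line with the timeWindow and cooldown fields in the JSON.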
This command shows the current auto scale settings to verify the configuration.
Terminal
az monitor autoscale show --resource-group example-rg --name autoscaleAppService
Expected Output
{
  "name": "autoscaleAppService",
  "profiles": [
    {
      "name": "AutoScaleProfile",
      "capacity": {
        "minimum": "1",
        "maximum": "3",
        "default": "1"
      },
      "rules": [
        {
          "metricTrigger": {
            "metricName": "Percentage CPU",
            "timeAggregation": "Average",
            "operator": "GreaterThan",
            "threshold": 70,
            "timeGrain": "PT1M",
            "timeWindow": "PT5M"
          },
          "scaleAction": {
            "direction": "Increase",
            "type": "ChangeCount",
            "value": "1",
            "cooldown": "PT5M"
          }
        },
        {
          "metricTrigger": {
            "metricName": "Percentage CPU",
            "timeAggregation": "Average",
            "operator": "LessThan",
            "threshold": 30,
            "timeGrain": "PT1M",
            "timeWindow": "PT5M"
          },
          "scaleAction": {
            "direction": "Decrease",
            "type": "ChangeCount",
            "value": "1",
            "cooldown": "PT5M"
          }
        }
      ]
    }
  ],
  "enabled": true,
  "targetResourceUri": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg/providers/Microsoft.Web/sites/example-app"
}
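One quick way to verify the output is to confirm that both a scale-out and a scale-in rule are present. The sketch below does this with a plain text match on JSON read from standard input. It is a crude check under the assumption that the output contains "direction" fields as shown above; has_both_rules is a hypothetical name, and a JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the JSON text on standard input
# contains both an "Increase" and a "Decrease" scale action.

has_both_rules() {
  output=$(cat)   # read the whole JSON text from standard input
  printf '%s\n' "$output" | grep -q '"direction": *"Increase"' &&
  printf '%s\n' "$output" | grep -q '"direction": *"Decrease"'
}

# Small sample standing in for real command output.
sample='{ "rules": [ { "scaleAction": { "direction": "Increase" } }, { "scaleAction": { "direction": "Decrease" } } ] }'

if printf '%s\n' "$sample" | has_both_rules; then
  echo "both rules present"
else
  echo "a rule is missing"
fi
```

In real use, the output of the show command would be piped into has_both_rules instead of the sample string.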
Key Concept

If you remember nothing else from this pattern, remember: auto scaling adjusts your app's resources automatically based on real-time usage to keep it fast and cost-effective.

Common Mistakes
Setting minimum and maximum instance counts equal or too close together.
This leaves auto scaling little or no room to add or remove instances, making it useless.
Set the minimum lower than the maximum, with enough headroom to scale out and in.
Using very low or very high CPU thresholds for scaling rules.
A low scale-out threshold triggers scaling actions too often; a high one delays scaling and slows the app's responses.
Choose balanced thresholds with a clear gap between them, such as 30% for scale in and 70% for scale out.
Not setting cooldown periods between scaling actions.
Without cooldown, the system may scale up and down rapidly, causing instability and extra cost.
Always set cooldowns (e.g., 5 minutes) to give time for changes to take effect.
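The three mistakes above can be caught before any command is run. The following is a minimal pre-flight check, assuming whole-number inputs; the function name, argument order, and messages are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the mistakes listed above.
# Arguments: minimum, maximum, scale-in threshold, scale-out threshold, cooldown minutes.

check_autoscale_settings() {
  min=$1; max=$2; scale_in=$3; scale_out=$4; cooldown_min=$5
  status=0

  if [ "$min" -ge "$max" ]; then
    echo "minimum ($min) must be lower than maximum ($max)"
    status=1
  fi
  if [ "$scale_in" -ge "$scale_out" ]; then
    echo "scale-in threshold ($scale_in) must be below scale-out threshold ($scale_out)"
    status=1
  fi
  if [ "$cooldown_min" -le 0 ]; then
    echo "cooldown must be at least 1 minute"
    status=1
  fi
  return $status
}

check_autoscale_settings 1 3 30 70 5 && echo "settings look reasonable"
check_autoscale_settings 2 2 30 70 5 || echo "fix the settings before applying them"
```

The first call passes with the values used in this guide; the second reports the equal minimum and maximum and then prints the reminder.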
Summary
Create an auto scale setting with minimum, maximum, and default instance counts.
Add rules to increase or decrease instances based on CPU usage thresholds.
Verify the auto scale configuration to ensure it matches your needs.