Autoscale Deployments run on cloud computing resources that scale up and down to efficiently handle the network traffic and workload of your Replit App. When your app is busy, autoscaling adds servers to manage the load. When your app is idle, autoscaling reduces the number of servers, to as low as zero, to save you money.

Autoscale Deployments are ideal for the following use cases:

  • Web applications that handle variable workloads and traffic, such as ecommerce sites
  • APIs and services

Features

Autoscale Deployments include the following features:

  • Automatic resource scaling: Automatically adjusts resources based on traffic patterns to optimize costs.
  • Custom domains: Configure a custom domain or use a <app-name>.replit.app URL to access your app.
  • Configurable limits: Set the maximum number of instances your deployment can scale to.
  • Flexible machine power: Choose the CPU and RAM configuration that meets your app’s needs.
  • Monitoring: View logs and monitor your deployment’s status.

Usage

You can access Autoscale Deployments in the Deployments workspace tool.

Autoscale configuration screen in the Deployments tool

Machine power

Select Edit to view and set the machine power options. Use the sliders to select the CPU and RAM configuration for each deployment server instance.

View the compute unit cost for the configuration in the Total per machine row. A compute unit is a measurement of cloud computing resources based on the memory and CPU configuration of the machine.

To learn more about calculating cost based on compute units, see Compute Units.

Max number of machines

Use the slider to adjust the maximum number of machines. This number is the upper limit on how many server instances the autoscaling feature can assign when it determines your app is busy.

The bottom row shows the equivalent compute units, calculated by the following formula:

Number of machines * compute units per machine
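
As a rough illustration of the formula above, the following Python sketch computes the compute units for a given maximum number of machines. The function name and the example values are hypothetical and do not reflect actual Replit pricing or machine configurations. Read the real figures from the Total per machine row in the Deployments tool.

```python
def max_compute_units(machines: int, units_per_machine: float) -> float:
    """Compute units used when every allowed machine is running.

    Implements: number of machines * compute units per machine.
    """
    if machines < 0 or units_per_machine < 0:
        raise ValueError("machines and units_per_machine must be non-negative")
    return machines * units_per_machine


if __name__ == "__main__":
    # Hypothetical example values, not actual Replit figures.
    machines = 3
    units_per_machine = 18
    print(max_compute_units(machines, units_per_machine))
```

Because the result is an upper limit, a deployment that scales down during idle periods would be expected to use fewer compute units than this figure.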

Next steps