
Retrying failed runs

Tower can automatically retry runs that fail due to infrastructure errors (errored) or application crashes (crashed). Retry policies let you configure how many times to retry, how long to wait between attempts, and whether to use exponential backoff.

Configuring a retry policy on an app

A retry policy set on an app applies to every future run of that app. Configure it via the Tower API when updating an app:

curl -X PATCH https://api.tower.dev/v1/apps/my-app \
  -H "Authorization: Bearer <MY_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "retry_policy": {
      "max_retries": 3,
      "retry_delay_seconds": 60,
      "use_exponential_backoff": false
    }
  }'

You can also configure the retry policy from the app settings panel in the Tower UI.
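
For scripted setups, the same request can be prepared from Python. The sketch below uses only the standard library and mirrors the curl example above; the helper name `build_retry_policy_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.tower.dev/v1"  # base URL taken from the curl example


def build_retry_policy_request(app_name, token, policy):
    """Build (but do not send) a PATCH request that sets an app's retry policy.

    Hypothetical helper for illustration; it is not part of any Tower SDK.
    """
    body = json.dumps({"retry_policy": policy}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/apps/{app_name}",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


request = build_retry_policy_request(
    "my-app",
    "<MY_API_TOKEN>",
    {"max_retries": 3, "retry_delay_seconds": 60, "use_exponential_backoff": False},
)
# To send it for real, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it makes the payload easy to inspect or log before it reaches the API.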

Retry policy fields

| Field | Type | Description | Range |
| --- | --- | --- | --- |
| max_retries | integer | Maximum number of retry attempts after the initial run. 0 disables retries. | 0–10 |
| retry_delay_seconds | integer | Seconds to wait before dispatching the next attempt. | 0–3600 |
| use_exponential_backoff | boolean | When true, doubles the delay with each retry attempt, capped at 1 hour. | |

Setting max_retries to 0 (the default) disables automatic retries entirely.
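
A client-side check against the documented ranges can catch mistakes before a request is sent. The following is a minimal sketch: `validate_retry_policy` is a hypothetical helper, and treating a missing field as a problem is an assumption, since this page does not say which fields are required.

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_retry_policy(policy):
    """Return a list of problems found in a retry policy dict (empty if none).

    Ranges come from the field table; requiring every field is an assumption.
    """
    problems = []

    max_retries = policy.get("max_retries")
    if not _is_int(max_retries) or not 0 <= max_retries <= 10:
        problems.append("max_retries must be an integer from 0 to 10")

    delay = policy.get("retry_delay_seconds")
    if not _is_int(delay) or not 0 <= delay <= 3600:
        problems.append("retry_delay_seconds must be an integer from 0 to 3600")

    if not isinstance(policy.get("use_exponential_backoff"), bool):
        problems.append("use_exponential_backoff must be true or false")

    return problems


ok = validate_retry_policy(
    {"max_retries": 3, "retry_delay_seconds": 60, "use_exponential_backoff": False}
)
bad = validate_retry_policy(
    {"max_retries": 11, "retry_delay_seconds": -1, "use_exponential_backoff": "yes"}
)
```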

Exponential backoff

When use_exponential_backoff is true, the wait before the Nth retry, in seconds, is:

min(retry_delay_seconds × 2^(N-1), 3600)

For example, with retry_delay_seconds: 30 and three retries:

| Retry | Wait |
| --- | --- |
| 1st | 30 seconds |
| 2nd | 60 seconds |
| 3rd | 120 seconds |

The maximum wait between retries is 1 hour, regardless of the backoff factor.
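
The backoff schedule can be reproduced locally to preview how long a run might wait. This is a sketch of the formula as described on this page, not Tower's actual implementation; the function name `retry_wait_seconds` is made up for illustration.

```python
MAX_WAIT_SECONDS = 3600  # the 1-hour cap described above


def retry_wait_seconds(retry_delay_seconds, retry_number, use_exponential_backoff):
    """Seconds to wait before the given retry (1 = first retry)."""
    if retry_number < 1:
        raise ValueError("retry_number starts at 1")
    if not use_exponential_backoff:
        return retry_delay_seconds
    # Double the base delay for each retry after the first, then apply the cap.
    return min(retry_delay_seconds * 2 ** (retry_number - 1), MAX_WAIT_SECONDS)


waits = [retry_wait_seconds(30, n, True) for n in (1, 2, 3)]
print(waits)  # [30, 60, 120]
```

With a larger base delay the cap is reached quickly: a 600-second delay would double to 4800 seconds on the fourth retry, so that wait is held at 3600 seconds instead.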

Overriding the retry policy for a single run

You can override the app's retry policy for a specific run by passing retry_policy in the run request body:

curl -X POST https://api.tower.dev/v1/apps/my-app/runs \
  -H "Authorization: Bearer <MY_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "retry_policy": {
      "max_retries": 5,
      "retry_delay_seconds": 30,
      "use_exponential_backoff": true
    }
  }'

When a retry_policy is provided in the run request it takes effect only for that run; the app's default policy is not modified.
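
The precedence between the two policies can be pictured as a small lookup. The sketch below assumes the run-level policy replaces the app policy as a whole rather than being merged field by field, which this page does not spell out; `effective_retry_policy` is a hypothetical name.

```python
def effective_retry_policy(app_policy, run_policy):
    """Return a copy of the policy that governs a single run.

    Assumption: a run-level policy replaces the app policy wholesale.
    """
    chosen = run_policy if run_policy is not None else app_policy
    # Copy so that callers cannot accidentally mutate the app's default policy.
    return dict(chosen) if chosen is not None else None


app_policy = {"max_retries": 3, "retry_delay_seconds": 60, "use_exponential_backoff": False}
run_policy = {"max_retries": 5, "retry_delay_seconds": 30, "use_exponential_backoff": True}

overridden = effective_retry_policy(app_policy, run_policy)
default = effective_retry_policy(app_policy, None)
```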

How retries work

When a run finishes with crashed or errored status and a retry policy is configured:

  1. Tower records the current attempt (its status, timing, and exit code) in the run's attempt history.
  2. The run transitions to retrying status and waits for the configured delay.
  3. After the delay, Tower dispatches the run again as a new attempt on the same run record.
  4. This repeats until the run succeeds (exited) or the attempt limit is exhausted.

A run in retrying status is considered active: it appears in active run counts and can be cancelled.
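
The steps above can be sketched as a small simulation that walks a run through a scripted sequence of attempt outcomes. It is illustrative only: `simulate_run` is a hypothetical function, delays are omitted, and the assumption that an exhausted run keeps the status of its last attempt is not confirmed by this page.

```python
RETRYABLE = {"crashed", "errored"}


def simulate_run(outcomes, max_retries):
    """Replay attempt outcomes and return (run_status, attempt_history).

    outcomes: the final status of each attempt, in order.
    """
    attempts = []
    status = "pending"
    for seq, outcome in enumerate(outcomes, start=1):
        # Step 1: record the attempt in the run's history.
        attempts.append({"seq": seq, "status": outcome})
        retries_used = seq - 1
        if outcome in RETRYABLE and retries_used < max_retries:
            # Steps 2 and 3: wait in "retrying", then dispatch a new attempt.
            status = "retrying"
            continue
        # Step 4: success, a non-retryable status, or no retries left.
        status = outcome  # assumption: the last attempt's status becomes final
        break
    return status, attempts


recovered = simulate_run(["crashed", "errored", "exited"], max_retries=3)
exhausted = simulate_run(["crashed"] * 5, max_retries=2)
```

In the second call only three attempts are recorded: the initial attempt plus two retries.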

What is and is not retried

| Final status | Retried? |
| --- | --- |
| crashed | Yes |
| errored | Yes |
| exited | No |
| cancelled | No |

Tower also skips retries if the app is disabled at the time of the retry check.
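
Taken together, these rules amount to a single yes/no decision at the retry check. The sketch below is one way to express it; `should_retry` and its parameters are hypothetical names, and the order of the checks is an assumption.

```python
RETRYABLE_STATUSES = {"crashed", "errored"}


def should_retry(final_status, retries_used, max_retries, app_disabled):
    """Decide whether another attempt would be dispatched."""
    if app_disabled:
        # Retries are skipped when the app is disabled at check time.
        return False
    if final_status not in RETRYABLE_STATUSES:
        # exited and cancelled runs are never retried.
        return False
    return retries_used < max_retries
```

For example, `should_retry("crashed", 0, 3, app_disabled=False)` returns True, while the same call with `app_disabled=True` returns False.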

Viewing attempt history

Each attempt within a run is recorded separately. The run's num_attempts field tells you how many attempts have been made. The full history is available in the run detail response under the attempts array (it is omitted from list responses):

{
  "run_id": "abc123",
  "status": "retrying",
  "num_attempts": 2,
  "retry_policy": {
    "max_retries": 3,
    "retry_delay_seconds": 60,
    "use_exponential_backoff": false
  },
  "attempts": [
    {
      "seq": 1,
      "status": "crashed",
      "started_at": "2024-11-20T08:00:00Z",
      "ended_at": "2024-11-20T08:01:05Z",
      "exit_code": 1
    }
  ]
}

The seq field is 1-based: seq: 1 is the initial attempt, seq: 2 is the first retry, and so on.
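
A run detail response can be turned into a readable summary with a few lines of Python. This sketch parses a trimmed copy of the sample response above; `summarize_attempts` and `parse_timestamp` are hypothetical helpers, and only the fields shown in the sample are used.

```python
import json
from datetime import datetime

RUN_DETAIL_JSON = """
{
  "run_id": "abc123",
  "status": "retrying",
  "num_attempts": 2,
  "attempts": [
    {
      "seq": 1,
      "status": "crashed",
      "started_at": "2024-11-20T08:00:00Z",
      "ended_at": "2024-11-20T08:01:05Z",
      "exit_code": 1
    }
  ]
}
"""


def parse_timestamp(value):
    # Older Python versions reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_attempts(run):
    """Describe each attempt, translating seq into 'initial attempt' or 'retry N'."""
    summaries = []
    # List responses omit "attempts", so fall back to an empty list.
    for attempt in run.get("attempts", []):
        seq = attempt["seq"]
        label = "initial attempt" if seq == 1 else f"retry {seq - 1}"
        elapsed = parse_timestamp(attempt["ended_at"]) - parse_timestamp(attempt["started_at"])
        summaries.append(
            f"{label}: {attempt['status']}, exit code {attempt['exit_code']}, "
            f"ran {int(elapsed.total_seconds())}s"
        )
    return summaries


summaries = summarize_attempts(json.loads(RUN_DETAIL_JSON))
```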

Log lines and attempts

Each log line includes an attempt_seq field indicating which attempt produced it. This lets you filter or correlate logs with a specific retry attempt when inspecting run output.
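
Filtering by attempt_seq is a one-line comprehension once the log lines are loaded. In the sketch below only the attempt_seq field comes from this page; the `message` field and the sample log content are made up for illustration.

```python
def logs_for_attempt(log_lines, attempt_seq):
    """Keep only the log lines produced by the given attempt."""
    return [line for line in log_lines if line.get("attempt_seq") == attempt_seq]


# Hypothetical log lines; the real response shape may differ.
sample_logs = [
    {"attempt_seq": 1, "message": "starting job"},
    {"attempt_seq": 1, "message": "connection reset"},
    {"attempt_seq": 2, "message": "starting job"},
]

first_retry_logs = logs_for_attempt(sample_logs, attempt_seq=2)
```

Here attempt_seq=2 selects the first retry, matching the 1-based seq numbering of the attempt history.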

Cancelling a retrying run

A run that is waiting to retry (status retrying) can be cancelled just like a pending or running run. Cancellation stops the retry cycle and marks the run as cancelled.

Removing a retry policy

To disable retries on an app, set max_retries to 0:

curl -X PATCH https://api.tower.dev/v1/apps/my-app \
-H "Authorization: Bearer <MY_API_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"retry_policy": {"max_retries": 0, "retry_delay_seconds": 30, "use_exponential_backoff": false}}'