Flow Control lets you limit the number of messages sent to your endpoint by delaying their delivery. There are two limits you can set:
  • Rate: You can specify a maximum number of calls that can be made to your endpoint within a certain time period.
  • Parallelism: You can set a limit on the number of concurrent calls to your endpoint.
You can use either limit on its own or combine them for more control over the flow of messages to your endpoint. If a limit is exceeded, additional messages are added to a wait list and delivered once the time period has passed (for the rate limit) or the number of active calls drops below the limit (for the parallelism limit).
1. Choose Flow Control Key

To use flow control, you first need to choose a key. This key is used to count the number of calls made to your endpoint. The limits are applied per flow-control key, not per URL. This means that you can use the same key for different URLs to apply the same limits to them.
There is no limit on the number of keys you can use.
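Because limits are counted per key, several URLs that publish with the same key are throttled together. The following is a minimal sketch of that idea, not part of the QStash SDK: the `FlowControl` and `PublishRequest` types and the `withSharedFlowControl` helper are hypothetical names that only mirror the `flowControl` option shape used on this page, and the URLs and key are placeholders.

```typescript
// Hypothetical local types mirroring the flowControl options shown on this page.
type FlowControl = { key: string; rate?: number; period?: string; parallelism?: number };
type PublishRequest = { url: string; body: unknown; flowControl: FlowControl };

// Attach the same flow-control settings to every URL so all of them
// are counted against one key.
function withSharedFlowControl(
  urls: string[],
  body: unknown,
  flowControl: FlowControl,
): PublishRequest[] {
  return urls.map((url) => ({ url, body, flowControl }));
}

const requests = withSharedFlowControl(
  ["https://example.com/orders", "https://example.com/invoices"], // placeholder URLs
  { hello: "world" },
  { key: "billing-service", parallelism: 5 }, // placeholder key
);

console.log(requests.map((r) => `${r.url} -> ${r.flowControl.key}`));
```

Each element of `requests` has the shape that is passed to `client.publishJSON` in the examples on this page.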
2. Decide Limits

Decide which limits you want to apply. You can choose to apply only a rate limit, only a parallelism limit, or both. For instance, if you want to limit the number of calls to 10 per minute, you can set the rate to 10 and the period to 1 minute. If you want to limit the number of concurrent calls to 5, you can set the parallelism limit to 5.
3. Send a Message

import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

await client.publishJSON({
    url: "https://example.com",
    body: { hello: "world" },
    flowControl: { key: "USER_GIVEN_KEY", parallelism: 5, rate: 10, period: "1m" },
});
🎉 That’s it! From now on, QStash will enforce these limits by counting the number of messages associated with this flow-control key.

Rate and Period Parameters

The rate parameter specifies the maximum number of calls allowed within a given period. The period parameter allows you to specify the time window over which the rate limit is enforced. By default, the period is set to 1 second, but you can adjust it to control how frequently calls are allowed. For example, you can set a rate of 10 calls per minute as follows:
import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

await client.publishJSON({
    url: "https://example.com",
    body: { hello: "world" },
    flowControl: { key: "USER_GIVEN_KEY", rate: 10, period: "1m" },
});
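To get a feel for what a rate limit means for a backlog, the sketch below estimates how many periods a batch of messages needs before the last call starts. It is plain arithmetic under stated assumptions, not QStash's actual scheduler: it assumes the rate is enforced in fixed windows, that no parallelism limit applies, and that the wait list starts empty. `parsePeriodSeconds` and `periodsToDeliver` are hypothetical helpers, and only the `"1m"` period format appears on this page; the `s`, `h`, and `d` suffixes are assumed.

```typescript
// Convert a period string such as "1m" into seconds.
// Only "1m" appears on this page; the s/h/d suffixes are an assumption.
function parsePeriodSeconds(period: string): number {
  const match = /^(\d+)([smhd])$/.exec(period);
  if (!match) throw new Error(`Unsupported period: ${period}`);
  const unitSeconds: Record<string, number> = { s: 1, m: 60, h: 3600, d: 86400 };
  return Number(match[1]) * unitSeconds[match[2]];
}

// Number of rate periods needed to start `messages` calls at `rate` calls per period.
function periodsToDeliver(messages: number, rate: number): number {
  return Math.ceil(messages / rate);
}

const periods = periodsToDeliver(35, 10);        // 35 messages at 10 per period
const periodSeconds = parsePeriodSeconds("1m");  // length of one period in seconds
// The last call starts at the beginning of the final period,
// that is, after (periods - 1) full periods have elapsed.
const lastStartAfterSeconds = (periods - 1) * periodSeconds;

console.log({ periods, periodSeconds, lastStartAfterSeconds });
```

With 35 messages and a rate of 10 per minute, the batch spans 4 periods, so the last call starts about 180 seconds after the first under these assumptions.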

Parallelism Limit

The parallelism limit is the maximum number of calls that can be active at the same time. A call is active from the moment it is made to your endpoint until its response is received. You can allow up to 10 active calls at the same time as follows:
import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

await client.publishJSON({
    url: "https://example.com",
    body: { hello: "world" },
    flowControl: { key: "USER_GIVEN_KEY", parallelism: 10 },
});
You can also use the REST API to find out how many messages are waiting because of the parallelism limit. See the API documentation for more details.
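A parallelism limit also caps throughput indirectly: with at most P calls in flight and each call taking about D seconds, roughly P / D calls can complete per second. The sketch below is a back-of-the-envelope estimate that assumes a constant response time; `maxThroughputPerSecond` is a hypothetical helper, not part of the QStash SDK.

```typescript
// Upper bound on completed calls per second for a given parallelism limit,
// assuming every call takes the same time to respond.
function maxThroughputPerSecond(parallelism: number, avgResponseSeconds: number): number {
  if (parallelism <= 0 || avgResponseSeconds <= 0) {
    throw new Error("parallelism and avgResponseSeconds must be positive");
  }
  return parallelism / avgResponseSeconds;
}

// 10 concurrent calls, each taking about 2 seconds.
const throughput = maxThroughputPerSecond(10, 2);
console.log(throughput); // 5
```

If the endpoint needs to handle more than this estimate, messages will accumulate in the wait list until the parallelism limit is raised or the endpoint responds faster.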

Rate, Parallelism, and Period Together

All three parameters can be combined. For example, with a rate of 10, a period of 1 minute, and a parallelism of 20, QStash will trigger 10 calls in the first minute and another 10 in the next. If none of those 20 calls has finished by then, the parallelism limit is reached, and QStash will wait until one completes before triggering another.
import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

await client.publishJSON({
    url: "https://example.com",
    body: { hello: "world" },
    flowControl: { key: "USER_GIVEN_KEY", rate: 10, parallelism: 20, period: "1m" },
});
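The interaction between the two limits can be illustrated with a small simulation. This is a simplified model, not QStash's actual scheduler: it assumes fixed rate windows, checks for free slots only at the start of each period, and treats every call as staying active for a whole number of periods. The `simulate` function and its option names are hypothetical.

```typescript
type SimulationOptions = {
  rate: number;        // calls allowed to start per period
  parallelism: number; // maximum calls active at once
  queued: number;      // messages waiting at the start
  callPeriods: number; // how many periods each call stays active
  periods: number;     // how many periods to simulate
};

// Returns the number of calls started in each period.
function simulate(opts: SimulationOptions): number[] {
  const started: number[] = [];
  let finishAt: number[] = []; // period index at which each active call completes
  let queued = opts.queued;

  for (let t = 0; t < opts.periods; t++) {
    // Keep only the calls that are still active in this period.
    finishAt = finishAt.filter((finish) => finish > t);
    const freeSlots = opts.parallelism - finishAt.length;
    // Start as many calls as all three constraints allow.
    const n = Math.max(0, Math.min(opts.rate, freeSlots, queued));
    for (let i = 0; i < n; i++) finishAt.push(t + opts.callPeriods);
    queued -= n;
    started.push(n);
  }
  return started;
}

// 30 queued messages, rate 10 per period, parallelism 20, each call lasts 3 periods.
const startedPerPeriod = simulate({
  rate: 10,
  parallelism: 20,
  queued: 30,
  callPeriods: 3,
  periods: 5,
});
console.log(startedPerPeriod); // [10, 10, 0, 10, 0]
```

In the third period no call starts even though the rate limit would allow 10, because all 20 parallelism slots are still occupied. The remaining 10 messages start in the fourth period, once the first batch has completed.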

Management API

You can inspect flow control keys programmatically using the flowControl namespace on the client.

Get a single flow control key

Returns the current state and metrics for one flow control key.
import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

const info = await client.flowControl.get("USER_GIVEN_KEY");
console.log(info);
// {
//   flowControlKey: "USER_GIVEN_KEY",
//   waitListSize: 5,
//   parallelismMax: 10,
//   parallelismCount: 3,
//   rateMax: 100,
//   rateCount: 42,
//   ratePeriod: 60,
//   ratePeriodStart: 1708000000
// }
The response fields are:
| Field | Description |
| --- | --- |
| flowControlKey | The flow control key name |
| waitListSize | Number of messages currently waiting in the queue |
| parallelismMax | Configured maximum concurrent messages (if set) |
| parallelismCount | Number of messages currently running in parallel |
| rateMax | Configured maximum messages per rate period (if set) |
| rateCount | Number of messages dispatched in the current rate period |
| ratePeriod | Rate period length in seconds |
| ratePeriodStart | Unix timestamp when the current rate period started |
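These fields can be combined to answer practical questions, such as how much of the rate budget is left and when the current period ends. The sketch below works on a plain object with the documented shape, so it runs without calling the API. The `FlowControlInfo` type and the `summarize` helper are hypothetical, and the code assumes `ratePeriodStart` is expressed in seconds, as the sample value suggests.

```typescript
// Local type mirroring the documented response fields.
type FlowControlInfo = {
  flowControlKey: string;
  waitListSize: number;
  parallelismMax?: number; // present only if a parallelism limit is set
  parallelismCount: number;
  rateMax?: number;        // present only if a rate limit is set
  rateCount: number;
  ratePeriod: number;      // seconds
  ratePeriodStart: number; // Unix timestamp, assumed to be in seconds
};

function summarize(info: FlowControlInfo, nowSeconds: number) {
  return {
    // Calls that may still start in the current rate period.
    rateRemaining:
      info.rateMax === undefined ? undefined : Math.max(0, info.rateMax - info.rateCount),
    // Concurrent slots that are currently unused.
    freeSlots:
      info.parallelismMax === undefined
        ? undefined
        : Math.max(0, info.parallelismMax - info.parallelismCount),
    // Time left until the current rate period ends.
    secondsUntilNextPeriod: Math.max(0, info.ratePeriodStart + info.ratePeriod - nowSeconds),
    hasBacklog: info.waitListSize > 0,
  };
}

// Sample values taken from the example response above.
const summary = summarize(
  {
    flowControlKey: "USER_GIVEN_KEY",
    waitListSize: 5,
    parallelismMax: 10,
    parallelismCount: 3,
    rateMax: 100,
    rateCount: 42,
    ratePeriod: 60,
    ratePeriodStart: 1708000000,
  },
  1708000020, // 20 seconds into the period
);
console.log(summary);
```

In real use, the first argument would be the object returned by `client.flowControl.get(...)` and the second would be the current time, for example `Math.floor(Date.now() / 1000)`.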

Get global parallelism

Returns the global parallelism usage across all flow control keys.
import { Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

const info = await client.flowControl.getGlobalParallelism();
console.log(info);
// {
//   parallelismMax: 500,
//   parallelismCount: 42
// }
| Field | Description |
| --- | --- |
| parallelismMax | The maximum global parallelism |
| parallelismCount | The current number of active requests globally |
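One simple use of these two numbers is to compute how much of the global parallelism budget is in use. The sketch below is a hypothetical helper operating on a plain object with the documented shape; the 80% warning threshold is an arbitrary value chosen for illustration.

```typescript
// Local type mirroring the documented response fields.
type GlobalParallelism = { parallelismMax: number; parallelismCount: number };

// Fraction of the global parallelism budget currently in use (0 to 1).
function utilization(info: GlobalParallelism): number {
  return info.parallelismMax > 0 ? info.parallelismCount / info.parallelismMax : 0;
}

// Sample values taken from the example response above.
const usage = utilization({ parallelismMax: 500, parallelismCount: 42 });
const nearLimit = usage >= 0.8; // arbitrary illustrative threshold

console.log(`${(usage * 100).toFixed(1)}% in use, near limit: ${nearLimit}`);
```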

Monitor

You can monitor the wait list size of your flow control keys from the FlowControl tab in the console.
You can also get the same information using the REST API.