Rate Limits

Our default rate limit in Production is 100 requests per second; in the Sandbox environment it is 25 requests per second. This limit applies cumulatively across all API keys and all endpoints within your account. If you exceed the limit, the API returns a 429 status code.

You may encounter rate limits if you are:

  • Parallelizing requests across services,

  • Executing large batches of API calls simultaneously,

  • Handling a high volume of inbound webhooks that trigger API actions.

We advise designing your processes with the rate limit in mind as an upper bound across your services. Please review the recommendations below to develop a resilient integration with Senmo.

Handling Parallelized API Requests

You may exceed the rate limit if you are executing large, parallelized batches of API requests to Senmo or if multiple components of your application or internal system are making concurrent API calls. Consider the following guidelines to manage these situations effectively.

Custom retry logic

As a best practice, implement retry logic for your API requests. If you receive a 429 Too Many Requests status code, retry the request using jittered exponential backoff. This proven approach combines well with the strategies below. Additionally, include an Idempotency Key with each request so that retried requests are never duplicated.
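The retry pattern above can be sketched as follows. This is a minimal illustration, not the official Senmo client: `send` is a hypothetical callable standing in for whatever function issues one API request in your integration, and the delay constants are examples.

```python
import random
import time
import uuid

def request_with_backoff(send, max_retries=5, base_delay=0.5, cap=8.0):
    """Retry `send` on 429 responses using full-jitter exponential backoff.

    `send` is any callable that performs one API request (passing along the
    given Idempotency-Key) and returns an object with a `status_code`
    attribute. Adapt this to your actual HTTP client.
    """
    # Reuse one Idempotency Key across retries so the operation is never duplicated.
    idempotency_key = str(uuid.uuid4())
    for attempt in range(max_retries + 1):
        response = send(idempotency_key)
        if response.status_code != 429:
            return response
        # Full jitter: sleep a random duration in [0, min(cap, base * 2^attempt)].
        time.sleep(random.uniform(0, min(cap, base_delay * 2 ** attempt)))
    return response  # still 429 after all retries; surface it to the caller
```

The jitter spreads retries from concurrent workers across the window, which avoids the "thundering herd" of all clients retrying at the same instant.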

Staggering API calls

You may choose to interact with the API in batches, either to backfill data or at scheduled intervals. In such cases, you can implement static throttling by executing requests with delayed batching logic, separating each batch of up to 100 calls with a delay (e.g., using sleep(), setTimeout(), etc.). For a safety margin, reduce the batch size: for example, introduce a 1-second delay after every 80 requests.
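A minimal sketch of this static throttling, assuming `calls` is a list of zero-argument callables that each perform one request (substitute your real Senmo client calls):

```python
import time

def run_in_batches(calls, batch_size=80, delay=1.0):
    """Execute API calls in fixed-size batches with a pause between batches.

    batch_size=80 mirrors the example above: staying below the 100 req/s
    limit leaves headroom for other processes sharing the account.
    """
    results = []
    for start in range(0, len(calls), batch_size):
        for call in calls[start:start + batch_size]:
            results.append(call())
        if start + batch_size < len(calls):
            time.sleep(delay)  # pause before the next batch begins
    return results
```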

If you are running other processes concurrently while the batch is executing, take those parallel API calls into account. To assist with this, the API responses include the X-Rate-Limit-Limit and X-Rate-Limit-Remaining headers, which provide information on your current rate limit status. Use these headers to set up a dynamic throttling mechanism, potentially pausing API jobs when X-Rate-Limit-Remaining reaches zero.
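Dynamic throttling based on these headers might look like the sketch below, where `send` is a hypothetical callable returning a response whose `headers` mapping includes the rate-limit headers:

```python
import time

def throttled_call(send):
    """Issue one request, then pause if the remaining quota is exhausted.

    When X-Rate-Limit-Remaining reaches zero, sleeping for one second lets
    the per-second window replenish before the next job runs.
    """
    response = send()
    remaining = int(response.headers.get("X-Rate-Limit-Remaining", "1"))
    if remaining == 0:
        time.sleep(1.0)  # wait out the current rate-limit window
    return response
```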

Locking

Another approach is to implement an atomic thread counter to create a locking mechanism that prevents your internal system from exceeding a defined thread-per-second threshold. This can be used alongside the staggering API calls method or independently.

Within each one-second interval, each new thread atomically increments the counter. Start a new thread (or API call) only if the counter is below the threshold (e.g., 80). Once the counter reaches the limit, no new threads may start until the counter resets; retry or queue the blocked work for later execution.
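The counter-based gate described above can be sketched with a lock-protected counter (a stand-in for an atomic counter; `RateGate` and its `threshold` default are illustrative, not part of any Senmo SDK):

```python
import threading
import time

class RateGate:
    """Cap how many calls may start within each one-second interval."""

    def __init__(self, threshold=80):
        self.threshold = threshold
        self._lock = threading.Lock()
        self._count = 0
        self._window_start = time.monotonic()

    def try_acquire(self):
        """Return True if a new call may start in the current window."""
        with self._lock:
            now = time.monotonic()
            if now - self._window_start >= 1.0:
                # A new one-second interval has begun: reset the counter.
                self._window_start = now
                self._count = 0
            if self._count < self.threshold:
                self._count += 1
                return True
            return False  # over the threshold; caller should retry or queue
```

Callers check `try_acquire()` before dispatching a request; work refused in the current window goes back onto a retry queue, which pairs naturally with the queueing approach below.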

Queueing

A robust solution is to route your system events into a queue for more controlled processing and enhanced auditing. You can use a queuing service such as Amazon SQS or Apache Kafka to handle these events. Process the queued messages in batches at your preferred polling interval, and then send the API requests to Senmo accordingly.
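As an illustration of the polling loop, the sketch below drains a bounded batch from an in-memory `queue.Queue`; in production the queue would be a real broker consumer (e.g., an SQS receive call or a Kafka poll), and `drain_batch` is a hypothetical helper name:

```python
import queue

def drain_batch(q, max_batch=80):
    """Pull up to `max_batch` queued events without blocking.

    At each polling interval, process one bounded batch and issue the
    corresponding Senmo API requests; the bound keeps each interval's
    request count under the rate limit.
    """
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # queue drained; return what we have
    return batch
```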
