How to handle contention at scale?

We’re building a system where an offer can be redeemed until a global limit is reached.
For example, an offer may cap the total redeemable amount (e.g., 10M) and/or allow each user to redeem only once.

Around 1000 requests/second may try to redeem the same offer.

On every /redeem call, we need to ensure:

redeemed_amount + request_amount <= max_redeemable

The key requirement is: do not exceed the global limit, even under very high concurrency.
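To make the invariant concrete, here is a minimal sketch of the check as a pure function. The function name `can_redeem` and the sample numbers are illustrative assumptions, not part of our system. The check is trivial on its own; the hard part is making the check and the following update atomic when many requests arrive at once.

```python
def can_redeem(redeemed_amount: int, request_amount: int, max_redeemable: int) -> bool:
    """Return True if accepting the request keeps the total within the limit."""
    return redeemed_amount + request_amount <= max_redeemable


# Hypothetical numbers: a 10M limit with 9,999,995 already redeemed.
print(can_redeem(9_999_995, 5, 10_000_000))   # True: lands exactly on the limit
print(can_redeem(9_999_995, 10, 10_000_000))  # False: would exceed the limit
```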

Approach 1 — Distributed lock

We serialize all redemptions for a specific offer ID:

acquire lock(offer_id)
if redeemed + amount > limit:
    release lock
    return failure

redeemed += amount
update DB
release lock
return success

The problem:
With ~1000 concurrent redemption requests for the same offer, this becomes a hot lock and introduces a lot of contention and latency.
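As a rough illustration of Approach 1, the sketch below models the per-offer lock with an in-process `threading.Lock`. This is only a stand-in: a real deployment would use a distributed lock service, and the `LockedOffer` class and `run_demo` helper are hypothetical names invented for this example. It shows that the limit holds, and also why every request for the same offer ends up queued behind one lock.

```python
import threading


class LockedOffer:
    """In-memory stand-in for one offer guarded by a per-offer lock."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.redeemed = 0
        self._lock = threading.Lock()  # assumption: stands in for the distributed lock

    def redeem(self, amount: int) -> bool:
        with self._lock:  # acquired here, released on every exit path
            if self.redeemed + amount > self.limit:
                return False
            self.redeemed += amount  # the "update DB" step would happen here
            return True


def run_demo(limit: int = 1000, workers: int = 50, per_worker: int = 10, amount: int = 10):
    """Fire workers * per_worker redemptions at one offer; return (redeemed, successes)."""
    offer = LockedOffer(limit)
    results = []  # list.append is safe to call from multiple threads in CPython

    def worker() -> None:
        for _ in range(per_worker):
            results.append(offer.redeem(amount))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return offer.redeemed, sum(results)
```

Calling `run_demo()` attempts 500 redemptions of 10 against a limit of 1000, so exactly 100 succeed and the total stops at the limit. Correctness is fine; the cost is that all 500 calls run strictly one after another.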

Approach 2 — Conditional update on the database

Something like this (pseudo-CQL):

UPDATE offers
SET redeemed = redeemed + 10
WHERE offer_id = ?
IF redeemed + 10 <= limit

We use Cassandra, so this requires a lightweight transaction (LWT).
Under heavy contention on a single partition, LWTs become slow, since each one runs a Paxos round, and this leads to high latency and timeouts.
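As far as I know, CQL does not accept arithmetic in an `IF` condition, and `SET redeemed = redeemed + 10` only works on counter columns, which cannot be combined with `IF`. In practice the conditional update therefore becomes a read followed by a compare-and-set, roughly `UPDATE offers SET redeemed = ? WHERE offer_id = ? IF redeemed = ?`, retried when the condition fails. The sketch below simulates that loop against an in-memory store. `CasStore`, `redeem`, and `max_retries` are hypothetical names for illustration; no real driver API is used.

```python
import threading


class CasStore:
    """In-memory stand-in for one row that supports compare-and-set."""

    def __init__(self) -> None:
        self._value = 0
        self._mutex = threading.Lock()  # models the atomicity the database would provide

    def read(self) -> int:
        with self._mutex:
            return self._value

    def compare_and_set(self, expected: int, new: int) -> bool:
        """Write `new` only if the current value still equals `expected`."""
        with self._mutex:
            if self._value != expected:
                return False  # another writer got there first
            self._value = new
            return True


def redeem(store: CasStore, amount: int, limit: int, max_retries: int = 100) -> bool:
    """Read, check the limit, then compare-and-set; retry when the CAS loses a race."""
    for _ in range(max_retries):
        current = store.read()
        if current + amount > limit:
            return False  # limit reached, no point retrying
        if store.compare_and_set(current, current + amount):
            return True
    return False  # gave up after too many lost races
```

Each lost race costs another read plus another conditional write, so with many writers on the same row most of the work goes into retries. That is the same hot spot as the lock, just expressed differently.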

Both approaches suffer from the same issue: high contention on a single shared counter.

What are the recommended ways to solve this kind of high-contention quota enforcement?
Are there standard patterns for handling counters or limits at scale without locking everything or relying heavily on transactions?