Rolling time window counters with Redis and mitigating botnet-driven login attacks

This blog post presents rolling time window counting and rate limiting in Redis. You can apply it to activate a login CAPTCHA on your site only when it is needed. For the syntax-highlighted Python source code, please see the original blog post.

1. About Redis


Redis is a key-value store and persistent cache. Besides normal get/set functionality, it offers more complex data structures like lists, hashes and sorted sets. If you are familiar with memcached, think of Redis as memcached on steroids.

Redis is often used for rate limiting. Usually rate limiting recipes count how many times something happens within a certain second or a certain minute. When the clock ticks to the next minute, the rate limit counter is reset back to zero. This can be problematic if you want to limit rates where the number of hits per time window is very low. If you want to limit to five hits per minute, you might get just one hit in one time window and six in the next, tripping the limit even though the average over the two minutes is 3.5.
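A related effect of the reset can be seen in a small pure-Python sketch (no Redis involved). The hit timestamps and function names below are made up for illustration: six hits straddle a minute boundary, so each fixed window sees only three, while a rolling window sees all six.

```python
from collections import Counter


def fixed_window_counts(hit_times, window=60):
    """Count hits per fixed window; the counter resets at each boundary."""
    return Counter(int(t // window) for t in hit_times)


def rolling_window_count(hit_times, now, window=60):
    """Count hits within the last `window` seconds ending at `now`."""
    return sum(1 for t in hit_times if now - window < t <= now)


# Hypothetical burst: six hits around the minute boundary at t=60
hits = [55, 56, 57, 61, 62, 63]

fixed = fixed_window_counts(hits)             # three hits in each minute
rolling = rolling_window_count(hits, now=63)  # all six hits are counted
```

With a limit of five hits per minute, neither fixed window would trip the limit here, but the rolling count would.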

This post presents a Python example of rolling time window based counting, where the counter never resets back to zero but always counts hits over the past X seconds. This is achieved using Redis sorted sets.

2. rollingwindow.py:

If you know any better way to do this with Redis – please let me know – I am no expert here. This is the first implementation I figured out.

"""

    Redis rolling time window counter and rate limit.

    Use Redis sorted sets to implement rolling time window counters and limiters.

    http://redis.io/commands/zadd

"""

import time


def check(redis, key, window=60, limit=50):
    """ Do a rolling time window counter hit.

    :param redis: Redis client

    :param key: Redis key name we use to keep counter

    :param window: Rolling time window in seconds

    :param limit: Allowed operations per time window

    :return: True if the limit has been exceeded for the current time window
    """

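    # Expire old hits that have fallen out of the window
    now = time.time()
    expires = now - window
    redis.zremrangebyscore(key, '-inf', expires)

    # Add a hit at this very moment; redis-py 3.0+ takes a
    # {member: score} mapping instead of positional arguments
    redis.zadd(key, {str(now): now})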

    # If we currently have more keys than limit,
    # then limit the action
    if redis.zcard(key) > limit:
        return True

    return False


def get(redis, key):
    """ Get the current hits per rolling time window.

    :param redis: Redis client

    :param key: Redis key name we use to keep counter

    :return: int, how many hits we have within the current rolling time window
    """
    return redis.zcard(key)
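To try the counter logic without a running Redis server, here is a minimal sketch with an in-memory stand-in. The `InMemorySortedSets` class is hypothetical and implements only the three calls used above; `check` is a copy of the function above with an extra `now` parameter, added here as an assumption so that the clock can be controlled.

```python
import time


class InMemorySortedSets:
    """Hypothetical stand-in implementing only the three calls used here."""

    def __init__(self):
        self._sets = {}

    def zadd(self, key, mapping):
        # mapping is {member: score}, as in redis-py 3.0+
        self._sets.setdefault(key, {}).update(mapping)

    def zremrangebyscore(self, key, min_score, max_score):
        members = self._sets.get(key, {})
        low = float(min_score)   # float() also parses the '-inf' string
        high = float(max_score)
        doomed = [m for m, s in members.items() if low <= s <= high]
        for m in doomed:
            del members[m]
        return len(doomed)

    def zcard(self, key):
        return len(self._sets.get(key, {}))


def check(client, key, window=60, limit=50, now=None):
    """Same steps as above, with an injectable clock for testing."""
    now = time.time() if now is None else now
    client.zremrangebyscore(key, '-inf', now - window)
    client.zadd(key, {str(now): now})
    return client.zcard(key) > limit


client = InMemorySortedSets()
# Three hits within a minute, then one hit much later
results = [check(client, "demo", window=60, limit=2, now=t)
           for t in (0, 10, 20, 100)]
```

Note that the timestamp is used as the set member, so two hits with exactly the same timestamp would count as one.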

3. Problematic CAPTCHAs

We all hate CAPTCHAs. They are a double-edged sword. On one hand, you need to keep bots out of your site. On the other, CAPTCHAs are a turn-off for your site visitors and they drive away potential users.

Even though the most popular CAPTCHA-as-a-service, Google’s reCAPTCHA, has made substantial progress in making CAPTCHAs easy for real visitors and hard for bots, CAPTCHAs still present a usability problem. Also, in the case of reCAPTCHA, the JavaScript and image assets are loaded from Google front end services, which tend to get blocked in China, disabling your site for Chinese visitors.

4. CAPTCHAs and different login situations

There are three cases where you want the user to complete a CAPTCHA for login:

  • Somebody is bruteforcing a single username (targeted attack): you need to count logins per username and not let the login proceed if this user is getting too many login attempts.
  • Somebody is going through username/password combinations from a single IP: you count logins per IP.
  • Somebody is going through username/password combinations and the attack comes from a very large IP pool. Usually these are botnet-driven attacks and the attacker can easily have tens of thousands of IP addresses to burn.
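The three cases map naturally to three counters, each kept under its own Redis key. A minimal sketch of one possible key layout follows; the function name and the key naming scheme are hypothetical, and each key could then be passed to a rolling window counter with its own limit.

```python
def login_counter_keys(username, ip):
    """Return one counter key per scope: username, IP and site-wide.

    The key naming scheme is hypothetical.
    """
    return {
        # Targeted attack on one account
        "username": "login_attempts:user:" + username.lower(),
        # Many attempts from one address
        "ip": "login_attempts:ip:" + ip,
        # Botnet-driven attack, visible only in the site-wide rate
        "global": "login_attempts",
    }


keys = login_counter_keys("Alice", "203.0.113.7")
```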

The botnet-driven login attack is tricky to block. There might be only one login attempt from each IP. The only way to effectively stop the attack is to present a pre-login CAPTCHA, i.e. the user needs to solve the CAPTCHA even before the login can be attempted. However, a pre-login CAPTCHA is very annoying usability-wise: it prevents you from using the browser password manager for quick logins and sometimes gives you two minutes of extra headache before you get in to your favorite site.

Even services like CloudFlare do not help you here. Because there is only one request per IP, they cannot know beforehand if the request is going to be legitimate or not (though they surely have some global heuristics and IP blacklists). You can flip on the “challenge” mode for your site, so that every visitor must complete a CAPTCHA before they can access your site, but this is a usability letdown again.

5. Mitigating botnet-driven login attack with on-situation CAPTCHA

You can have the best of both worlds: no login CAPTCHA in normal operation while still mitigating botnet-driven login attacks. This can be done by:

  • Monitoring your site login rate
  • Not showing a pre-login CAPTCHA in a normal situation
  • Enabling the pre-login CAPTCHA for a certain time when there is clearly an abnormal login rate, which means there might be an attack going on

Below is a pseudo-Python example of how this can be achieved using the rollingwindow Python module from above.

6. captchamode.py

from redis_cache import get_redis_connection

import rollingwindow


#: Redis sorted set key counting login attempts
REDIS_LOGIN_ATTEMPTS_COUNTER = "login_attempts"

#: Key telling that CAPTCHA mode has been activated due to
#: a high login attempt rate
REDIS_CAPTCHA_ACTIVATED = "captcha_activated"

#: Captcha mode expires in 120 minutes (attack cooldown)
CAPTCHA_TIMEOUT = 120 * 60

#: Site-wide login attempts per minute above which the
#: pre-login CAPTCHA is activated. Disabled in unit tests.
LOGIN_ATTEMPTS_CHALLENGE_THRESHOLD = 500  # per minute


def clear():
    """ Resets the challenge system state, per system or per IP. """
    redis = get_redis_connection("redis")
    redis.delete(REDIS_CAPTCHA_ACTIVATED)
    redis.delete(REDIS_LOGIN_ATTEMPTS_COUNTER)


def get_login_rate():
    """
    :return: System global login rate per minute for metrics
    """
    redis = get_redis_connection("redis")
    return rollingwindow.get(redis, REDIS_LOGIN_ATTEMPTS_COUNTER)


def check_captcha_needed(redis):
    """ Check if we need to enable login CAPTCHA globally.

    Increase login page load/submit counter.

    :return: True if our threshold for login page loads per minute is exceeded
    """

    # Count a hit towards login rate
    threshold_exceeded = rollingwindow.check(redis, REDIS_LOGIN_ATTEMPTS_COUNTER, limit=LOGIN_ATTEMPTS_CHALLENGE_THRESHOLD)

    # Are we in attack mode
    if not redis.get(REDIS_CAPTCHA_ACTIVATED):

        if not threshold_exceeded:
            # No login rate threshold exceeded,
            # and currently CAPTCHA not activated ->
            # allow login without CAPTCHA
            return False

        # Login attempt threshold exceeded,
        # we might be under attack,
        # activate CAPTCHA mode
        # redis-py 3.0+ argument order: name, time, value
        redis.setex(REDIS_CAPTCHA_ACTIVATED, CAPTCHA_TIMEOUT, "true")

    return True


def login(request):

    redis = get_redis_connection("redis")

    # Pass the Redis connection, not the request
    if check_captcha_needed(redis):
        # ... We need to CAPTCHA before this login can proceed ..
        pass
    else:
        # ... Allow login to proceed without CAPTCHA ...
        pass
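The activate-and-cool-down behavior of check_captcha_needed() can be tried out without Redis using a small in-memory sketch. The `CaptchaMode` class below is hypothetical and works in a single process only; it mirrors the logic above with an injectable clock, assuming the same rule that the CAPTCHA stays on until the timeout passes.

```python
from collections import deque


class CaptchaMode:
    """In-memory sketch of the activate-and-cool-down logic."""

    def __init__(self, threshold, window=60, timeout=120 * 60):
        self.threshold = threshold
        self.window = window
        self.timeout = timeout
        self.hits = deque()        # timestamps of recent login attempts
        self.active_until = None   # when CAPTCHA mode expires, if active

    def check_captcha_needed(self, now):
        # Drop hits that fell out of the rolling window, then count this one
        while self.hits and self.hits[0] <= now - self.window:
            self.hits.popleft()
        self.hits.append(now)
        threshold_exceeded = len(self.hits) > self.threshold

        # An earlier activation holds until its timeout passes
        if self.active_until is not None and now < self.active_until:
            return True

        if threshold_exceeded:
            self.active_until = now + self.timeout
            return True
        return False


mode = CaptchaMode(threshold=3, window=60, timeout=600)
# A burst of four attempts, then a quiet attempt during the cooldown,
# then one after the cooldown has passed
decisions = [mode.check_captcha_needed(t) for t in (0, 1, 2, 3, 300, 700)]
```

The fourth attempt trips the threshold, the attempt at t=300 still gets the CAPTCHA because the cooldown is running, and the attempt at t=700 does not.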


4 thoughts on “Rolling time window counters with Redis and mitigating botnet-driven login attacks”

  1. In rollingwindow.py: I think you want to add the new hit *after* checking .card(), because otherwise you can end up a situation where you permanently starve yourself.

  2. We implemented something similar using Redis’s list object. I’d be interested to know the performance tradeoff as I assume a sorted set is more expensive.
