Session management at 100k MAU

Your session table is the first thing that breaks at real scale. Four things to fix before you get there.

· LoginWith team

At 100,000 monthly active users, your auth system is handling maybe 200-500 requests per second at peak. That’s still not a huge number — but a few specific pieces of the session infrastructure start to matter in ways they didn’t at 10k.

1. Session table growth

A typical session lasts 30 days. With 100k MAU, each with 1-3 active devices, plus legacy rows from lapsed users, your sessions table easily reaches 500k-1M rows. Left unattended, it grows indefinitely.

Index strategy:

  • Covering index on (user_id, expires_at, session_id) for “fetch active sessions for a user”
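  • Partial index covering only live sessions for hot-path lookups. Postgres rejects NOW() in an index predicate because the function is not immutable, so filter on a fixed cutoff timestamp that you rebuild periodically, or on a stable condition such as a not-revoked flag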
  • BRIN index on created_at if you do time-range queries

Cleanup:

  • Nightly sweep: DELETE FROM sessions WHERE expires_at < NOW() - INTERVAL '7 days'
  • Keep recently expired rows for a few days to support debugging (support can look up why a user got logged out)
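
A single DELETE over hundreds of thousands of rows can hold locks for a long time, so one option is to run the sweep in batches. The sketch below is a minimal illustration, not a drop-in job: it uses Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere, assumes a sessions table with session_id and expires_at columns, and stores timestamps as ISO-8601 UTC strings. The function name sweep_expired and the batch_size default are assumptions.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def sweep_expired(conn: sqlite3.Connection, now: datetime,
                  grace: timedelta = timedelta(days=7),
                  batch_size: int = 1000) -> int:
    """Delete sessions that expired more than `grace` ago, one batch at a time.

    Returns the total number of rows deleted.
    """
    # Assumption: expires_at holds ISO-8601 UTC strings, which sort
    # chronologically. In Postgres you would compare timestamptz directly.
    cutoff = (now - grace).isoformat(timespec="seconds")
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM sessions WHERE session_id IN ("
            "  SELECT session_id FROM sessions WHERE expires_at < ? LIMIT ?"
            ")",
            (cutoff, batch_size),
        )
        conn.commit()  # commit per batch so each transaction stays short
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Rows that expired within the grace window are left alone, which keeps them available for the support lookups mentioned above.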

2. Session lookup latency

At 500k rows in Postgres, a primary-key lookup is still sub-millisecond. The cost grows if:

  • Your WHERE clause isn’t indexed (rare if you index properly)
  • You’re ORDER-BY’ing without an index
  • You’re reading related tables in the same query

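Consider moving to Redis/Valkey once latency matters. Keyspace: session:{id} → JSON blob. Set a TTL on the key for automatic expiration. Look the session up once per request and attach it to the request context. Expect sub-millisecond P99 when the store sits on the same network as your app servers.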

The tradeoff: Redis sessions are harder to audit after the fact (“what sessions did this user have yesterday?”). Keep a mirror in Postgres if that matters for compliance.
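
The session:{id} keyspace with a TTL can be sketched roughly as follows. SessionStore and InMemoryClient are hypothetical names. The store only calls set(key, value, ex=seconds) and get(key), which matches the shape of the redis-py client, so a real client could be passed in instead of the in-memory fake used here to keep the example runnable without a server.

```python
import json
import time


class InMemoryClient:
    """Stand-in for a Redis/Valkey client: set(key, value, ex=...) and get(key)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ex=None):
        expires = self._clock() + ex if ex is not None else None
        self._data[key] = (value, expires)

    def get(self, key):
        value, expires = self._data.get(key, (None, None))
        if expires is not None and self._clock() >= expires:
            del self._data[key]  # emulate TTL expiry
            return None
        return value


class SessionStore:
    """Stores each session as a JSON blob under session:{id} with a TTL."""

    def __init__(self, client, ttl_seconds=30 * 24 * 3600):
        self._client = client
        self._ttl = ttl_seconds

    def save(self, session_id, session):
        self._client.set(f"session:{session_id}", json.dumps(session), ex=self._ttl)

    def load(self, session_id):
        raw = self._client.get(f"session:{session_id}")
        # json.loads accepts str or bytes, so a client returning bytes also works.
        return json.loads(raw) if raw is not None else None
```

Because expiry is handled by the key's TTL, there is no sweep job on this side. If you mirror sessions to Postgres for auditing, the mirror still needs its own cleanup.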

3. Multi-device session management

Users at scale expect to see “your active sessions” with device names and locations, and to terminate individual devices. This requires structured session metadata:

session_id, user_id, device_name, user_agent, ip_country,
last_active_at, created_at, expires_at

Populate device_name on login (parse user agent for something readable). Update last_active_at on every request (or every N minutes — per-request updates can become a write bottleneck).
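
One way to do the "every N minutes" update is to write only when the stored timestamp is stale. This is a minimal sketch: the Session dataclass mirrors the columns listed above, while the touch helper and its five-minute default are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Session:
    session_id: str
    user_id: int
    device_name: str
    user_agent: str
    ip_country: str
    last_active_at: datetime
    created_at: datetime
    expires_at: datetime


def touch(session: Session, now: datetime,
          min_interval: timedelta = timedelta(minutes=5)) -> bool:
    """Advance last_active_at only if the stored value is at least min_interval old.

    Returns True when the caller should persist the change.
    """
    if now - session.last_active_at < min_interval:
        return False  # recent enough: skip the write
    session.last_active_at = now
    return True
```

The request handler calls touch on every request but issues an UPDATE only when it returns True, which caps writes at roughly one per session per interval.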

4. Session sync across regions

If you’re running multi-region for latency, session lookups need to be local. Options:

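  • Replicate the session store, for example Redis with cross-region replication. Reads are local. Writes either pay cross-region latency or, with asynchronous replication, take a moment to become visible in other regions, so a revoked session can stay valid there briefly.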
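  • Issue short-lived signed JWTs from sessions. The client presents the JWT, which any region can verify locally, and the JWT carries a reference to the session, fetched lazily when you need to check revocation. You get local verification on the hot path and keep server-side revocation. The cost is that a revoked session stays usable until its current JWT expires, unless you check the session on every request.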
  • Pin users to a home region. Simpler, but users traveling abroad hit cross-region latency.

The right answer depends on your product’s geographic distribution and latency tolerance.

The summary

None of this is exotic. The pattern is: treat your session store as a datastore in its own right, index it, clean it, and move to a purpose-built store when session query latency shows up on your P99 graphs. Most teams do this work reactively. Do it proactively and you skip the 3am incident.
