Read-Through Cache

Fetching the same data from a slow database on every request wastes latency and database capacity, but manually managing cache population across every call site is error-prone and hard to keep consistent.

Rickvian Aldi·Software engineer·5 min read

Problem

In read-heavy applications the database is rarely the right place for every read. A product catalog, a user profile, a configuration object - these are queried thousands of times per second but change infrequently. Sending every read to the database creates unnecessary load, drives up tail latency, and makes the database a bottleneck for workloads it was never designed to handle at that rate.

The naive fix - cache things manually - introduces its own problems. Every call site that fetches data must now remember to check the cache first, populate it on a miss, and invalidate it on write. When the logic is duplicated across dozens of services or hundreds of files, inconsistencies multiply. One call site forgets to invalidate. Another uses a different TTL. A third caches the wrong granularity. The cache becomes a source of subtle bugs rather than a source of speed.

Forces

Latency is uneven. A database query might take 5–50ms. A cache hit in Redis takes under 1ms. The difference is imperceptible in a batch job but decisive in a web request that chains several lookups.
Cache population is a cross-cutting concern. If every call site is responsible for its own cache logic, the complexity grows with the size of the codebase. Centralizing that logic means a change to the TTL or eviction policy is made in one place and applies to all consumers.
Cold starts and cache stampedes. When a cache entry expires under load, many concurrent requests all miss simultaneously and all hit the database before any of them can repopulate. Without mitigation, this produces a thundering herd that can overload a database in seconds.
Consistency tolerance varies. Some data (stock prices, live scores) must never be stale. Other data (product descriptions, user preferences) can tolerate seconds or minutes of lag. The cache layer must accommodate this spectrum.
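
As a rough illustration of the latency force, the expected cost of a read can be estimated from the cache hit ratio. This is a back-of-the-envelope sketch: `expectedReadLatencyMs` and `chainedLatencyMs` are hypothetical helpers, and the 1 ms and 20 ms figures are assumptions picked from the ranges quoted above, not measurements.

```typescript
// Expected latency of one read through a cache, in milliseconds.
// A miss pays for the failed cache lookup and then for the database query.
function expectedReadLatencyMs(
  hitRatio: number, // fraction of reads served from the cache, 0..1
  cacheMs: number, // assumed cache round-trip time
  dbMs: number // assumed database query time
): number {
  if (hitRatio < 0 || hitRatio > 1) {
    throw new RangeError("hitRatio must be between 0 and 1");
  }
  const hitCost = cacheMs;
  const missCost = cacheMs + dbMs;
  return hitRatio * hitCost + (1 - hitRatio) * missCost;
}

// A request that chains several lookups multiplies the per-read cost.
function chainedLatencyMs(lookups: number, perReadMs: number): number {
  return lookups * perReadMs;
}

const perRead = expectedReadLatencyMs(0.9, 1, 20); // assumed 90% hit ratio
console.log(perRead, chainedLatencyMs(5, perRead));
```

With these assumed numbers a read costs roughly 3 ms on average instead of 20 ms, and the gap widens with every lookup a request chains together.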

Solution

Wrap the data access layer so that the cache is the only entry point. The calling code asks the cache for the data; the cache either returns a hit or fetches from the database, stores the result, and returns it. The caller never knows whether the data came from cache or the database.

Implementation sketch:

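import Redis from "ioredis"; // assumed client; get/setex/del match the ioredis API

class ReadThroughCache<T> {
  constructor(
    private readonly redis: Redis,
    private readonly ttlSeconds: number
  ) {}

  // Returns the cached value for key, or calls fetch on a miss and caches the result.
  async get(key: string, fetch: () => Promise<T>): Promise<T> {
    const cached = await this.redis.get(key);
    if (cached !== null) {
      return JSON.parse(cached) as T; // cache hit
    }

    // Cache miss: load from the source of truth, then populate the cache.
    const value = await fetch();
    if (value !== undefined) {
      // JSON.stringify(undefined) is not a string, so only cache real values.
      await this.redis.setex(key, this.ttlSeconds, JSON.stringify(value));
    }
    return value;
  }

  // Call after a write so the next read repopulates from the database.
  async invalidate(key: string): Promise<void> {
    await this.redis.del(key);
  }
}

// Usage (User and db are assumed to be defined elsewhere in the application)
const redis = new Redis(); // connects to localhost:6379 by default
const userCache = new ReadThroughCache<User>(redis, 300); // 5-minute TTL

async function getUser(id: string): Promise<User> {
  return userCache.get(`user:${id}`, () => db.users.findById(id));
}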

The fetch function is only called on a cache miss. This makes caching transparent: refactoring getUser to add caching requires changing exactly one place.
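
To watch the hit and miss paths without a running Redis, the sketch below repeats the same logic against an in-memory stand-in. `KeyValueStore`, `InMemoryStore`, `readThrough`, and `loadUserFromDb` are hypothetical names introduced for this illustration only.

```typescript
// Minimal store interface covering the two calls the read path needs.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  setex(key: string, ttlSeconds: number, value: string): Promise<void>;
}

// In-memory stand-in for Redis, for illustration and tests only.
class InMemoryStore implements KeyValueStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= Date.now()) {
      this.entries.delete(key); // drop missing or expired entries
      return null;
    }
    return entry.value;
  }

  async setex(key: string, ttlSeconds: number, value: string): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Same read-through logic as above, written as a function over the interface.
async function readThrough<T>(
  store: KeyValueStore,
  key: string,
  ttlSeconds: number,
  fetch: () => Promise<T>
): Promise<T> {
  const cached = await store.get(key);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const value = await fetch();
  await store.setex(key, ttlSeconds, JSON.stringify(value));
  return value;
}

// Count how often the pretend database is consulted.
let dbCalls = 0;
async function loadUserFromDb(id: string): Promise<{ id: string; name: string }> {
  dbCalls += 1;
  return { id, name: "Ada" };
}

async function demo(): Promise<number> {
  const store = new InMemoryStore();
  await readThrough(store, "user:42", 300, () => loadUserFromDb("42")); // miss
  await readThrough(store, "user:42", 300, () => loadUserFromDb("42")); // hit
  return dbCalls;
}
```

Calling `demo()` resolves to 1: the second read is served from the store and never reaches `loadUserFromDb`.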

The best cache is one that your application doesn't know is there - a transparent layer that makes reads faster without leaking its existence into business logic.

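Thundering herd mitigation: Use a lock on miss or TTL jitter to blunt cache stampedes. With "lock on miss", a caller that misses first tries to set a short-lived lock key atomically (in Redis, SET with the NX and EX options). The caller that wins the lock fetches from the database and populates the cache; callers that find the lock already held wait briefly and retry the cache read. Jitter on TTLs addresses a different cause, many entries expiring at the same moment: instead of setex(key, ttl, value), use setex(key, ttl + Math.floor(Math.random() * jitter), value). The Math.floor matters because SETEX accepts only whole seconds. Probabilistic early expiration, where a request occasionally refreshes an entry shortly before its TTL runs out, is a third option.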
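
The two mitigations can be sketched together as follows. This is a simplified illustration, not a production lock: `StampedeStore`, `jitteredTtl`, and `getWithLock` are hypothetical names, the 5-second lock TTL and 30-second jitter are arbitrary assumptions, and `setIfAbsent` stands for whatever atomic "set if not exists" the store provides (in Redis, the `SET key value NX EX seconds` command).

```typescript
// Hypothetical store interface; setIfAbsent must be atomic in a real store.
interface StampedeStore {
  get(key: string): Promise<string | null>;
  setex(key: string, ttlSeconds: number, value: string): Promise<void>;
  setIfAbsent(key: string, ttlSeconds: number, value: string): Promise<boolean>;
  del(key: string): Promise<void>;
}

// TTL plus a whole number of jitter seconds, since Redis expects integer seconds.
function jitteredTtl(ttlSeconds: number, jitterSeconds: number): number {
  return ttlSeconds + Math.floor(Math.random() * jitterSeconds);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function getWithLock<T>(
  store: StampedeStore,
  key: string,
  ttlSeconds: number,
  fetch: () => Promise<T>,
  maxRetries = 20,
  retryDelayMs = 25
): Promise<T> {
  const lockKey = `lock:${key}`;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const cached = await store.get(key);
    if (cached !== null) {
      return JSON.parse(cached) as T;
    }

    // Only the caller that wins the lock goes to the database.
    const gotLock = await store.setIfAbsent(lockKey, 5, "1");
    if (gotLock) {
      try {
        const value = await fetch();
        await store.setex(key, jitteredTtl(ttlSeconds, 30), JSON.stringify(value));
        return value;
      } finally {
        await store.del(lockKey); // release even if fetch throws
      }
    }

    // Another caller holds the lock: wait briefly, then re-read the cache.
    await sleep(retryDelayMs);
  }
  // Waited too long: fetch directly rather than fail the request.
  return fetch();
}
```

One simplification worth noting: the lock is released unconditionally, so a caller whose fetch outlives the lock TTL could delete a lock now held by someone else. A more careful version would store a unique token in the lock and release it only if the token still matches.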

Cache keys: Keys should be deterministic and include every dimension that changes the response. user:${id} is fine. user:${id}:locale:${locale} is appropriate when locale affects the response. Avoid encoding arbitrary query parameters into the key: it produces unbounded key cardinality and fills the cache with one-off entries.
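
One way to keep keys deterministic is to build them in a single helper from an explicit set of dimensions. `buildCacheKey` below is a hypothetical helper written for this illustration; it assumes that ids and dimension values do not themselves contain the `:` separator.

```typescript
// Builds a deterministic cache key from an entity name, an id, and an
// explicit set of extra dimensions (for example locale).
function buildCacheKey(
  entity: string,
  id: string,
  dimensions: Record<string, string> = {}
): string {
  const parts: string[] = [entity, id];
  // Sort by dimension name so the same inputs always yield the same key,
  // regardless of the order in which the caller listed them.
  const sorted = Object.entries(dimensions).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0
  );
  for (const [name, value] of sorted) {
    parts.push(name, value);
  }
  return parts.join(":");
}

console.log(buildCacheKey("user", "42")); // user:42
console.log(buildCacheKey("user", "42", { locale: "en-US" })); // user:42:locale:en-US
```

Because callers must name each dimension, arbitrary query parameters cannot leak into the key by accident.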

When NOT to Use

Write-heavy data. If a record changes many times per second, caching it adds overhead (every write must invalidate) without reducing database load (entries are invalidated so often that the cache is rarely warm). Cache data that is read far more often than it is written.
Highly sensitive data. User financial records, medical data, or authentication tokens should not live in a shared cache where key isolation is imperfect. Prefer direct database access with query optimization.
When consistency is non-negotiable. If stale reads cause correctness bugs - inventory that shows available when it's actually zero, price discrepancies during checkout - cache with caution. Use write-through caching (update cache on every write) or no cache at all.
Small datasets that fit in the database's memory. If the entire table fits in the database's buffer pool, reads are already served from memory, so adding a Redis layer adds a network hop without reducing disk I/O. Profile before caching.
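
The write-through option mentioned in the list above can be sketched as follows. The two `Map` objects are hypothetical stand-ins for the database and the cache, and `saveProduct` and `getProduct` are illustrative names; a real implementation would also need to decide what happens when the second write fails.

```typescript
interface Product {
  id: string;
  stock: number;
}

// Hypothetical stand-ins for the database table and the cache.
const productTable = new Map<string, Product>();
const productCache = new Map<string, string>();

// Write-through: persist first, then overwrite the cached copy in the same
// code path, so later reads do not wait for a TTL to expire.
async function saveProduct(product: Product): Promise<void> {
  productTable.set(product.id, product); // 1. write to the database
  productCache.set(`product:${product.id}`, JSON.stringify(product)); // 2. refresh the cache
}

// Reads still go through the cache and fall back to the table on a miss.
async function getProduct(id: string): Promise<Product | undefined> {
  const cached = productCache.get(`product:${id}`);
  if (cached !== undefined) {
    return JSON.parse(cached) as Product;
  }
  const row = productTable.get(id);
  if (row !== undefined) {
    productCache.set(`product:${id}`, JSON.stringify(row));
  }
  return row;
}
```

The two writes here are not atomic, so this narrows the window for stale reads rather than eliminating it. Where even that window is unacceptable, skipping the cache for that data remains the safer choice.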

The Feature Flag Kill Switch pattern is a natural companion: feature flags are one of the most-read, rarely-written data types in any application - a canonical use case for read-through caching. Caching flag evaluations with a short TTL (1–5 seconds) dramatically reduces the load on the flag store and makes flag evaluation nearly free.
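
A short-TTL flag cache along these lines might look like the sketch below. `FlagCacheSketch` and its `loadFlag` callback are hypothetical; the callback stands in for whatever lookup the flag store exposes, and the cache lives in process memory rather than in Redis to keep the example self-contained.

```typescript
// In-process cache for boolean flag values with a short TTL.
class FlagCacheSketch {
  private readonly entries = new Map<string, { value: boolean; expiresAt: number }>();
  private readonly loadFlag: (name: string) => Promise<boolean>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    loadFlag: (name: string) => Promise<boolean>, // assumed flag-store lookup
    ttlMs: number,
    now: () => number = () => Date.now() // injectable clock, handy in tests
  ) {
    this.loadFlag = loadFlag;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async isEnabled(name: string): Promise<boolean> {
    const entry = this.entries.get(name);
    if (entry !== undefined && entry.expiresAt > this.now()) {
      return entry.value; // served from memory, no call to the flag store
    }
    const value = await this.loadFlag(name);
    this.entries.set(name, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

The TTL bounds how long a flipped kill switch can go unnoticed: with a 2-second TTL, every process picks up the new value at most about 2 seconds after the change.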
