Feb 25

LRU Cache Implementation

Mindli Team

AI-Generated Content

In systems where speed is paramount, caching is the silent workhorse that prevents performance bottlenecks. An LRU (Least Recently Used) cache is a specific eviction policy that ensures the most relevant data stays accessible by discarding items that haven't been touched the longest. Mastering its implementation is a rite of passage for engineers, as it elegantly solves a universal problem: managing limited, fast memory in the face of unlimited, slower storage.

Understanding the LRU Eviction Policy

At its core, an LRU cache is a fixed-size container for key-value pairs. When you insert a new item and the cache is at capacity, it must evict, or remove, an existing item to make space. The LRU eviction policy dictates that the item chosen for removal is the one that was accessed least recently. "Accessed" means either a get (retrieval) or a put (insert/update) operation. This policy is based on the temporal locality principle—the idea that data used recently is likely to be used again soon. Imagine a kitchen counter with space for only five tools; you naturally relegate the spatula you haven't used in weeks to the drawer when you need to make room for a new whisk. The cache operates on the same logic, automating this decision to optimize for frequent access.

The Performance Imperative: O(1) Operations

The utility of a cache vanishes if using it is slow. Therefore, a proper LRU cache implementation must guarantee constant time complexity for its core operations. This means both the get(key) and put(key, value) methods must run in O(1) time. In Big O notation, O(1) means the time to complete the operation does not depend on the number of items, or n, in the cache. Achieving O(1) for get is straightforward with a hash map (or dictionary), which provides direct key-based lookup. However, tracking "least recent use" and reorganizing items upon access is the real challenge. A naive array or list would require O(n) time to scan for items or to shift elements during updates, which is unacceptable for high-performance systems. The design challenge is thus to maintain O(1) lookup speed while also maintaining O(1) recency tracking.

The Core Data Structure Synergy

The canonical solution combines two fundamental data structures: a hash map and a doubly linked list. This combination is a classic example of a composite data structure, where each component compensates for the other's weakness.

  • Hash Map (O(1) Lookup): The hash map stores keys that map to nodes within the doubly linked list. This gives you an immediate pointer to the exact list node containing the value for a given key.
  • Doubly Linked List (O(1) Recency Order): The list maintains all cache items in order of their use. The most recently used (MRU) item is at the head (front) of the list, and the least recently used (LRU) item is at the tail (back). Each node contains the key-value pair and has pointers to both the next and previous nodes.

The synergy is powerful: the hash map finds a node in constant time, and the doubly linked list allows us to move that node to the front (upon access) or remove the last node (upon eviction) also in constant time, thanks to the direct references provided by the pointers.
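As a minimal sketch of this pairing (in Python, with hypothetical names), note that the map stores node references rather than raw values, so a lookup hands back a position in the recency list:

```python
class Node:
    """A doubly linked list node holding one cache entry."""
    def __init__(self, key, value):
        self.key = key       # stored so eviction can also delete the map entry
        self.value = value
        self.prev = None
        self.next = None

# The hash map points keys at list nodes, not at raw values.
cache_map = {}
node = Node("a", 1)
cache_map["a"] = node

# Finding the node is O(1); its prev/next pointers then allow
# O(1) unlinking from wherever it sits in the recency list.
found = cache_map["a"]
```

Storing the key inside the node matters: when the list later evicts a node, the key is needed to delete the matching hash map entry.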

Step-by-Step Implementation Mechanics

Let's walk through how the get and put operations work with our two-structure design. We'll assume a cache with a fixed capacity c.

Initialization: Create an empty hash map and an empty doubly linked list. The list will have dummy head and tail nodes to simplify edge-case handling during insertions and deletions.

Operation 1: get(key)

  1. Check the hash map for the key.
  2. If the key is not found, return a sentinel value (e.g., -1).
  3. If found, the hash map gives us the linked list node.
  4. This is the crucial recency update: Remove the node from its current position in the list and re-insert it immediately after the dummy head node, making it the new MRU item.
  5. Return the value from the node.

Every get that hits the cache actively updates the recency order. The removal and re-insertion in a doubly linked list are O(1) operations because you have direct references to the node, its previous node, and its next node.
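The crucial unlink-and-reattach step can be sketched in isolation (Python, with hypothetical helper names) to show why it is O(1) — no loop ever touches the rest of the list:

```python
class Node:
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

# Dummy head and tail remove edge cases: every real node
# always has a non-None prev and next.
head, tail = Node(), Node()
head.next, tail.prev = tail, head

def remove(node):
    """Detach a node in O(1) via its own prev/next pointers."""
    node.prev.next = node.next
    node.next.prev = node.prev

def add_after_head(node):
    """Insert a node at the MRU position, right after the dummy head."""
    node.prev = head
    node.next = head.next
    head.next.prev = node
    head.next = node

# Build a list of three entries: order is c, b, a (MRU -> LRU).
for k in ("a", "b", "c"):
    add_after_head(Node(k, k.upper()))

# Simulate a cache hit on "b": detach it and reattach at the front.
hit = head.next.next          # the "b" node
remove(hit)
add_after_head(hit)
# order is now: b, c, a (MRU -> LRU)
```

Both helpers perform a fixed number of pointer assignments regardless of list length, which is exactly the property the hash map alone cannot provide.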

Operation 2: put(key, value)

  1. First, check if the key exists using get(key). If it does, the get operation will have already moved it to the front. You then simply update the node's value.
  2. If the key is new:
  • Create a new node with the key-value pair.
  • Insert this new node right after the dummy head (it's now the MRU).
  • Add the key and a reference to this new node into the hash map.
  • Eviction Check: If the cache has exceeded its capacity c, you must evict the LRU item.
  • The LRU item is the node right before the dummy tail node.
  • Remove this node from the linked list.
  • Delete its corresponding key from the hash map.

This process ensures that both insertion and eviction are O(1) operations. The eviction is a simple removal from the tail of the list and a hash map deletion, neither of which requires searching.
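Putting both operations together, the full mechanics above can be sketched as a single class (Python; the class and method names are illustrative, not canonical):

```python
class LRUCache:
    """Fixed-capacity cache combining a dict with a doubly linked list."""

    class _Node:
        __slots__ = ("key", "value", "prev", "next")
        def __init__(self, key=None, value=None):
            self.key, self.value = key, value
            self.prev = self.next = None

    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}                       # key -> node
        self.head = self._Node()            # dummy head (MRU side)
        self.tail = self._Node()            # dummy tail (LRU side)
        self.head.next, self.tail.prev = self.tail, self.head

    def _remove(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _add_front(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return -1                       # sentinel for a miss
        node = self.map[key]
        self._remove(node)                  # promote to MRU
        self._add_front(node)
        return node.value

    def put(self, key, value):
        if key in self.map:                 # update value and refresh recency
            node = self.map[key]
            node.value = value
            self._remove(node)
            self._add_front(node)
            return
        node = self._Node(key, value)       # new entry becomes the MRU
        self.map[key] = node
        self._add_front(node)
        if len(self.map) > self.capacity:   # evict the true LRU item
            lru = self.tail.prev
            self._remove(lru)
            del self.map[lru.key]           # keep both structures in sync
```

A brief usage trace: with capacity 2, inserting keys 1 and 2, reading key 1, then inserting key 3 evicts key 2, because reading key 1 promoted it to MRU.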

LRU as a Systems Design Pattern

Viewing LRU cache implementation merely as a coding problem misses its broader significance. It is a fundamental systems design pattern encountered from hardware to distributed systems. CPUs use similar policies in their L1 and L2 caches. Database management systems use buffer pools with LRU variants to keep hot data in memory. Content delivery networks (CDNs) and web browsers use it to cache assets. Understanding the hash-map-and-linked-list blueprint gives you a template for designing any system component that must manage state under space constraints while prioritizing speed. It teaches you to think in terms of trading space (the overhead of two data structures) for time (constant-time operations), a quintessential engineering trade-off.
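The pattern is common enough that Python's standard library ships it as the functools.lru_cache decorator; a quick illustration of the same policy applied to memoizing a function:

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def slow_square(n):
    # Stand-in for an expensive computation.
    return n * n

slow_square(2)   # miss: computed and cached
slow_square(2)   # hit: served from the cache
slow_square(3)   # miss
slow_square(4)   # miss: cache is full, so the LRU entry (n=2) is evicted
info = slow_square.cache_info()
```

cache_info() reports hits, misses, and the current cache size, which makes the eviction behavior directly observable.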

Common Pitfalls

  1. Using a List for Recency Without Direct Access: Implementing the recency order with a Python list or Java ArrayList leads to O(n) operations for put and get, as inserting at the front or removing an element requires shifting all subsequent items. Correction: Always use a doubly linked list where nodes have explicit prev and next pointers, enabling constant-time detachment and reattachment.
  2. Forgetting to Update the Hash Map on Eviction: When you remove the LRU node from the linked list, you must also remove its key from the hash map. Failing to do this leaves dangling references and effectively corrupts the cache state. Correction: Treat the hash map and linked list as a single logical unit. Any structural change to the list (like an eviction) must be synchronized with the hash map.
  3. Not Handling Node Updates in get: A successful get must promote the item to most recently used. If you only return the value without moving the node in the list, your recency chain becomes invalid, and future evictions will remove the wrong item. Correction: Always include the "unlink and move to head" step in your get method when a key is found.
  4. Ignoring Concurrency: The standard implementation is not thread-safe. Concurrent get and put operations from multiple threads can corrupt the linked list pointers. Correction: For production systems, you must employ synchronization mechanisms like locks or explore concurrent data structure designs, acknowledging that this may introduce some performance overhead.
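On the concurrency point, one common mitigation — sketched here with Python's collections.OrderedDict and a single coarse lock, which is an assumption about the design rather than the only option — is to guard every operation:

```python
import threading
from collections import OrderedDict

class ThreadSafeLRU:
    """LRU cache guarded by one lock; simple, but serializes all access."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()   # dict order doubles as recency order
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key not in self._data:
                return -1
            self._data.move_to_end(key)        # promote to MRU
            return self._data[key]

    def put(self, key, value):
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)    # refresh recency on update
            self._data[key] = value
            if len(self._data) > self.capacity:
                self._data.popitem(last=False)  # drop the LRU entry
```

A coarse lock trades throughput for simplicity; higher-concurrency designs (sharded locks, lock-free structures) are possible but considerably more involved.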

Summary

  • An LRU cache is defined by its eviction policy: when full, it discards the least recently accessed item to make space for new ones.
  • Efficient implementation requires O(1) time for both get and put operations, achieved by combining a hash map for instant key lookup with a doubly linked list for constant-time recency ordering.
  • The get operation must actively update the list by moving the accessed node to the front, while put may trigger an eviction by removing the node at the tail.
  • This design is more than an algorithm; it's a systems design pattern critical for building performant applications, from databases to web services.
  • Key implementation pitfalls include using slow data structures for the order, failing to synchronize the hash map and list during eviction, and neglecting to update recency on data retrieval.
