Feb 28

Radix Sort

Mindli Team

AI-Generated Content

Radix sort is a non-comparative integer sorting algorithm that can run in linear time when key length is bounded, something comparison-based algorithms such as Quicksort or Merge Sort cannot do in general. By processing numbers digit by digit, it sidesteps the $\Omega(n \log n)$ lower bound that governs comparison-based sorts, which can make it very fast for sorting large collections of integers or fixed-length strings where the keys have a bounded number of digits. It is used for sorting operations in databases and in applications involving lexicographic ordering.

How Radix Sort Works: The Core Idea

Unlike algorithms that compare entire keys against each other, radix sort decomposes keys into their individual digits or characters and processes them sequentially. It relies on a stable sort—a sorting method that preserves the relative order of items with equal keys—as a subroutine to sort the entire list repeatedly, one digit position at a time. The most common implementation sorts from the least significant digit (LSD) to the most significant digit (MSD), though MSD-first variations exist.

Think of it like sorting a deck of cards so that it ends up grouped by suit, with ranks in order inside each suit. A comparison sort looks at two whole cards and decides which is "bigger." LSD radix sort instead first sorts all cards by rank alone, ignoring suit. It then sorts by suit (clubs, diamonds, hearts, spades) using a stable method, so cards within each suit keep the rank order from the first pass. The final result is a fully ordered deck. In computing, "rank" and "suit" are analogous to the less significant and more significant digit positions in a number.
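The card idea can be sketched in a few lines of Python, processing the least significant key (rank) first and the most significant key (suit) last. The `(rank, suit)` tuple layout, the `SUIT_ORDER` table, and the `sort_deck` name are assumptions made for illustration. The sketch relies on Python's built-in `sorted()` being stable.

```python
# Hypothetical card representation: (rank, suit) tuples.
SUIT_ORDER = {"clubs": 0, "diamonds": 1, "hearts": 2, "spades": 3}

def sort_deck(cards):
    """Order cards by suit, then by rank within each suit, using two stable passes."""
    # Pass 1: sort by the least significant key (rank), ignoring suit.
    by_rank = sorted(cards, key=lambda card: card[0])
    # Pass 2: sort by the most significant key (suit). sorted() is stable,
    # so cards sharing a suit keep the rank order established in pass 1.
    return sorted(by_rank, key=lambda card: SUIT_ORDER[card[1]])

hand = [(9, "hearts"), (2, "spades"), (5, "hearts"), (7, "clubs"), (2, "hearts")]
print(sort_deck(hand))
# prints [(7, 'clubs'), (2, 'hearts'), (5, 'hearts'), (9, 'hearts'), (2, 'spades')]
```

If the second pass were not stable, the hearts could come out in any rank order, which is why the choice of subroutine matters.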

The Subroutine: Counting Sort

Radix sort is almost always implemented using counting sort as its stable sorting subroutine. Counting sort is not a comparison sort; it works by counting the number of objects that have each distinct key value. For a given digit position (e.g., the ones place), the algorithm:

  1. Creates a count array to tally how many numbers have each possible digit (0-9).
  2. Transforms this count array into a cumulative count, which determines the correct output positions.
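  3. Builds the output array by scanning the input from last to first: for each element, it decrements the cumulative count for that element's digit and places the element at the resulting index. The backward scan is what keeps elements with equal digits in their original order, ensuring stability.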

This process runs in $O(n + k)$ time, where $n$ is the number of items and $k$ is the base or radix (e.g., 10 for decimal). Because counting sort's time is linear in $n$ for a fixed radix, using it repeatedly for each digit keeps radix sort efficient.
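The three steps above can be sketched as follows. This is a minimal illustration, assuming non-negative integer keys; the function name `counting_sort_by_digit` and the `exp` parameter (the place value of the digit being sorted: 1, 10, 100, ...) are choices made for this example.

```python
def counting_sort_by_digit(items, exp, base=10):
    """Stable sort of non-negative ints by the digit at place value `exp`."""
    counts = [0] * base
    # Step 1: tally how many items have each digit value.
    for value in items:
        counts[(value // exp) % base] += 1
    # Step 2: prefix sums, so counts[d] is the number of items with digit <= d.
    for d in range(1, base):
        counts[d] += counts[d - 1]
    # Step 3: walk the input backwards so equal digits keep their original order.
    output = [0] * len(items)
    for value in reversed(items):
        digit = (value // exp) % base
        counts[digit] -= 1
        output[counts[digit]] = value
    return output

print(counting_sort_by_digit([170, 45, 75, 90, 802, 24, 2, 66], exp=1))
# prints [170, 90, 802, 2, 24, 45, 75, 66]
```

The two loops over `items` cost $O(n)$ and the prefix-sum loop costs $O(k)$, which is where the $O(n + k)$ bound comes from.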

Step-by-Step Walkthrough of LSD Radix Sort

Let's sort the list [170, 45, 75, 90, 802, 24, 2, 66] using LSD radix sort with a decimal (base-10) system.

Pass 1: Sort by the least significant digit (ones place).
Input: [170, 45, 75, 90, 802, 24, 2, 66]
Sorted by ones digit: [170, 90, 802, 2, 24, 45, 75, 66]
(Note: 170 and 90 share the ones digit 0, and 802 and 2 share the ones digit 2; each pair stays in its original relative order due to stability.)

Pass 2: Sort by the tens digit.
Input from Pass 1: [170, 90, 802, 2, 24, 45, 75, 66]
(For numbers like 2, treat the tens digit as 0.)
Sorted by tens digit: [802, 2, 24, 45, 66, 170, 75, 90]

Pass 3: Sort by the hundreds digit.
Input from Pass 2: [802, 2, 24, 45, 66, 170, 75, 90]
Sorted by hundreds digit: [2, 24, 45, 66, 75, 90, 170, 802]

The list is now fully sorted. Each pass used a stable counting sort on a single digit column.
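The walkthrough can be reproduced with a short Python sketch that records the list after every pass. It assumes non-negative integers, and for brevity it uses one list per digit value ("buckets") instead of a counting array; appending to a bucket preserves arrival order, so each pass is still stable. The name `lsd_radix_sort` and its return shape are choices made for this example.

```python
def lsd_radix_sort(items, base=10):
    """Sort non-negative ints; returns (sorted_list, list_after_each_pass)."""
    result = list(items)
    passes = []
    if not result:
        return result, passes
    largest = max(result)
    exp = 1
    # One pass per digit of the largest key, least significant digit first.
    while largest // exp > 0:
        buckets = [[] for _ in range(base)]
        for value in result:
            buckets[(value // exp) % base].append(value)
        # Concatenate buckets 0..base-1; order inside a bucket is untouched.
        result = [value for bucket in buckets for value in bucket]
        passes.append(result)
        exp *= base
    return result, passes

ordered, trace = lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
for number, snapshot in enumerate(trace, start=1):
    print(f"Pass {number}: {snapshot}")
# Pass 1: [170, 90, 802, 2, 24, 45, 75, 66]
# Pass 2: [802, 2, 24, 45, 66, 170, 75, 90]
# Pass 3: [2, 24, 45, 66, 75, 90, 170, 802]
```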

Time and Space Complexity Analysis

The time complexity of radix sort is $O(d \cdot n)$, where:

  • $n$ is the number of elements to sort.
  • $d$ is the maximum number of digits (or key length) in the elements.

This complexity comes from performing $d$ passes of counting sort, each taking $O(n + k)$ time, where $k$ is the radix (base). Since $k$ is a constant (like 10 or 256), it simplifies to $O(n)$ per pass, for $d$ passes: $O(d \cdot n)$.

Whether this outperforms an $O(n \log n)$ comparison sort depends on $d$. If $d$ is constant and small (e.g., sorting 32-bit integers, where $d \le 10$ in base 10), radix sort runs in linear time, which is asymptotically faster. However, if keys are arbitrarily long, $d$ can dominate: once $d$ grows past roughly $\log n$, the $d \cdot n$ term is no smaller than $n \log n$. The space complexity is $O(n + k)$, primarily due to the output and count arrays used by the counting sort subroutine.
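To make the role of $d$ concrete, the sketch below counts how many digits (and therefore how many passes) the largest 32-bit unsigned value needs in several bases. The helper name `passes_needed` is an assumption made for this example.

```python
def passes_needed(max_key, base):
    """Number of base-`base` digits in max_key, i.e. the pass count d."""
    d = 1
    while max_key >= base:
        max_key //= base
        d += 1
    return d

MAX_UINT32 = 2**32 - 1
for base in (2, 10, 256, 65536):
    print(base, passes_needed(MAX_UINT32, base))
# prints:
# 2 32
# 10 10
# 256 4
# 65536 2
```

For comparison, $\log_2 n$ is about 20 for one million elements, so a small fixed $d$ compares favorably on paper. Constant factors and memory access patterns still decide the outcome in practice.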

Applications and Variations

Radix sort excels in specific, real-world scenarios:

  • Database Sorting: Sorting large tables by integer keys or fixed-length strings.
  • String Sorting: As a component of suffix array construction algorithms.
  • Card Sorter Emulation: Its mechanism directly mimics old mechanical card sorting machines.

Variations include:

  • MSD Radix Sort: Sorts from the most significant digit downward. It can be faster as it can create partitions and skip sorting irrelevant tails, but it is more complex to implement and may not be stable without extra care.
  • Different Bases: Using a larger radix (e.g., 256 instead of 10) reduces the number of passes but increases the space and time per pass for the counting array. There is an optimal base that minimizes total runtime.
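As an illustration of the larger-radix variation, here is a minimal sketch of LSD radix sort with radix 256, which handles 32-bit unsigned integers in four passes by extracting one byte per pass with shifts and masks. It assumes every key lies in the range 0 to 2**32 - 1; the name `radix_sort_u32` is hypothetical. Unlike the decrement-and-scan-backwards formulation, it uses exclusive prefix sums (start offsets) with a forward scan, which is also stable.

```python
def radix_sort_u32(items):
    """LSD radix sort for ints in [0, 2**32), one byte (radix 256) per pass."""
    result = list(items)
    for shift in (0, 8, 16, 24):
        counts = [0] * 256
        for value in result:
            counts[(value >> shift) & 0xFF] += 1
        # Exclusive prefix sums: counts[b] becomes the first output index
        # reserved for byte value b.
        total = 0
        for b in range(256):
            counts[b], total = total, total + counts[b]
        output = [0] * len(result)
        # Forward scan with increasing start offsets keeps equal bytes in order.
        for value in result:
            b = (value >> shift) & 0xFF
            output[counts[b]] = value
            counts[b] += 1
        result = output
    return result

print(radix_sort_u32([4000000000, 7, 65536, 255, 256, 0]))
# prints [0, 7, 255, 256, 65536, 4000000000]
```

The count array grows from 10 entries to 256, which is the space and per-pass cost the trade-off above refers to.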

Common Pitfalls

  1. Confusing LSD and MSD Order: A frequent conceptual error is sorting from the most significant digit first without using a recursive or partitioning approach. LSD is generally simpler to implement correctly because it uniformly processes all digits.
  2. Using an Unstable Subroutine: The entire algorithm fails if the subroutine (like counting sort) is not stable. If items with the same digit get reordered arbitrarily, the work from previous passes is destroyed.
  3. Misapplying to Non-Integer Data: Radix sort, in its basic form, is designed for non-negative integer keys. Applying it directly to negative integers, floating-point numbers, or variable-length strings without careful normalization (such as offsetting signed values or padding strings) can lead to incorrect results.
  4. Ignoring Space Complexity: While fast, radix sort requires additional space. In extremely memory-constrained environments (embedded systems), an in-place comparison sort might be preferable despite its slower time complexity.
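The stability pitfall is easy to reproduce. The sketch below is a counting-sort-based LSD radix sort with a hypothetical `stable` flag added purely for demonstration. With `stable=False` the placement loop scans the input forwards while still filling each digit's slots from the end, which reverses the order of elements that share a digit and undoes the work of earlier passes.

```python
def radix_sort(items, stable=True, base=10):
    """LSD radix sort of non-negative ints; stable=False demonstrates the bug."""
    result = list(items)
    largest = max(result) if result else 0
    exp = 1
    while largest // exp > 0:
        counts = [0] * base
        for value in result:
            counts[(value // exp) % base] += 1
        for d in range(1, base):
            counts[d] += counts[d - 1]
        output = [0] * len(result)
        # Slots are filled from the end of each digit's range, so only a
        # backward scan preserves the order of elements with equal digits.
        scan = reversed(result) if stable else result
        for value in scan:
            digit = (value // exp) % base
            counts[digit] -= 1
            output[counts[digit]] = value
        result = output
        exp *= base
    return result

data = [170, 45, 75, 90, 802, 24, 2, 66]
print(radix_sort(data) == sorted(data))                 # prints True
print(radix_sort(data, stable=False) == sorted(data))   # prints False
```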

Summary

  • Radix sort is a non-comparative integer sorting algorithm that processes numbers digit by digit, typically from least significant to most significant (LSD).
  • It uses a stable sort, most often counting sort, as a subroutine for each digit pass, achieving a time complexity of $O(d \cdot n)$, where $n$ is the number of items and $d$ is the digit length.
  • For integers with bounded key lengths (like 32-bit ints), it achieves linear-time performance and can outperform comparison-based sorts like Quicksort on large datasets.
  • Its primary applications include database indexing, string sorting, and other domains where keys are fixed-length integers or strings.
  • Successful implementation requires attention to detail: ensuring the subroutine's stability, handling digit extraction correctly, and understanding the space trade-offs.
