Cyclic Sort Pattern
AI-Generated Content
Cyclic Sort Pattern
When dealing with arrays containing numbers in a specific, often consecutive range, brute force or general-purpose sorting algorithms like QuickSort are inefficient overkill. The Cyclic Sort pattern is a specialized, in-place algorithm that exploits the known range to place each element at its correct index in a single pass. By strategically swapping elements to their target positions, it achieves linear time complexity and constant space, making it the optimal tool for a class of common coding problems involving missing numbers, duplicates, and corrupt data.
Understanding the Core Intuition
The fundamental premise of cyclic sort is that if you have an array of elements containing numbers from a known range (e.g., to ), each number has a "correct" index. For a range starting at 1, the number x belongs at index x - 1. For a range starting at 0, the number x belongs at index x. This predictable mapping is the engine of the algorithm. Instead of comparing elements to each other, we compare each number to its destined position. If the number is not in its correct spot, we swap it with the element currently occupying that spot. Crucially, we only move to the next index after the current position holds its correct number, which may require multiple swaps at a single index.
Consider a simple example with the array [3, 1, 2] for numbers 1 to 3. At index 0, we find 3. Its correct index is 3 - 1 = 2. We swap arr[0] with arr[2], resulting in [2, 1, 3]. Now, at index 0, we have 2. Its correct index is 1. We swap again: [1, 2, 3]. Now index 0 holds 1, which is correct. We then increment to index 1 and index 2, which are already correct, completing the sort.
The Algorithm Mechanics
The standard implementation uses a single while loop that traverses the array. The process is best understood as a cycle: pick an element and keep swapping it to its correct position until the cycle is completed (i.e., the correct element lands in the current spot), then move the pointer forward.
Here is the step-by-step procedure for a range from 1 to n:
- Initialize a pointer
i = 0. - While
i < n:
a. Calculate the correct index for the value at arr[i]. For range 1 to n, correct_index = arr[i] - 1.
b. Check: Is arr[i] at its correct index? That is, is arr[i] equal to arr[correct_index]?
- If NO: Swap
arr[i]witharr[correct_index]. Do not incrementi. The new element now atineeds to be processed. - If YES: The element at index
iis correctly placed. Incrementiby 1.
- The loop terminates when
ireaches the end of the array.
The time complexity is . Even though we have nested loops in structure (a while and an internal swap process), each element is swapped at most once to reach its final position, leading to a linear total number of operations. The space complexity is as we sort in place using only a few variables.
Application 1: Finding the Missing Number
A classic problem is: "Given an array of n distinct numbers taken from the range 0 to n, find the one missing number." After applying cyclic sort, every number x will sit at index x. The missing number will be revealed because its correct index will be occupied by a different number. You can then perform a second linear scan: the first index i where arr[i] != i is the missing number. If all indices match their values, then n itself is missing.
Example: For nums = [4, 0, 3, 1] with (range 0-4). After cyclic sort, the array becomes [0, 1, 3, 4]. The scan finds that at index 2, nums[2] = 3 != 2. Therefore, the missing number is 2.
Application 2: Finding All Duplicates & Missing Numbers
Problems become more interesting when arrays contain duplicates or multiple missing numbers. The pattern remains the same: use cyclic sort to place numbers in their correct positions. Once sorted, any number not at its correct index is a clue. A second pass then interprets these clues.
For "Find All Duplicates in an Array" (numbers 1 to n, some appear twice), the process is:
- Apply cyclic sort.
- Iterate through the sorted array. Any element where
arr[i] != i + 1is out of place. However, because the duplicate has nowhere to go, the value at indexiwill be the duplicate number itself. You can collect these values.
For "Find All Numbers Disappeared in an Array" (numbers 1 to n, some missing, some duplicated):
- Apply cyclic sort.
- Iterate through the sorted array. For each index
i, ifarr[i] != i + 1, then the numberi + 1is missing. The value currently atarr[i]is a duplicate that has been placed there.
Application 3: Finding the Corrupt Pair
The "Corrupt Pair" problem asks: given an array of n numbers from 1 to n, one number is duplicated, and one number is missing. Find both. This is a direct combination of the previous applications.
- Sort: Use cyclic sort to place numbers.
- Identify: Traverse the sorted array. The index
iwherearr[i] != i + 1holds the key.
- The duplicate number is the value currently sitting at index
i(arr[i]). - The missing number is the index itself plus one (
i + 1).
Example: nums = [3, 1, 2, 5, 2] for range 1-5. After cyclic sort, it becomes [1, 2, 3, 2, 5]. At index 3 (0-based), nums[3] = 2, but the correct number should be 4.
- Duplicate:
nums[3] = 2 - Missing:
i + 1 = 4
Common Pitfalls
- Incorrect Index Calculation: The most frequent error is miscalculating the correct index, especially with ranges that start at 0 versus 1. Always confirm the problem's stated range. For an array of size
ncontaining numbers from1ton, the correct index isarr[i] - 1. For a range from0ton-1, the correct index is simplyarr[i].
- Infinite Loops with Duplicates: When duplicates exist, a naive swap can cause an infinite loop. If
arr[i]should go to indexcorrect, butarr[correct]already holds the same value (arr[i] == arr[correct]), swapping is pointless—you're swapping identical values. The algorithm must include a check to skip the swap and incrementiin this scenario to avoid getting stuck.
- Forgetting the Final Scan: A common oversight is thinking the answer is revealed during the sort. The cyclic sort phase is just to organize the data. The solution is almost always found in a subsequent, separate linear scan of the now-partially-sorted array. Don't try to overcomplicate the sorting loop to also find the answer.
- Assuming Sorted Input: Never assume the input is nearly sorted. The power of cyclic sort is that it efficiently handles the worst-case scenarios. Your implementation must follow the swap logic rigorously, not just check if an element is one off from its position.
Summary
- The Cyclic Sort pattern is an in-place, time algorithm used when dealing with arrays containing numbers in a defined, consecutive range (e.g.,
1ton). - It works by repeatedly placing the current element at its correct index (calculated as
value - 1for a 1-based range) via swaps, only advancing the pointer once the current index holds its rightful number. - Its primary utility is not just sorting, but as a preprocessing step to efficiently solve problems involving missing numbers, duplicates, and corrupt pairs with a simple follow-up scan.
- In interview settings, recognizing the constraint "array contains numbers from
1ton" is a strong signal to consider cyclic sort. Always clarify the range and watch for edge cases involving duplicate values to avoid infinite loops. - Mastery of this pattern turns seemingly complex array problems into a straightforward two-phase process: (1) cycle sort the array, and (2) interpret the sorted array to find the answer.