Feb 25

DP: Longest Increasing Subsequence

Mindli Team

AI-Generated Content

Finding the longest increasing subsequence (LIS) within a sequence is a cornerstone problem in computer science, bridging fundamental algorithm design with real-world applications. Whether you're analyzing time-series data, working on computational biology alignments, or preparing for a technical interview, mastering LIS gives you insight into dynamic programming (DP) and problem optimization. This guide takes you from the intuitive but slow O(n^2) solution to the efficient O(n log n) algorithm, so that you understand not just how to find the length, but also how to reconstruct the actual subsequence and adapt to common variants.

From Brute Force to Dynamic Programming

An increasing subsequence is a subsequence in which each element is larger than the previous one and the elements keep their order from the original sequence. The naive approach to finding the longest one is to check every possible subsequence. A sequence of n elements has 2^n subsequences, so the runtime is exponential, which is intractable for all but the smallest inputs. The first major leap in efficiency comes from applying dynamic programming.
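For reference, the brute-force idea can be sketched in a few lines. This is only an illustration, and the function name lis_length_brute_force is hypothetical. It is useful mainly as a correctness oracle for small inputs when testing the faster algorithms.

```python
from itertools import combinations


def lis_length_brute_force(nums):
    """Return the LIS length by trying subsequences from longest to shortest.

    Exponential time: only suitable for very small inputs.
    """
    n = len(nums)
    for size in range(n, 0, -1):
        # combinations() keeps elements in their original order,
        # so every candidate is a genuine subsequence of nums.
        for candidate in combinations(nums, size):
            if all(a < b for a, b in zip(candidate, candidate[1:])):
                return size
    return 0  # empty input


print(lis_length_brute_force([10, 9, 2, 5, 3, 7, 101, 18]))  # 4
```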

The core DP insight is to define a state: let dp[i] represent the length of the Longest Increasing Subsequence that ends with the element at index i. The recurrence relation builds on previous solutions: to compute dp[i], you look at all indices j < i. If nums[j] < nums[i], then you can append nums[i] to the subsequence ending at j. Therefore, dp[i] = max(dp[j] for all j < i where nums[j] < nums[i]) + 1. The base case is that each individual element is an LIS of length 1, so all dp[i] start at 1.

Example: For the sequence nums = [10, 9, 2, 5, 3, 7, 101, 18]:

  • dp[0] (for 10): Only itself, so length 1.
  • dp[2] (for 2): Only itself, length 1.
  • dp[3] (for 5): We check j=0,1,2. Only nums[2]=2 < 5. So dp[3] = dp[2] + 1 = 2.
  • dp[5] (for 7): We check previous indices. 2, 5, and 3 are all less than 7. The maximum dp[j] among these is dp[3]=2 (for subsequence [2,5]). So dp[5] = 2 + 1 = 3 (subsequence [2,5,7]).

The final answer is the maximum value in the dp array. This algorithm runs in O(n^2) time, because each of the n indices scans all earlier indices, and uses O(n) space for the dp array. It forms the essential foundation for understanding the problem's structure.
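The recurrence can be written as a short Python sketch. The function name lis_dp_table is an assumption made for illustration. It returns the whole dp array so the values from the example can be checked directly.

```python
def lis_dp_table(nums):
    """Return dp, where dp[i] is the LIS length ending at index i. O(n^2) time."""
    dp = [1] * len(nums)  # base case: each element alone has length 1
    for i in range(len(nums)):
        for j in range(i):
            # nums[i] can extend any increasing subsequence ending at a smaller nums[j]
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return dp


dp = lis_dp_table([10, 9, 2, 5, 3, 7, 101, 18])
print(dp)       # [1, 1, 1, 2, 2, 3, 4, 4]
print(max(dp))  # 4
```

Taking max(dp) rather than dp[-1] matters: the longest subsequence does not have to end at the last element.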

Optimizing with Patience Sorting and Binary Search

While the DP solution is a significant improvement, we can do far better with an O(n log n) approach based on a technique called patience sorting. The key is to maintain an auxiliary array that holds, for each subsequence length, the smallest possible tail element among all increasing subsequences of that length.

Here’s the algorithm:

  1. Initialize an empty array tails.
  2. For each number x in the input sequence:

     a. If x is larger than all elements in tails, append x. This extends the longest subsequence we've found.
     b. Otherwise, find the smallest element in tails that is greater than or equal to x using binary search, and replace it with x. This step maintains the invariant that tails[i] holds the smallest possible tail for an increasing subsequence of length i+1.

The length of the tails array at the end is the length of the LIS. Crucially, the tails array itself is always sorted, enabling the binary search step.

Example Trace: Using nums = [10, 9, 2, 5, 3, 7, 101, 18].

  • Process 10: tails = [10]
  • Process 9: Binary search finds 10 >= 9. Replace: tails = [9]
  • Process 2: Binary search finds 9 >= 2. Replace: tails = [2]
  • Process 5: 5 > 2, append: tails = [2, 5]
  • Process 3: Binary search finds 5 >= 3. Replace: tails = [2, 3]
  • Process 7: 7 > 3, append: tails = [2, 3, 7]
  • Process 101: 101 > 7, append: tails = [2, 3, 7, 101]
  • Process 18: Binary search finds 101 >= 18. Replace: tails = [2, 3, 7, 18]

Final LIS length is 4. Notice how replacing 101 with 18 doesn't change the length but allows for potentially longer subsequences later (though none occur here). This algorithm runs in O(n log n) time because each of the n elements requires one O(log n) binary search over tails.
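The trace above can be reproduced with Python's standard bisect module. This is a minimal sketch, and the function name lis_tails is hypothetical. bisect_left returns the position of the leftmost element greater than or equal to x, which is the search that step 2b describes. When x is larger than everything in tails, it returns len(tails), which covers step 2a.

```python
from bisect import bisect_left


def lis_tails(nums):
    """Return the final tails array; its length is the LIS length. O(n log n) time."""
    tails = []
    for x in nums:
        pos = bisect_left(tails, x)  # leftmost position with tails[pos] >= x
        if pos == len(tails):
            tails.append(x)          # x extends the longest subsequence so far
        else:
            tails[pos] = x           # x is a smaller tail for length pos + 1
    return tails


tails = lis_tails([10, 9, 2, 5, 3, 7, 101, 18])
print(tails)       # [2, 3, 7, 18]
print(len(tails))  # 4
```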

Reconstructing the Subsequence

Finding the length is often sufficient, but many applications require you to output the actual subsequence. Reconstruction is straightforward with the DP solution: keep a prev array that records, for each index i, the index j that produced dp[i], then trace back from the index with the maximum dp value. With the O(n log n) method, it's trickier because the tails array does not store the subsequence directly.
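A sketch of the DP reconstruction might look like the following. The function name lis_dp_reconstruct and the use of -1 as a "no predecessor" marker are assumptions made for illustration. When several subsequences share the maximum length, this version returns only one of them.

```python
def lis_dp_reconstruct(nums):
    """Return one longest strictly increasing subsequence using O(n^2) DP."""
    if not nums:
        return []
    n = len(nums)
    dp = [1] * n
    prev = [-1] * n  # prev[i]: index preceding i in the best subsequence, -1 if none
    for i in range(n):
        for j in range(i):
            if nums[j] < nums[i] and dp[j] + 1 > dp[i]:
                dp[i] = dp[j] + 1
                prev[i] = j
    # Start from the first index holding the maximum dp value and walk backwards.
    i = max(range(n), key=dp.__getitem__)
    result = []
    while i != -1:
        result.append(nums[i])
        i = prev[i]
    return result[::-1]


print(lis_dp_reconstruct([10, 9, 2, 5, 3, 7, 101, 18]))  # [2, 5, 7, 101]
```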

To reconstruct using the optimized method, you need additional bookkeeping during the binary search step. Alongside the tails array, maintain a tails_indices array that stores, for each position in tails, the index in the original array of the element currently held there. Also maintain a prev array with one entry per input element. Whenever an element is placed at position k of tails, whether by appending or by replacing, set its predecessor to the index stored at tails_indices[k-1] (or to "none" when k = 0), then record its own index in tails_indices[k]. Once all elements are processed, backtrack from the index stored in the last position of tails_indices through the prev links to build the LIS in reverse order. The extra work per element is constant, so the runtime stays O(n log n).
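The bookkeeping described above can be sketched as follows. The array names tails_indices and prev follow the text; the function name lis_reconstruct and the -1 sentinel for "no predecessor" are assumptions.

```python
from bisect import bisect_left


def lis_reconstruct(nums):
    """Return one longest strictly increasing subsequence in O(n log n) time."""
    tails = []               # tails[k]: smallest tail value for length k + 1
    tails_indices = []       # tails_indices[k]: index in nums of tails[k]
    prev = [-1] * len(nums)  # prev[i]: index of the element before nums[i], -1 if none
    for i, x in enumerate(nums):
        pos = bisect_left(tails, x)
        if pos > 0:
            # x follows whatever currently ends the best subsequence of length pos
            prev[i] = tails_indices[pos - 1]
        if pos == len(tails):
            tails.append(x)
            tails_indices.append(i)
        else:
            tails[pos] = x
            tails_indices[pos] = i
    # Walk the predecessor links back from the tail of the longest subsequence.
    result = []
    i = tails_indices[-1] if tails_indices else -1
    while i != -1:
        result.append(nums[i])
        i = prev[i]
    return result[::-1]


print(lis_reconstruct([10, 9, 2, 5, 3, 7, 101, 18]))  # [2, 3, 7, 18]
print(lis_reconstruct([2, 5, 1]))                     # [2, 5]
```

The second call shows why the prev links are needed: the final tails array for [2, 5, 1] is [1, 5], which is not a subsequence of the input, yet the backtrack still recovers [2, 5].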

Handling Variants and Key Applications

The classic LIS problem asks for a strictly increasing subsequence. Common variants include:

  • Longest Non-Decreasing Subsequence: Here, duplicates are allowed in the subsequence. The adjustment is simple: in the DP solution, change the condition to nums[j] <= nums[i]. In the patience sorting solution, during binary search, find the smallest element in tails that is strictly greater than x (not greater than or equal), so that equal elements extend the subsequence rather than replacing an existing tail.
  • LIS in 2D (Envelopes Problem): A powerful extension involves sorting one dimension and then finding the LIS on the other, a common pattern in scheduling and geometric packing problems.
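Both variants can be sketched on top of the same tails technique. The function names below are hypothetical. The envelopes sketch also relies on a detail the text does not spell out: assuming envelopes must be strictly larger in both dimensions to nest, ties in the first dimension are sorted by descending second dimension, so that two envelopes of equal width can never both be picked by the strict LIS on heights.

```python
from bisect import bisect_left, bisect_right


def longest_non_decreasing_length(nums):
    """Length of the longest non-decreasing subsequence (duplicates allowed)."""
    tails = []
    for x in nums:
        pos = bisect_right(tails, x)  # leftmost position with tails[pos] > x
        if pos == len(tails):
            tails.append(x)           # equal elements extend the subsequence
        else:
            tails[pos] = x
    return len(tails)


def max_nested_envelopes(envelopes):
    """Max number of (width, height) pairs that nest strictly in both dimensions."""
    # Width ascending; for equal widths, height descending (see assumption above).
    ordered = sorted(envelopes, key=lambda e: (e[0], -e[1]))
    tails = []
    for _, height in ordered:
        pos = bisect_left(tails, height)  # strict LIS on heights
        if pos == len(tails):
            tails.append(height)
        else:
            tails[pos] = height
    return len(tails)


print(longest_non_decreasing_length([1, 3, 3, 2, 3]))             # 4
print(max_nested_envelopes([(5, 4), (6, 4), (6, 7), (2, 3)]))     # 3
```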

The applications of LIS are vast. In operations research, it models optimal scheduling of jobs with precedence constraints. In computational biology, it appears in sequence alignment. In file systems, it's related to the "minimum number of increasing subsequences" problem. Mastering LIS equips you with a versatile tool for combinatorial optimization challenges where you need to select an ordered subset under constraints.

Common Pitfalls

  1. Confusing Subsequence with Subarray: A subsequence does not need to be contiguous, while a subarray does. You cannot apply sliding window techniques directly to LIS. Always remember you can "skip" elements in the original sequence.
  2. Incorrect Binary Search Condition in the O(n log n) method: For the standard strictly increasing LIS, you must find the leftmost element in tails that is >= x. Using a strictly greater than (>) condition fails whenever the input contains duplicates: for [2, 2] it appends the second 2 and reports length 2 instead of 1. For the non-decreasing variant, you need to find the leftmost element > x. Mixing these up is a frequent source of subtle bugs.
  3. Assuming the tails Array is the LIS: In the optimized algorithm, the tails array at the end contains the smallest possible tail for each length, but it is not necessarily a valid subsequence of the input. In the example above the final tails [2, 3, 7, 18] happens to be one, but for the input [2, 5, 1] the final tails array is [1, 5], even though 1 comes after 5 in the input. Use it only for calculating the length unless you've implemented the full reconstruction bookkeeping.
  4. Overlooking Reconstruction Complexity: In an interview setting, stating that you can find the length in O(n log n) time is good, but being asked to reconstruct the subsequence and failing to describe the additional arrays (prev, tails_indices) shows an incomplete understanding. Always be prepared to discuss both.
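Pitfalls 2 and 3 are easy to check with a small experiment. In this sketch, the helper build_tails is a hypothetical name; it takes the search function as a parameter so the two binary search conditions can be compared side by side.

```python
from bisect import bisect_left, bisect_right


def build_tails(nums, search):
    """Run the tails algorithm with the given bisect function and return tails."""
    tails = []
    for x in nums:
        pos = search(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return tails


# Pitfall 2: with duplicates, the two search conditions give different lengths.
print(len(build_tails([2, 2], bisect_left)))   # 1 (strictly increasing, correct)
print(len(build_tails([2, 2], bisect_right)))  # 2 (only valid for non-decreasing)

# Pitfall 3: tails has the right length but is not a subsequence of the input.
print(build_tails([2, 5, 1], bisect_left))     # [1, 5]
```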

Summary

  • The Longest Increasing Subsequence (LIS) problem is a classic dynamic programming challenge with a more efficient O(n log n) solution based on patience sorting.
  • The DP solution defines dp[i] as the LIS length ending at index i, with a recurrence that checks all previous smaller elements.
  • The O(n log n) solution maintains a tails array, using binary search to keep the smallest possible tail for each subsequence length; on its own it yields only the LIS length.
  • Reconstructing the actual subsequence requires additional arrays to track predecessor links during the optimized algorithm's execution.
  • Variants like the non-decreasing subsequence require careful adjustment of the comparison operators in both DP and binary search conditions, and the techniques connect directly to practical problems in scheduling and optimization.
