Math AI HL: Algorithms and Optimisation
In a world driven by logistics, network design, and efficient routing, the ability to find optimal or near-optimal solutions to complex problems is a superpower. Mastering key graph theory algorithms from the IB Math AI HL syllabus enables you to apply specific, step-by-step methods to solve real-world optimisation problems, understanding not just the how but also the why behind their effectiveness and limitations.
The Travelling Salesman Problem and the Nearest Neighbour Algorithm
The Travelling Salesman Problem (TSP) is a classic optimisation challenge: given a list of cities and the distances between each pair, find the shortest possible route that visits each city exactly once and returns to the origin city. For a small number of vertices (cities), you could check all possible routes, but with $n$ cities and symmetric distances there are $(n-1)!/2$ distinct routes, so this brute-force method becomes computationally infeasible as the number of cities grows. This is where heuristic algorithms, which provide good but not necessarily optimal solutions, become essential.
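To see why brute force breaks down, here is a minimal Python sketch that checks every route on a small graph and counts how many distinct routes exist for larger ones. The city names P, Q, R, S and all distances are hypothetical values chosen only for illustration, and the function names are not from the syllabus.

```python
from itertools import permutations
from math import factorial


def count_distinct_tours(n):
    """Distinct tours on n cities with symmetric distances: (n - 1)! / 2."""
    return factorial(n - 1) // 2 if n >= 3 else 1


def brute_force_tsp(dist, start):
    """Try every ordering of the other cities and keep the cheapest cycle."""
    others = [v for v in dist if v != start]
    best_route, best_length = None, float("inf")
    for order in permutations(others):
        route = (start, *order, start)
        length = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length


# Hypothetical symmetric distances (illustrative values only).
dist = {
    "P": {"Q": 4, "R": 7, "S": 3},
    "Q": {"P": 4, "R": 2, "S": 6},
    "R": {"P": 7, "Q": 2, "S": 5},
    "S": {"P": 3, "Q": 6, "R": 5},
}

route, length = brute_force_tsp(dist, "P")
print(route, length)  # ('P', 'Q', 'R', 'S', 'P') 14

for n in (5, 10, 15):
    print(n, count_distinct_tours(n))  # 12, then 181440, then 43589145600
```

Four cities give only three distinct routes, but fifteen cities already give more than 43 billion, which is why a heuristic is needed.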
The nearest neighbour algorithm is a simple, intuitive heuristic for finding an approximate solution to the TSP. It is a greedy algorithm, meaning it makes the locally optimal choice at each step. You start at a chosen starting vertex. From your current vertex, you travel to the nearest unvisited vertex. You repeat this process until all vertices have been visited, then finally return to the starting vertex.
Worked Example: Consider a complete graph with four vertices (A, B, C, D) and the following weighted adjacency matrix (distances):
Using the nearest neighbour algorithm starting at A:
- Start at A. The nearest unvisited vertex is B (distance 10). Go to B.
- At B. Unvisited: C, D. Nearest is D (distance 25). Go to D.
- At D. Only unvisited is C (distance 30). Go to C.
- All vertices visited. Return from C to A (distance 15).
The cycle is A–B–D–C–A. Its total weight is $10 + 25 + 30 + 15 = 80$. Is this the absolute shortest route? With four vertices there are only three distinct cycles to compare, so you could check them all, but for large problems this algorithm provides a quick, reasonable answer without that check. A critical limitation is that its result depends on the starting vertex, and it can miss globally optimal routes by making short-sighted early choices.
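The steps of the worked example can be sketched in Python as follows. The distances A–B = 10, B–D = 25, D–C = 30 and C–A = 15 are the ones quoted in the steps above. The remaining two entries (A–D = 20 and B–C = 35) are assumptions made only so that the sketch runs and each choice matches the steps; they are not taken from the original matrix.

```python
def nearest_neighbour(dist, start):
    """Greedy TSP heuristic: always move to the closest unvisited vertex."""
    route = [start]
    unvisited = set(dist) - {start}
    total = 0
    current = start
    while unvisited:
        # sorted() makes any tie-break alphabetical, so the run is repeatable.
        nxt = min(sorted(unvisited), key=lambda v: dist[current][v])
        total += dist[current][nxt]
        route.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    # Close the cycle by returning to the starting vertex.
    total += dist[current][start]
    route.append(start)
    return route, total


# A-B, B-D, C-D and A-C come from the worked example.
# A-D = 20 and B-C = 35 are ASSUMED values, chosen only for illustration.
dist = {
    "A": {"B": 10, "C": 15, "D": 20},
    "B": {"A": 10, "C": 35, "D": 25},
    "C": {"A": 15, "B": 35, "D": 30},
    "D": {"A": 20, "B": 25, "C": 30},
}

route, total = nearest_neighbour(dist, "A")
print(route, total)  # ['A', 'B', 'D', 'C', 'A'] 80
```

Note that the final `total += dist[current][start]` line is the step most often forgotten by hand: the cycle is not complete until it returns to the start.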
Finding Minimum Spanning Trees: Kruskal's and Prim's Algorithms
A tree is a connected graph with no cycles. A spanning tree of a connected graph is a subgraph that includes all the vertices of the original graph and is a tree. A minimum spanning tree (MST) is a spanning tree with the smallest possible total edge weight. MSTs are vital for designing cost-efficient networks, like connecting houses to utilities or linking computer networks.
Two primary algorithms find an MST: Kruskal's algorithm and Prim's algorithm. Both are greedy, but they approach the problem differently.
Kruskal's Algorithm works by building the MST one edge at a time, always choosing the smallest available edge that does not create a cycle.
1. Sort all edges in the graph in ascending order of weight.
2. Begin with a forest where each vertex is its own separate tree.
3. Take the smallest edge not yet considered. If it connects two different trees, add it to the forest. If it connects two vertices already in the same tree (which would create a cycle), reject it.
4. Repeat step 3 until you have added $V - 1$ edges, where $V$ is the number of vertices.
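The steps above can be sketched in Python as follows. This is one possible implementation, not the only one: it tracks which tree each vertex belongs to with a simple union-find structure, and the five-vertex graph and its weights are hypothetical.

```python
def kruskal(vertices, edges):
    """Return (accepted edges, rejected edges) for edges given as (weight, u, v)."""
    parent = {v: v for v in vertices}  # each vertex starts as its own tree

    def find(v):
        """Follow parent links to the representative of v's tree."""
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # shorten the chain as we go
            v = parent[v]
        return v

    tree, rejected = [], []
    for weight, u, v in sorted(edges):  # ascending order of weight
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            rejected.append((weight, u, v))  # same tree: would form a cycle
        else:
            parent[root_u] = root_v  # merge the two trees
            tree.append((weight, u, v))
        if len(tree) == len(vertices) - 1:
            break  # a spanning tree has exactly V - 1 edges
    return tree, rejected


# Hypothetical weighted graph (illustrative values only).
vertices = ["A", "B", "C", "D", "E"]
edges = [
    (1, "A", "B"), (2, "B", "C"), (3, "A", "C"), (4, "C", "D"),
    (5, "B", "D"), (6, "D", "E"), (7, "C", "E"),
]

tree, rejected = kruskal(vertices, edges)
print(tree)      # [(1, 'A', 'B'), (2, 'B', 'C'), (4, 'C', 'D'), (6, 'D', 'E')]
print(rejected)  # [(3, 'A', 'C'), (5, 'B', 'D')]
print(sum(w for w, _u, _v in tree))  # 13
```

The `rejected` list mirrors what an exam answer should show: edge A–C is rejected because A and C are already joined through B, and B–D is rejected because B and D are already joined through C.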
Prim's Algorithm builds the MST from a starting vertex, growing a single tree.
1. Choose any starting vertex. This is your initial tree.
2. From all edges that connect a vertex in your tree to a vertex not yet in the tree, select the one with the smallest weight.
3. Add this edge and its new vertex to the tree.
4. Repeat steps 2 and 3 until all vertices are in the tree.
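A minimal Python sketch of these steps is shown below. It assumes the graph is stored as a dictionary of neighbours and uses the standard-library `heapq` module to pick the smallest edge leaving the tree; the graph itself is hypothetical.

```python
import heapq


def prim(graph, start):
    """Grow a single tree from start; return edges as (from, to, weight)."""
    in_tree = {start}
    # Candidate edges leaving the tree, stored as (weight, from, to).
    frontier = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(frontier)
    tree = []
    while frontier and len(in_tree) < len(graph):
        weight, u, v = heapq.heappop(frontier)  # smallest candidate edge
        if v in in_tree:
            continue  # both ends already in the tree, so skip it
        in_tree.add(v)
        tree.append((u, v, weight))
        for nxt, w in graph[v].items():
            if nxt not in in_tree:
                heapq.heappush(frontier, (w, v, nxt))
    return tree


# Hypothetical weighted graph (illustrative values only).
graph = {
    "A": {"B": 1, "C": 3},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 3, "B": 2, "D": 4, "E": 7},
    "D": {"B": 5, "C": 4, "E": 6},
    "E": {"C": 7, "D": 6},
}

tree = prim(graph, "A")
print(tree)  # [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 4), ('D', 'E', 6)]
print(sum(w for _u, _v, w in tree))  # 13
```

Unlike Kruskal's algorithm, no explicit cycle check is needed: an edge is only ever added when exactly one of its endpoints is already in the tree.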
Comparison: Kruskal's algorithm considers edges globally, while Prim's grows from a local root. For a graph with $E$ edges and $V$ vertices, their efficiency is usually described by their time complexity, which for good implementations is $O(E \log V)$. Both will always yield a correct MST for a connected, weighted graph. In an exam, you must be able to execute both algorithms step-by-step on a given graph, clearly showing your selection order and rejecting edges that form cycles (for Kruskal).
Solving the Chinese Postman Problem
While the TSP requires visiting each vertex once, the Chinese Postman Problem (CPP) requires traversing every edge in a graph at least once and returning to the start, with the goal of minimising the total distance travelled. It models tasks like street sweeping, postal delivery, or garbage collection where every road (edge) must be covered.
The solution strategy depends on the graph's vertices. The degree of a vertex is the number of edges incident to it.
- If all vertices have even degree, the graph is Eulerian. An Eulerian circuit (a cycle traversing every edge exactly once) exists and is the optimal route. The postman can simply follow this circuit.
- If some vertices have odd degree, no Eulerian circuit exists: the graph is semi-Eulerian (exactly two odd vertices) or non-Eulerian (more than two). The key insight is that in any graph, the number of odd-degree vertices is always even, so they can be paired up. The solution pairs these odd vertices and duplicates the edges along the shortest path between each pair. This makes every vertex even-degree, allowing you to find an Eulerian circuit on the augmented graph.
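Classifying a graph therefore starts with counting degrees. The following Python sketch does this for a hypothetical graph given as a list of weighted edges; the function name and the data are illustrative assumptions.

```python
from collections import Counter


def odd_vertices(edges):
    """Return the vertices of odd degree for edges given as (u, v, weight)."""
    degree = Counter()
    for u, v, _w in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


# Hypothetical graph: every vertex is joined to the other three.
edges = [
    ("A", "B", 4), ("A", "C", 3), ("A", "D", 9),
    ("B", "C", 5), ("B", "D", 2), ("C", "D", 7),
]

odd = odd_vertices(edges)
print(odd)            # ['A', 'B', 'C', 'D']
print(len(odd) % 2)   # 0, because the number of odd vertices is always even

# A simple triangle has all even degrees, so it is Eulerian.
print(odd_vertices([("X", "Y", 1), ("Y", "Z", 1), ("Z", "X", 1)]))  # []
```

An empty list means the postman can follow an Eulerian circuit directly; a non-empty list means some edges must be repeated.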
Algorithm for the CPP:
1. Identify all vertices of odd degree.
2. Consider all possible pairings of these odd vertices.
3. For each pairing, find the sum of the weights of the shortest paths between the paired vertices.
4. Choose the pairing with the minimum total weight.
5. Duplicate every edge along each shortest path in this optimal pairing.
6. The augmented graph now has all even-degree vertices. Find an Eulerian circuit for this new graph; this is the optimal postman route.
The total route length will be the sum of all original edge weights plus the total weight of the duplicated paths from step 4.
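The pairing steps and the length formula can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the graph is hypothetical, shortest paths are found with the Floyd-Warshall method (a choice made here, not one prescribed by the syllabus), and the code stops at the route length rather than constructing the Eulerian circuit itself.

```python
def all_pairs_shortest(vertices, edges):
    """Shortest distance between every pair of vertices (Floyd-Warshall)."""
    inf = float("inf")
    d = {u: {v: (0 if u == v else inf) for v in vertices} for u in vertices}
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def pairings(items):
    """Yield every way to split an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def chinese_postman_length(vertices, edges):
    """Return (best pairing of odd vertices, total route length)."""
    degree = {v: 0 for v in vertices}
    for u, v, _w in edges:
        degree[u] += 1
        degree[v] += 1
    odd = [v for v in vertices if degree[v] % 2 == 1]
    d = all_pairs_shortest(vertices, edges)
    best = min(pairings(odd), key=lambda p: sum(d[a][b] for a, b in p))
    extra = sum(d[a][b] for a, b in best)
    return best, sum(w for _u, _v, w in edges) + extra


# Hypothetical graph (illustrative values only); all four vertices are odd.
vertices = ["A", "B", "C", "D"]
edges = [
    ("A", "B", 4), ("A", "C", 3), ("A", "D", 9),
    ("B", "C", 5), ("B", "D", 2), ("C", "D", 7),
]

best, length = chinese_postman_length(vertices, edges)
print(best, length)  # [('A', 'C'), ('B', 'D')] 35
```

Here the three pairings cost 11 (AB with CD), 5 (AC with BD) and 11 (AD with BC), so the route length is the edge total of 30 plus 5. The shortest A to D distance is 6 via B, not the direct edge of 9, which is why shortest paths rather than direct edges must be used.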
Comparing Algorithm Efficiency and Optimality
A crucial part of your analysis is understanding the trade-offs between different algorithms. Optimality refers to whether an algorithm guarantees the absolute best solution. Efficiency (or complexity) refers to the computational resources (like time) it requires as the problem size grows.
- Nearest Neighbour (for TSP): It is not optimal. It is a heuristic that provides an approximate solution quickly. Its time complexity is roughly $O(V^2)$ for $V$ vertices, which is efficient but sacrifices guaranteed optimality for speed. It is useful for large problems where an exact solution is infeasible.
- Kruskal's & Prim's (for MST): Both are optimal: they always find a true minimum spanning tree. They are also efficient, with good implementations running in $O(E \log V)$ time for a graph with $E$ edges and $V$ vertices. For dense graphs (many edges), Prim's can be slightly more efficient; for sparse graphs, they are comparable.
- Chinese Postman Algorithm: The costly step is choosing the best pairing of odd vertices. With 2 odd vertices there is only 1 pairing and with 4 there are 3, but with 6 there are 15 and with 8 there are 105, so checking every pairing by hand quickly becomes impractical. For the scale of problems in IB, it is manageable. The resulting route on the augmented graph is optimal for traversing all edges.
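The trade-off between the heuristic and an exact search can be seen on a small case. The Python sketch below runs nearest neighbour from every starting vertex and compares the results with the brute-force optimum. The cities W, X, Y, Z and their distances are hypothetical, chosen so that the heuristic's dependence on the starting vertex is visible.

```python
from itertools import permutations


def nearest_neighbour_length(dist, start):
    """Length of the cycle found by the nearest neighbour heuristic."""
    unvisited = set(dist) - {start}
    current, total = start, 0
    while unvisited:
        nxt = min(sorted(unvisited), key=lambda v: dist[current][v])
        total += dist[current][nxt]
        unvisited.remove(nxt)
        current = nxt
    return total + dist[current][start]  # include the return to the start


def optimal_length(dist):
    """Length of the shortest cycle, found by checking every route."""
    start, *others = sorted(dist)
    best = float("inf")
    for order in permutations(others):
        route = (start, *order, start)
        best = min(best, sum(dist[a][b] for a, b in zip(route, route[1:])))
    return best


# Hypothetical symmetric distances (illustrative values only).
dist = {
    "W": {"X": 1, "Y": 5, "Z": 20},
    "X": {"W": 1, "Y": 2, "Z": 6},
    "Y": {"W": 5, "X": 2, "Z": 3},
    "Z": {"W": 20, "X": 6, "Y": 3},
}

results = {start: nearest_neighbour_length(dist, start) for start in sorted(dist)}
optimal = optimal_length(dist)
print(results)  # {'W': 26, 'X': 15, 'Y': 26, 'Z': 26}
print(optimal)  # 15
```

In this example only the run starting at X finds the optimal cycle; the other three are forced to finish along the expensive W–Z edge. Running the heuristic from every vertex and keeping the best result is a cheap way to improve the answer, though it still carries no guarantee of optimality.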
In summary, you choose an algorithm based on the problem constraint (visit vertices vs. edges) and the trade-off you are willing to make between computational time and solution quality.
Common Pitfalls
- Confusing TSP and CPP Objectives: The most frequent conceptual error is mixing up which problem requires visiting all vertices (TSP) and which requires traversing all edges (CPP). Remember: Salesmen visit cities (vertices), postmen walk down streets (edges).
- Incorrect Cycle Checking in Kruskal's Algorithm: When adding an edge, you must check if its endpoints are already connected in the current forest. Students often incorrectly reject an edge only if it directly forms a small cycle, missing larger connected components. Methodically track which vertices are in which growing tree.
- Forgetting to Return to Start in TSP: The TSP cycle must return to the starting city. A common mistake when applying the nearest neighbour algorithm is to stop after visiting the last new vertex, forgetting to add the weight of the edge that returns to the start.
- Mishandling Odd Vertices in CPP: When solving the CPP, you only add duplicate edges along the shortest path between paired odd vertices, not necessarily the direct edge (if one even exists). Also, after adding duplicates, you must work with the augmented graph to find the Eulerian circuit, not the original one.
Summary
- The nearest neighbour algorithm is a greedy heuristic that provides an approximate, non-optimal solution to the Travelling Salesman Problem (TSP) by always moving to the closest unvisited city.
- Kruskal's and Prim's algorithms are both optimal, greedy methods for finding a minimum spanning tree (MST). Kruskal's builds by adding the smallest safe edge globally, while Prim's grows a single tree from a starting vertex.
- The Chinese Postman Problem (CPP) is solved by identifying vertices of odd degree, pairing them optimally via shortest paths, and duplicating those path edges to create an Eulerian graph, which can then be traversed completely.
- Algorithm optimality (guaranteeing the best answer) and efficiency (time/resources required) are key trade-offs. MST algorithms are optimal and efficient; the nearest neighbour heuristic is efficient but not optimal.
- Always double-check the fundamental objective of the problem (vertex vs. edge coverage) and meticulously follow each algorithm's step-by-step rules to avoid procedural errors.