Data Structures: Trees and Binary Search Trees

Trees are one of the most practical ways to represent hierarchical data: a file system, an organization chart, HTML in a browser, and many indexing strategies in databases. Unlike arrays or linked lists, a tree structure captures parent-child relationships directly, making it both expressive and efficient for many real-world tasks.

Binary Search Trees (BSTs) take this idea further by organizing data so that search, insertion, and deletion can be fast, often around $O (lo g n)$ when the tree remains reasonably balanced. Understanding trees and BSTs is foundational for writing efficient software that manipulates structured data at scale.

What a Tree Data Structure Represents

A tree is a connected, acyclic structure made of nodes and edges. Each node stores a value (or key) and references to other nodes.

Key terminology:

Root: the topmost node.
Parent / Child: a node directly above another is its parent; directly below is its child.
Leaf: a node with no children.
Subtree: any node plus all of its descendants.
Depth: distance from the root to a node.
Height: longest path from a node down to a leaf; the height of the tree influences performance.

Many tree types exist, but the most common teaching and interview focus is the binary tree, where each node has at most two children, typically called left and right.

Tree Traversals: Systematic Ways to Visit Nodes

A traversal is an algorithmic pattern for visiting every node in a tree in a specific order. Traversals are not just academic; they determine how you serialize a tree, evaluate expressions, or generate sorted output from a BST.

Depth-First Traversals (DFS)

DFS explores down one branch before backtracking.

Inorder (Left, Node, Right)

In a BST, inorder traversal visits keys in sorted order.
Common use: output elements in ascending sequence.

Preorder (Node, Left, Right)

Visits a node before its children.
Common use: copying a tree, or serializing structure where you need the root first.

Postorder (Left, Right, Node)

Visits children before the node.
Common use: deleting/freeing nodes safely, evaluating expression trees where operands must come before an operator.

Breadth-First Traversal (BFS) or Level Order

BFS visits nodes level by level from the root outward, typically using a queue.

Common use: shortest path in an unweighted tree-like structure, or producing a “by-level” representation of a hierarchy.

Binary Search Trees: Ordered Trees for Efficient Lookup

A Binary Search Tree is a binary tree with a strict ordering rule:

All keys in the left subtree are less than the node’s key.
All keys in the right subtree are greater than the node’s key.
This property holds recursively for every subtree.

Because each comparison allows you to discard about half of the remaining search space, operations can be efficient when the tree height is small.

Why BSTs Can Be $O (lo g n)$

If a BST is balanced, its height is proportional to $lo g_{2} (n)$ . Searching for a key takes one comparison per level, so time is roughly $O (lo g n)$ .

However, if data is inserted in already-sorted order, a naive BST can degrade into a linked list with height $n$ , turning search, insert, and delete into $O (n)$ . This is why balanced trees matter.

Core BST Operations

Search

To find a value:

Compare with the current node.
If equal, you’re done.
If smaller, go left; if larger, go right.
Repeat until found or you hit a null child.

The number of steps equals the height of the tree.

Insert

Insertion follows the same path as search:

Traverse left/right based on comparisons.
Insert the new node at the first null position where it belongs.

Insertion maintains the BST ordering property automatically, but it does not guarantee balance.

Delete

Deletion is the most subtle operation because it must preserve ordering. There are three cases:

Deleting a leaf: remove it directly.
Deleting a node with one child: splice the node out by linking its parent directly to its child.
Deleting a node with two children: replace the node’s key with either:

its inorder successor (smallest key in the right subtree), or
its inorder predecessor (largest key in the left subtree),

then delete that successor/predecessor node, which will fall into case 1 or 2.

BST deletion is a good example of an algorithm that is easy to describe but error-prone in implementation, especially around pointer/reference updates.

Balanced Trees: Keeping BST Performance Predictable

Balanced trees are designed to keep height close to $O (lo g n)$ , preventing the worst-case degeneration. Two widely used self-balancing BST variants are AVL trees and Red-Black trees.

AVL Trees

An AVL tree maintains a strict balance condition: for every node, the heights of left and right subtrees differ by at most 1.

Pros: very tight balance, excellent lookup performance.
Cons: more frequent rebalancing rotations on inserts and deletes.

AVL trees use rotations (local restructuring) to restore balance after updates. Rotations preserve the inorder sequence of keys, so the BST property remains intact.

Red-Black Trees

A Red-Black tree enforces balance through coloring rules that constrain how skewed the tree can get.

Pros: fewer rotations than AVL in many workloads; strong practical performance; common in standard libraries.
Cons: balance is less strict than AVL, so lookups can be slightly slower on average, though still $O (lo g n)$ .

In practice, Red-Black trees are often used for ordered maps/sets because they provide predictable performance with efficient updates.

Choosing Trees and BSTs in Practice

Trees are not always the default best choice; they are best when you need structure or ordering.

Use trees when:

You have inherently hierarchical data (directories, menus, DOM).
You need ordered operations: “give me the next larger key,” range queries, sorted iteration.
You need dynamic updates: frequent inserts and deletes while maintaining sorted order.

A BST (or balanced BST) is often preferable to hashing when:

You need sorted traversal or range queries (for example, keys between 100 and 200).
You need operations like predecessor/successor efficiently.
You want predictable performance without relying on hash quality.

At the same time, if you only need membership tests and have no need for ordering, hash tables are often faster in average-case constant time.

A Practical Mental Model

A tree is a way to model “contains” and “depends on” relationships. A BST adds the idea of “everything smaller goes left; everything larger goes right,” which turns hierarchy into an efficient search structure.

When you hear $O (lo g n)$ for a BST, remember the hidden condition: the tree must stay short. Balanced trees like AVL and Red-Black exist to make that promise reliable, ensuring that search, insert, and delete remain efficient even as data grows and changes over time.

Data Structures: Trees and Binary Search Trees

Data Structures: Trees and Binary Search Trees

What a Tree Data Structure Represents

Tree Traversals: Systematic Ways to Visit Nodes

Depth-First Traversals (DFS)

Inorder (Left, Node, Right)

Preorder (Node, Left, Right)

Postorder (Left, Right, Node)

Breadth-First Traversal (BFS) or Level Order

Binary Search Trees: Ordered Trees for Efficient Lookup

Why BSTs Can Be O(logn)

Core BST Operations

Search

Insert

Delete

Balanced Trees: Keeping BST Performance Predictable

AVL Trees

Red-Black Trees

Choosing Trees and BSTs in Practice

A Practical Mental Model

Write better notes with AI

Why BSTs Can Be $O (lo g n)$