Binary search tree

Binary search tree #

A Binary Search Tree (BST) is another data structure used to store an associative array (e.g. a Java Map).

Definition. A BST is a binary tree such that each node’s label is:

  • larger than its left child’s label (if any), and
  • smaller than its right child’s label (if any).

Equivalently, a BST is a binary tree whose nodes are sorted w.r.t. in-order traversal.

Associative array as a binary search tree #

Recall that an associative array is a finite set of key-value pairs. Again, let us assume keys have type $K$, and that values have type $V$.

When a BST implements an associative array:

  • each node of the tree stores a key-value pair,
  • nodes are sorted by key, using their natural order or any other total order over $K$.

Lookup #

Recall that the lookup operation for an associative array:

  • takes as input a key $k$, and
  • returns the value $v$ associated to this key (if any).

Over a BST, this operation is similar to performing a binary search in a sorted array:

  • If the tree is empty, then it (trivially) does not contain an entry with key $k$.

  • Otherwise, compare $k$ with the label $l$ of the tree’s root:

    • if $k = l$, then return the value stored in the root,
    • if $k < l$, then repeat over the left subtree,
    • if $k > l$, then repeat over the right subtree.

Or in pseudocode:

V lookup(Node root, K key){
    // base case: empty tree
    if(root == null){
        return null;
    }
    // inductive case
    if(root.key == key){
        return root.value;
    }
    if(root.key > key){
        return lookup(root.leftChild);
    }
    return lookup(root.right);
}

If $h$ is the height of the tree, what is the asymptotic worst case running time of this algorithm, expressed as a function of $h$?

$\Theta(h)$

Insert and delete #

Insertion and deletion can also be performed in linear time in the height $h$ of the tree.

Benefits #

Recall that a hash table does not preserve the order of an associative array’s keys (at best, we saw that Java’s LinkedHashMap preserves the order of insertion).

Instead, keys in a BST remain sorted, after any number of insertion or deletion operations. Besides, any total order can be used for this purpose (e.g. alphabetical order if the keys are strings).

Self balancing trees #

A self balancing tree is a specific type of BSTm which guarantees that the tree remains balanced after an insertion or deletion, meaning that its height remains logarithmic in the number of nodes.

Two widely used types of self balancing trees are red-black trees and AVL trees.

in Java #

  • The class java.util.TreeMap implements the interface java.util.Map as a red-black-tree.

  • The class java.util.TreeSet implements the interface java.util.Set as a red-black-tree.

By default, keys are sorted according to their natural order. But both classes provide an alternative constructor that accept an arbitrary Comparator over $K$.