thumb|right|251px|An example of an unbalanced tree; following the path from the root to a node takes an average of 3.27 node accesses
thumb|right|251px|The same tree after being height-balanced; the average path effort decreased to 3.00 node accesses
In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.
These operations when designed for a self-balancing binary search tree, contain precautionary measures against boundlessly increasing tree height, so that these abstract data structures receive the attribute "self-balancing".
For height-balanced binary trees, the height is defined to be logarithmic <math>O(\log n)</math> in the number <math>n</math> of items. This is the case for many binary search trees, such as AVL trees and red–black trees. Splay trees and treaps are self-balancing but not height-balanced, as their height is not guaranteed to be logarithmic in the number of items.
Self-balancing binary search trees provide efficient implementations for mutable ordered lists, and can be used for other abstract data structures such as associative arrays, priority queues and sets.
Overview
thumb|300px|Tree rotations are very common internal operations on self-balancing binary trees to keep perfect or near-to-perfect balance.
Most operations on a binary search tree (BST) take time directly proportional to the height of the tree, so it is desirable to keep the height small. A binary tree with height h can contain at most 2<sup>0</sup>+2<sup>1</sup>+···+2<sup>h</sup> = 2<sup>h+1</sup>−1 nodes. It follows that for any tree with n nodes and height h:
:<math>n\le 2^{h+1}-1</math>
And that implies:
:<math>h\ge\lceil\log_2(n+1)-1\rceil\ge \lfloor\log_2 n\rfloor</math>.
In other words, the minimum height of a binary tree with n nodes is rounded down; that is, <math> \lfloor \log_2 n\rfloor</math>. hash tables (<math>O(\log n)</math> compared to <math>O(n)</math>), but have worse average-case performance (<math>O(\log n)</math> compared to <math>O(1)</math>).
Self-balancing BSTs can be used to implement any algorithm that requires mutable ordered lists, to achieve optimal worst-case asymptotic performance. For example, if binary tree sort is implemented with a self-balancing BST, we have a very simple-to-describe yet asymptotically optimal <math>O(n \log n)</math> sorting algorithm. Similarly, many algorithms in computational geometry exploit variations on self-balancing BSTs to solve problems such as the line segment intersection problem and the point location problem efficiently. (For average-case performance, however, self-balancing BSTs may be less efficient than other solutions. Binary tree sort, in particular, is likely to be slower than merge sort, quicksort, or heapsort, because of the tree-balancing overhead as well as cache access patterns.)
Self-balancing BSTs are flexible data structures, in that it is easy to extend them to efficiently record additional information or perform new operations. For example, one can record the number of nodes in each subtree having a certain property, allowing one to count the number of nodes in a certain key range with that property in <math>O(\log n)</math> time. These extensions can be used, for example, to optimize database queries or other list-processing algorithms.
See also
- Search data structure
- Day–Stout–Warren algorithm
- Fusion tree
- Skip list
- Sorting
References
External links
- Dictionary of Algorithms and Data Structures: Height-balanced binary search tree
- GNU libavl, a LGPL-licensed library of binary tree implementations in C, with documentation
