Trie (pronounced "try") or prefix tree is a data structure used for retrieval of a key in a dataset of strings.
Google autocomplete suggest in action.
// Spell checker used in word processor.
Longest prefix matching algorithm uses Tries in Internet Protocol (IP) routing to select an entry from a forwarding table.
T9, which stands for Text on 9 keys, was used on phones to input texts during the late 1990s.
Tries is used to solve word games like Boggle efficiently by pruning the search space.
There are data structures, like balanced trees and hash tables, which give us the possibility to search for a word in a dataset of strings. Although hash table has
- Finding all keys with a common prefix.
- Enumerating a dataset of strings in lexicographical order.
Moreover, hash tables increase in size, there are lots of hash collisions and the search time complexity could deteriorate to
In this case, using a trie has only
Trie is a rooted tree and its nodes have the following fields:
- Maximum of
$R$ links to its children, where each link corresponds to one of$R$ character values from a dataset alphabet (e.g., 26 letters of the lowercase English alphabet). - Boolean field which specifies whether the node corresponds to the end of the key, or is just a key prefix.
Representation of a key "leet" in trie.
We insert a key by searching into the trie. We start from the root and search a link corresponding to the first key character.
- If the link exists, then we move down following the link and continue searching for the next key character.
- Else, we create a new node and link it with the parent node, matching the current key character. Repeat this step until we reach the end of the key and mark it as the end node.
Insertion of keys into a trie.
$m$ is the key length.
- Time complexity:
$O(m)$
In each iteration, we either examine or create a node in the trie till we reach the end of the key.
- Space complexity:
$O(m)$
In the worst case, newly inserted key doesn't share a prefix with the keys already inserted and we have to add
Each key is represented as a path from the root to an internal node or leaf. We start from the root and examine for a link corresponding to the key character.
- If the link exists, we move down following the link and continue searching for the next key character.
- If the link doesn't exist, and we are at the end of the key, and the node is marked as the end, then the key is present in the trie.
- Else, the key is not present in the trie.
Search for "leet" in trie.
- Time complexity:
$O(m)$ . - Space complexity:
$O(1)$ .