Understanding Trie
A trie, also called a prefix tree, is a type of search tree used to efficiently store and retrieve strings from a dictionary or a set. Unlike standard data structures, tries excel at operations where prefixes matter, making them ideal for tasks like autocomplete, spell checking, and even IP routing.
Why use a trie over a hashmap?
- Tries naturally handle multiple keys with shared prefixes, avoiding hash collisions.
- They allow fast retrieval of all keys that start with a given prefix, which hashmaps cannot do efficiently.
Challenges and optimizations
- Basic trie implementations can be memory-intensive, especially for large alphabets.
- To improve efficiency, techniques such as compression and bitwise representations are often used.
- A special type of trie, called a radix tree, further reduces memory usage by compactly storing common prefixes.
Beyond strings
- Tries aren’t limited to characters — they can store any ordered sequence, such as permutations of digits, shapes, or other elements, making them highly versatile in computational tasks.
Implementing Trie from scratch in c++
Initializing
- first step is to create the TrieNode itself. The root node value is just an empty string and there should be 26 children. One for each character in the alphabet.
- I’ll also add a flag called isEnd so that when we are at the end or last character we just stop and don’t traverse down anymore.
struct TrieNode {
bool isEnd;
TrieNode* children[26];
TrieNode() {
isEnd = false;
for (int i = 0; i < 26; i++) {
children[i] = nullptr;
}
}
};
- To define the TrieNode I am just going to use a struct. The difference between a struct and a class in c++ is that struct is public by default while the class is private by default.
Insertion
- to insert we just need to go to the respective child element from the root node, create a new node at the child node and keep repeating the process until we hit end of the word.
void insert(string word) {
TrieNode* node = root;
for (char w : word) {
int idx = w - 'a';
if (node->children[idx] == nullptr) {
node->children[idx] = new TrieNode();
}
node = node->children[idx];
}
node->isEnd = true;
}
Searching
- The searching process is quite similar as well. Instead of adding new nodes, we just check if we are able to traverse down to the isEnd flag, if we are then the word exists, otherwise it doesn’t and we return false.
bool search(string word) {
TrieNode* node = root;
for (char w: word){
int idx = w - 'a';
if (node->children[idx] == nullptr){
return false;
}
node = node->children[idx];
}
return node->isEnd;
}
- there’s also searching for the prefix but it works on the same idea so I wont go into the implementation details
Deletion
- This one is more tricky to implement as you have to carefully remove nodes without breaking other words.
- For eg - If you want to delete
"car", you cannot delete the'c'or'a'nodes, because"cat"and"care"still use them. - For that reason we can only delete leaf nodes in the trie. After removing the leaf node we just need to set the
isEnd = false
bool deleteWord(TrieNode* node, string word, int depth = 0) {
if (!node) return false;
// If we reached the end of the word
if (depth == word.size()) {
if (node->isEnd) node->isEnd = false;
// If no children, this node can be deleted
for (int i = 0; i < 26; i++)
if (node->children[i] != nullptr) return false;
return true;
}
int idx = word[depth] - 'a';
if (deleteWord(node->children[idx], word, depth + 1)) {
delete node->children[idx];
node->children[idx] = nullptr;
// If current node is not end of another word and has no other children
if (!node->isEnd) {
for (int i = 0; i < 26; i++)
if (node->children[i] != nullptr) return false;
return true;
}
}
return false;
}