|
1 | 1 | # Trie
|
2 | 2 |
|
3 |
| -##What is a Trie? |
4 |
| -A trie (also known as a prefix tree, or radix tree in some other (but different) implementations) is a special type of tree used to store associative data structures where the key item is normally of type String. Each node in the trie is typically not associated with a value containing strictly itself, but more so is linked to some common prefix that precedes it in levels above it. Oftentimes, true key-value pairs are associated with the leaves of the trie, but they are not limited to this. |
| 3 | +## What is a Trie? |
5 | 4 |
|
6 |
| -##Why a Trie? |
7 |
| -Tries are very useful simply for the fact that it has some advantages over other data structures, like the binary tree or a hash map. These advantages include: |
8 |
| -* Looking up keys is typically faster in the worst case when compared to other data structures. |
9 |
| -* Unlike a hash map, a trie need not worry about key collisions |
10 |
| -* No need for hasing, as each key will have a unique path in the trie |
11 |
| -* Tries, by implementation, can be by default alphabetically ordered. |
| 5 | +A `Trie` (also known as a prefix tree, or a radix tree in some other implementations) is a special type of tree used to store associative data structures. A `Trie` for a dictionary might look like this: |
12 | 6 |
|
| 7 | + |
13 | 8 |
|
14 |
| -##Common Algorithms |
| 9 | +Storing the English language is a primary use case for a `Trie`. Each node in the `Trie` would represent a single character of a word. A series of nodes then makes up a word. |
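
The node type itself is not shown in this README, so here is a minimal sketch of what one might look like. The class name `TrieNode` is an assumption; the property names (`value`, `parent`, `children`, `isTerminating`) and `add(child:)` simply mirror what the snippets in this document use, and the `weak` parent reference is a design choice of this sketch, not something the source states.

```swift
// Hypothetical node type for illustration; not taken from the original source.
final class TrieNode {
    var value: Character?                      // nil for the root node
    weak var parent: TrieNode?                 // weak, to avoid a retain cycle with `children`
    var children: [Character: TrieNode] = [:]  // one child per following character
    var isTerminating = false                  // true when a word ends at this node

    init(value: Character? = nil, parent: TrieNode? = nil) {
        self.value = value
        self.parent = parent
    }

    // Adds a child node for `child` unless one already exists.
    func add(child: Character) {
        guard children[child] == nil else { return }
        children[child] = TrieNode(value: child, parent: self)
    }
}

// The word "app" stored as a chain of three nodes below the root.
let root = TrieNode()
root.add(child: "a")
root.children["a"]!.add(child: "p")
root.children["a"]!.children["p"]!.add(child: "p")
root.children["a"]!.children["p"]!.children["p"]!.isTerminating = true
```

Only the last node of the chain is marked as terminating; the intermediate nodes exist purely as shared prefixes.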
15 | 10 |
|
16 |
| -###Find (or any general lookup function) |
17 |
| -Tries make looking up keys a trivial task, as all one has to do is walk over the nodes until we either hit a null reference or we find the key in question. |
| 11 | +## Why a Trie? |
18 | 12 |
|
19 |
| -The algorithm would be as follows: |
20 |
| -``` |
21 |
| - let node be the root of the trie |
22 |
| - |
23 |
| - for each character in the key |
24 |
| - if the child of node with value character is null |
25 |
| - return false (key doesn't exist in trie) |
26 |
| - else |
27 |
| - node = child of node with value character (move to the next node) |
28 |
| - return true (key exists in trie and was found |
29 |
| -``` |
| 13 | +Tries are very useful for certain situations. Here are some of the advantages: |
30 | 14 |
|
31 |
| -And in swift: |
32 |
| -```swift |
33 |
| -func find(key: String) -> (node: Node?, found: Bool) { |
34 |
| - var currentNode = self.root |
35 |
| - |
36 |
| - for c in key.characters { |
37 |
| - if currentNode.children[String(c)] == nil { |
38 |
| - return(nil, false) |
39 |
| - } |
40 |
| - currentNode = currentNode.children[String(c)]! |
41 |
| - } |
| 15 | +* Looking up values typically has a better worst-case time complexity than in other data structures. |
| 16 | +* Unlike a hash map, a `Trie` does not need to worry about key collisions. |
| 17 | +* No hashing is needed, since each key has a unique path in the `Trie`. |
| 18 | +* `Trie` structures can be alphabetically ordered by default. |
42 | 19 |
|
43 |
| - return(currentNode, currentNode.isValidWord()) |
44 |
| - } |
45 |
| -``` |
| 20 | +## Common Algorithms |
46 | 21 |
|
47 |
| -###Insertion |
48 |
| -Insertion is also a trivial task with a Trie, as all one needs to do is walk over the nodes until we either halt on a node that we must mark as a key, or we reach a point where we need to add extra nodes to represent it. |
| 22 | +### Contains (or any general lookup method) |
49 | 23 |
|
50 |
| -Let's walk through the algorithm: |
| 24 | +`Trie` structures are great for lookup operations. For `Trie` structures that model the English language, finding a particular word is a matter of a few pointer traversals: |
51 | 25 |
|
52 |
| -``` |
53 |
| - let S be the root node of our tree |
54 |
| - let word be the input key |
55 |
| - let length be the length of the key |
56 |
| - |
| 26 | +```swift |
| 27 | +func contains(word: String) -> Bool { |
| 28 | + guard !word.isEmpty else { return false } |
| 29 | + |
| 30 | + // 1 |
| 31 | + var currentNode = root |
57 | 32 |
|
58 |
| - find(word) |
59 |
| - if the word was found |
60 |
| - return false |
61 |
| - else |
62 |
| - |
63 |
| - for each character in word |
64 |
| - if child node with value character does not exist |
65 |
| - break |
66 |
| - else |
67 |
| - node = child node with value character |
68 |
| - decrement length |
69 |
| - |
70 |
| - if length != 0 |
71 |
| - let suffix be the remaining characters in the key defined by the shortened length |
72 |
| - |
73 |
| - for each character in suffix |
74 |
| - create a new node with value character and let it be the child of node |
75 |
| - node = newly created child now |
76 |
| - mark node as a valid key |
77 |
| - else |
78 |
| - mark node as valid key |
| 33 | + // 2 |
| 34 | + var characters = Array(word.lowercased().characters) |
| 35 | + var currentIndex = 0 |
| 36 | + |
| 37 | + // 3 |
| 38 | + while currentIndex < characters.count, |
| 39 | + let child = currentNode.children[characters[currentIndex]] { |
| 40 | + |
| 41 | + currentNode = child |
| 42 | + currentIndex += 1 |
| 43 | + } |
| 44 | + |
| 45 | + // 4 |
| 46 | + if currentIndex == characters.count && currentNode.isTerminating { |
| 47 | + return true |
| 48 | + } else { |
| 49 | + return false |
| 50 | + } |
| 51 | +} |
79 | 52 | ```
|
80 | 53 |
|
81 |
| -And the corresponding swift code: |
| 54 | +The `contains` method is fairly straightforward: |
82 | 55 |
|
83 |
| -```swift |
84 |
| - func insert(w: String) -> (word: String, inserted: Bool) { |
85 |
| - |
86 |
| - let word = w.lowercaseString |
87 |
| - var currentNode = self.root |
88 |
| - var length = word.characters.count |
| 56 | +1. Create a reference to the `root`. This reference will allow you to walk down a chain of nodes. |
| 57 | +2. Keep track of the characters of the word you're trying to match. |
| 58 | +3. Walk the pointer down the nodes. |
| 59 | +4. `isTerminating` is a boolean flag for whether or not this node is the end of a word. If this `if` condition is satisfied, the word is in the `Trie`. |
89 | 60 |
|
90 |
| - if self.contains(word) { |
91 |
| - return (w, false) |
92 |
| - } |
| 61 | +### Insertion |
93 | 62 |
|
94 |
| - var index = 0 |
95 |
| - var c = Array(word.characters)[index] |
| 63 | +Insertion into a `Trie` requires you to walk over the nodes until you either halt on a node that must be marked as `terminating`, or reach a point where you need to add extra nodes. |
96 | 64 |
|
97 |
| - while let child = currentNode.children[String(c)] { |
98 |
| - currentNode = child |
99 |
| - length -= 1 |
100 |
| - index += 1 |
| 65 | +```swift |
| 66 | +func insert(word: String) { |
| 67 | + guard !word.isEmpty else { return } |
101 | 68 |
|
102 |
| - if(length == 0) { |
103 |
| - currentNode.isWord() |
104 |
| - wordList.append(w) |
105 |
| - wordCount += 1 |
106 |
| - return (w, true) |
107 |
| - } |
| 69 | + // 1 |
| 70 | + var currentNode = root |
| 71 | + |
| 72 | + // 2 |
| 73 | + var characters = Array(word.lowercased().characters) |
| 74 | + var currentIndex = 0 |
| 75 | + |
| 76 | + // 3 |
| 77 | + while currentIndex < characters.count { |
| 78 | + let character = characters[currentIndex] |
108 | 79 |
|
109 |
| - c = Array(word.characters)[index] |
| 80 | + // 4 |
| 81 | + if let child = currentNode.children[character] { |
| 82 | + currentNode = child |
| 83 | + } else { |
| 84 | + currentNode.add(child: character) |
| 85 | + currentNode = currentNode.children[character]! |
110 | 86 | }
|
| 87 | + |
| 88 | + currentIndex += 1 |
111 | 89 |
|
112 |
| - let remainingChars = String(word.characters.suffix(length)) |
113 |
| - for c in remainingChars.characters { |
114 |
| - currentNode.children[String(c)] = Node(c: String(c), p: currentNode) |
115 |
| - currentNode = currentNode.children[String(c)]! |
| 90 | + // 5 |
| 91 | + if currentIndex == characters.count { |
| 92 | + currentNode.isTerminating = true |
116 | 93 | }
|
117 |
| - |
118 |
| - currentNode.isWord() |
119 |
| - wordList.append(w) |
120 |
| - wordCount += 1 |
121 |
| - return (w, true) |
122 | 94 | }
|
123 |
| - |
| 95 | +} |
124 | 96 | ```
|
125 | 97 |
|
126 |
| -###Removal |
127 |
| -Removing keys from the trie is a little more tricky, as there a few more cases that we have to take into account the fact that keys may exist that are actually sub-strings of other valid keys. That being said, it isn't as simple a process to just delete the nodes for a specific key, as we could be deleting references/nodes necessary for already exisitng keys! |
| 98 | +1. Once again, you create a reference to the root node. You'll move this reference down a chain of nodes. |
| 99 | +2. Keep track of the word you want to insert. |
| 100 | +3. Begin walking through your word letter by letter. |
| 101 | +4. Sometimes, the required node to insert already exists. That is the case for two words inside the `Trie` that share letters (e.g. "Apple", "App"). If a letter already exists, you'll reuse it, and simply traverse deeper down the chain. Otherwise, you'll create a new node representing the letter. |
| 102 | +5. Once you get to the end, you set `isTerminating` to `true` to mark that specific node as the end of a word. |
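
To make the node sharing in step 4 concrete, here is a small self-contained sketch. It assumes a stripped-down `TrieNode` and `Trie` (both hypothetical stand-ins for the types this README relies on) and iterates over the string directly rather than over an index. The helper `nodeCount` is introduced only for this illustration.

```swift
// Hypothetical, simplified types for illustration only.
final class TrieNode {
    var children: [Character: TrieNode] = [:]
    var isTerminating = false
}

final class Trie {
    let root = TrieNode()

    func insert(word: String) {
        guard !word.isEmpty else { return }
        var currentNode = root
        for character in word.lowercased() {
            if let child = currentNode.children[character] {
                currentNode = child                 // letter already present: reuse it
            } else {
                let child = TrieNode()              // otherwise create a new node
                currentNode.children[character] = child
                currentNode = child
            }
        }
        currentNode.isTerminating = true            // mark the end of the word
    }

    func contains(word: String) -> Bool {
        guard !word.isEmpty else { return false }
        var currentNode = root
        for character in word.lowercased() {
            guard let child = currentNode.children[character] else { return false }
            currentNode = child
        }
        return currentNode.isTerminating
    }
}

// Counts a node plus all of its descendants.
func nodeCount(_ node: TrieNode) -> Int {
    return 1 + node.children.values.reduce(0) { $0 + nodeCount($1) }
}

let trie = Trie()
trie.insert(word: "Apple")
trie.insert(word: "App")
```

After both insertions the trie holds the root plus five letter nodes: "App" adds no new nodes and only flips `isTerminating` on the second "p". A prefix such as "appl" walks an existing path but is not reported as a word, because its last node is not terminating.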
128 | 103 |
|
129 |
| -The algorithm would be as follows: |
130 |
| - |
131 |
| -``` |
132 |
| - |
133 |
| - let word be the key to remove |
134 |
| - let node be the root of the trie |
135 |
| - |
136 |
| - find(word) |
137 |
| - if word was not found |
138 |
| - return false |
139 |
| - else |
140 |
| - |
141 |
| - for each character in word |
142 |
| - node = child node with value character |
143 |
| - |
144 |
| - if node has more than just 1 child node |
145 |
| - Mark node as an invalid key, since removing it would remove nodes still in use |
146 |
| - else |
147 |
| - while node has no valid children and node is not the root node |
148 |
| - let character = node's value |
149 |
| - node = the parent of node |
150 |
| - delete node's child node with value character |
151 |
| - return true |
152 |
| -``` |
| 104 | +### Removal |
153 | 105 |
|
| 106 | +Removing keys from the trie is a little tricky, as there are a few more cases you'll need to take into account. Nodes in a `Trie` may be shared between different words. Consider the two words "Apple" and "App". Inside a `Trie`, the chain of nodes representing "App" is shared with "Apple". |
154 | 107 |
|
155 |
| - |
156 |
| -and the corresponding swift code: |
| 108 | +If you'd like to remove "Apple", you'll need to take care to leave the "App" chain intact. |
157 | 109 |
|
158 | 110 | ```swift
|
159 |
| - func remove(w: String) -> (word: String, removed: Bool){ |
160 |
| - let word = w.lowercaseString |
161 |
| - |
162 |
| - if(!self.contains(w)) { |
163 |
| - return (w, false) |
164 |
| - } |
165 |
| - var currentNode = self.root |
| 111 | +func remove(word: String) { |
| 112 | + guard !word.isEmpty else { return } |
166 | 113 |
|
167 |
| - for c in word.characters { |
168 |
| - currentNode = currentNode.getChildAt(String(c)) |
169 |
| - } |
170 |
| - |
171 |
| - if currentNode.numChildren() > 0 { |
172 |
| - currentNode.isNotWord() |
173 |
| - } else { |
174 |
| - var character = currentNode.char() |
175 |
| - while(currentNode.numChildren() == 0 && !currentNode.isRoot()) { |
176 |
| - currentNode = currentNode.getParent() |
177 |
| - currentNode.children[character]!.setParent(nil) |
178 |
| - currentNode.children[character]!.update(nil) |
179 |
| - currentNode.children[character] = nil |
180 |
| - character = currentNode.char() |
181 |
| - } |
182 |
| - } |
183 |
| - |
184 |
| - wordCount -= 1 |
185 |
| - |
186 |
| - var index = 0 |
187 |
| - for item in wordList{ |
188 |
| - if item == w { |
189 |
| - wordList.removeAtIndex(index) |
190 |
| - } |
191 |
| - index += 1 |
| 114 | + // 1 |
| 115 | + var currentNode = root |
| 116 | + |
| 117 | + // 2 |
| 118 | + var characters = Array(word.lowercased().characters) |
| 119 | + var currentIndex = 0 |
| 120 | + |
| 121 | + // 3 |
| 122 | + while currentIndex < characters.count { |
| 123 | + let character = characters[currentIndex] |
| 124 | + guard let child = currentNode.children[character] else { return } |
| 125 | + currentNode = child |
| 126 | + currentIndex += 1 |
| 127 | + } |
| 128 | + |
| 129 | + // 4 |
| 130 | + if currentNode.children.count > 0 { |
| 131 | + currentNode.isTerminating = false |
| 132 | + } else { |
| 133 | + var character = currentNode.value |
| 134 | + while currentNode.children.count == 0, let parent = currentNode.parent, !parent.isTerminating { |
| 135 | + currentNode = parent |
| 136 | + currentNode.children[character!] = nil |
| 137 | + character = currentNode.value |
192 | 138 | }
|
193 |
| - |
194 |
| - return (w, true) |
195 | 139 | }
|
196 |
| - |
| 140 | +} |
197 | 141 | ```
|
198 | 142 |
|
| 143 | +1. Once again, you create a reference to the root node. |
| 144 | +2. Keep track of the word you want to remove. |
| 145 | +3. Attempt to walk to the terminating node of the word. The `guard` statement will return if it can't find one of the letters; it's possible to call `remove` on a non-existent entry. |
| 146 | +4. If you reach the node representing the last letter of the word you want to remove, you'll have 2 cases to deal with. Either it's a leaf node, or it has more children. If it has more children, it means the node is used for other words. In that case, you'll just set `isTerminating` to `false`. In the other case, you'll delete the nodes. |
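
The "Apple" / "App" case can be sketched end to end as follows. This is an illustrative variant under stated assumptions, not the code from this README: `TrieNode` and `Trie` are hypothetical minimal types, and the pruning loop here checks the current node (rather than its parent) so that it deletes every unused node up to the nearest node that still ends a word or still has other children.

```swift
// Hypothetical, simplified types for illustration only.
final class TrieNode {
    var value: Character?
    weak var parent: TrieNode?
    var children: [Character: TrieNode] = [:]
    var isTerminating = false

    init(value: Character? = nil, parent: TrieNode? = nil) {
        self.value = value
        self.parent = parent
    }
}

final class Trie {
    let root = TrieNode()

    func insert(word: String) {
        guard !word.isEmpty else { return }
        var currentNode = root
        for character in word.lowercased() {
            if let child = currentNode.children[character] {
                currentNode = child
            } else {
                let child = TrieNode(value: character, parent: currentNode)
                currentNode.children[character] = child
                currentNode = child
            }
        }
        currentNode.isTerminating = true
    }

    func contains(word: String) -> Bool {
        guard !word.isEmpty else { return false }
        var currentNode = root
        for character in word.lowercased() {
            guard let child = currentNode.children[character] else { return false }
            currentNode = child
        }
        return currentNode.isTerminating
    }

    func remove(word: String) {
        guard !word.isEmpty else { return }
        var currentNode = root
        // Walk to the last letter; bail out if the path does not exist.
        for character in word.lowercased() {
            guard let child = currentNode.children[character] else { return }
            currentNode = child
        }
        guard currentNode.isTerminating else { return }  // the word was never stored
        currentNode.isTerminating = false

        // Prune upward while the node is an unused leaf. The root has no parent,
        // so the loop stops there at the latest.
        while currentNode.children.isEmpty, !currentNode.isTerminating,
              let parent = currentNode.parent, let value = currentNode.value {
            parent.children[value] = nil
            currentNode = parent
        }
    }
}

let trie = Trie()
trie.insert(word: "Apple")
trie.insert(word: "App")
trie.remove(word: "Apple")
```

Removing "Apple" deletes only the "l" and "e" nodes; pruning stops at the second "p" because "App" still terminates there. Removing "App" afterwards would prune the rest of the chain back to the root.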
199 | 147 |
|
200 |
| -###Running Times |
| 148 | +### Time Complexity |
201 | 149 |
|
202 |
| -Let n be the length of some key in the trie |
| 150 | +Let n be the length of some value in the `Trie`. |
203 | 151 |
|
204 |
| -* Find(...) : In the Worst case O(n) |
205 |
| -* Insert(...) : O(n) |
206 |
| -* Remove(...) : O(n) |
| 152 | +* `contains` - Worst case O(n) |
| 153 | +* `insert` - O(n) |
| 154 | +* `remove` - O(n) |
207 | 155 |
|
208 |
| -###Other Notable Operations |
| 156 | +### Other Notable Operations |
209 | 157 |
|
210 |
| -* Count: Returns the number of keys in the trie ( O(1) ) |
211 |
| -* getWords: Returns a list containing all keys in the trie ( *O(1) ) |
212 |
| -* isEmpty: Returns true f the trie is empty, false otherwise ( *O(1) ) |
213 |
| -* contains: Returns true if the trie has a given key, false otherwise ( O(n) ) |
214 |
| - |
215 |
| -`* denotes that running time may vary depending on implementation |
| 158 | +* `count`: Returns the number of keys in the `Trie` - O(1) |
| 159 | +* `words`: Returns a list containing all the keys in the `Trie` - O(1) |
| 160 | +* `isEmpty`: Returns `true` if the `Trie` is empty, `false` otherwise - O(1) |
216 | 161 |
|
217 | 162 | See also [Wikipedia entry for Trie](https://en.wikipedia.org/wiki/Trie).
|
218 | 163 |
|
219 |
| -*Written for the Swift Algorithm Club by Christian Encarnacion* |
220 |
| - |
| 164 | +*Written for the Swift Algorithm Club by Christian Encarnacion. Refactored by Kelvin Lau* |