Commit e2d8e39

Updated via commit to -Solution by beejjorgensen

1 parent c643b67 · commit e2d8e39

28 files changed: +1721 −1 lines

.gitignore

Lines changed: 2 additions & 0 deletions

@@ -0,0 +1,2 @@
+__pycache__
+.vscode

README.md

Lines changed: 81 additions & 1 deletion

@@ -1 +1,81 @@
-# cs-module-project-hash-tables

# Hash Tables

## Day 1

Task: Implement a basic hash table without collision resolution.

1. Implement a `HashTable` class and a `HashTableEntry` class.

2. Implement a good hashing function.

   We recommend either of:

   * DJB2
   * FNV-1 (64-bit)

3. Implement a `hash_index()` method that returns an index value for a key.

4. Implement the `put()`, `get()`, and `delete()` methods.

You can test this with:

```
python test_hashtable_no_collisions.py
```

The above test program is _unlikely_ to have collisions, but collisions
are certainly possible with various hashing functions. With the DJB2
(32-bit) and FNV-1 (64-bit) hashing functions, there are no collisions.
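As a rough sketch of the hashing steps above, DJB2 and a `hash_index()` built on it might look like this (the class layout and `capacity` attribute are illustrative assumptions, not part of the spec):

```python
class HashTable:
    """Minimal sketch: DJB2 hashing plus index derivation."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.storage = [None] * capacity

    def djb2(self, key):
        # DJB2: start from 5381, then hash = hash * 33 + byte for each
        # byte of the key, truncated to 32 bits
        hash_value = 5381
        for byte in key.encode():
            hash_value = (hash_value * 33 + byte) & 0xFFFFFFFF
        return hash_value

    def hash_index(self, key):
        # Map the (large) hash value into the table's slot range
        return self.djb2(key) % self.capacity
```

`put()`, `get()`, and `delete()` would then all start by calling `hash_index()` to find the slot for a key.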

## Day 2

Task: Implement linked-list chaining for collision resolution.

1. Modify the `put()`, `get()`, and `delete()` methods to handle collisions.

2. There is no step 2.

You can test this with:

```
python test_hashtable.py
```
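One way to sketch the collision handling for `put()` with chaining (the `HashTableEntry` fields and the free-function form are assumptions for illustration; in the real class this logic would live in `put()` after computing the slot with `hash_index()`):

```python
class HashTableEntry:
    """One node in a collision chain."""
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None


def put_chained(table, index, key, value):
    # Walk the chain at table[index]; overwrite if the key exists
    node = table[index]
    while node:
        if node.key == key:
            node.value = value
            return
        node = node.next
    # Key not found: prepend a new entry to the chain
    entry = HashTableEntry(key, value)
    entry.next = table[index]
    table[index] = entry
```

`get()` walks the same chain comparing keys; `delete()` additionally relinks the chain around the removed node.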

Task: Implement load factor measurements and automatic hashtable size
doubling.

1. Compute and maintain the load factor.

2. When the load factor increases above `0.7`, automatically rehash the
   table to double its previous size.

Add the `resize()` method.

You can test this with both of:

```
python test_hashtable.py
python test_hashtable_resize.py
```
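A sketch of the load-factor check and the rehash inside `resize()` (the `Entry` class and the use of Python's built-in `hash()` are stand-ins; the real `resize()` would reuse the table's own `hash_index()`):

```python
class Entry:
    # Stand-in for HashTableEntry: one node in a collision chain
    def __init__(self, key, value, next=None):
        self.key, self.value, self.next = key, value, next


def load_factor(count, capacity):
    # Number of stored items divided by number of slots
    return count / capacity


def rehash(old_storage, new_capacity):
    """Re-insert every chained entry into a slot list of new_capacity."""
    new_storage = [None] * new_capacity
    for head in old_storage:
        node = head
        while node:
            # Recompute each entry's slot for the new capacity and
            # prepend it to the chain there
            idx = hash(node.key) % new_capacity
            new_storage[idx] = Entry(node.key, node.value, new_storage[idx])
            node = node.next
    return new_storage
```

In `put()`, after inserting, you'd check whether `load_factor(...)` exceeds `0.7` and, if so, rehash into a table of double the capacity.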

Stretch: When the load factor decreases below `0.2`, automatically rehash
the table to half its previous size, down to a minimum of 8 slots.

## Day 3 and Day 4

Work on the hashtable applications directory (in any order you wish;
they're generally arranged from easier to harder, below).

For these, you can use either the built-in `dict` type or the hashtable
you built. (Some of these are easier with `dict`, since it's more
full-featured.)

* [Lookup Table](applications/lookup_table/)
* [Expensive Sequence](applications/expensive_seq/)
* [Word Count](applications/word_count/)
* [No Duplicates](applications/no_dups/)
* [Markov Chains](applications/markov/)
* [Histogram](applications/histo/)
* [Cracking Caesar Ciphers](applications/crack_caesar/)
* [Sum and Difference](applications/sumdiff/)

applications/crack_caesar/README.md

Lines changed: 101 additions & 0 deletions

@@ -0,0 +1,101 @@
# Cracking a Caesar Cipher

You're going to use _frequency analysis_ to crack a Caesar cipher and
recover the key and the plaintext.

## Caesar Ciphers

These are methods of encryption where you take the _plaintext_ (the
unencrypted text) and encrypt it by substituting one letter for another
to produce the _ciphertext_ (the encrypted text).

For example, we might have the following mapping (which is the _key_ for
unlocking this cipher, not to be confused with a hash table key):

```
A -> H   B -> Z   C -> Y   D -> W   E -> O
F -> R   G -> J   H -> D   I -> P   J -> T
K -> I   L -> G   M -> L   N -> C   O -> E
P -> X   Q -> K   R -> U   S -> N   T -> F
U -> A   V -> M   W -> B   X -> Q   Y -> V
Z -> S
```

So if you have plaintext like `HELLO, WORLD!`, use the above table:
`H` becomes `D`, `E` becomes `O`, and so on, producing the ciphertext
`DOGGE, BEUGW!`

To decode, just do the reverse: `D` becomes `H`, and so on.

But what if you eavesdrop on some ciphertext and don't know the key (the
mapping)? How can you decode it?
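The substitution described above can be checked with a short dict-based sketch (the function and variable names are just for illustration):

```python
# The example key from above: plaintext letter -> ciphertext letter
KEY = {
    'A': 'H', 'B': 'Z', 'C': 'Y', 'D': 'W', 'E': 'O', 'F': 'R', 'G': 'J',
    'H': 'D', 'I': 'P', 'J': 'T', 'K': 'I', 'L': 'G', 'M': 'L', 'N': 'C',
    'O': 'E', 'P': 'X', 'Q': 'K', 'R': 'U', 'S': 'N', 'T': 'F', 'U': 'A',
    'V': 'M', 'W': 'B', 'X': 'Q', 'Y': 'V', 'Z': 'S',
}

def encode(plaintext):
    # Substitute each letter via the key; pass non-letters through as-is
    return ''.join(KEY.get(c, c) for c in plaintext)

def decode(ciphertext):
    # Invert the mapping to undo the substitution
    reverse = {v: k for k, v in KEY.items()}
    return ''.join(reverse.get(c, c) for c in ciphertext)
```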

## Frequency Analysis

It turns out that letters occur in the English language with a known
frequency. The letter `A` makes up 8.46% of all letters, for example.

(Disclaimer: these are not the actual frequencies in general English
prose. They're contrived for this specific challenge so that you get a
decent result, but they're quite close to the real percentages.)

| Letter | Percentage |
|:------:|-----------:|
| E      |      11.53 |
| T      |       9.75 |
| A      |       8.46 |
| O      |       8.08 |
| H      |       7.71 |
| N      |       6.73 |
| R      |       6.29 |
| I      |       5.84 |
| S      |       5.56 |
| D      |       4.74 |
| L      |       3.92 |
| W      |       3.08 |
| U      |       2.59 |
| G      |       2.48 |
| F      |       2.42 |
| B      |       2.19 |
| M      |       2.18 |
| Y      |       2.02 |
| C      |       1.58 |
| P      |       1.08 |
| K      |       0.84 |
| V      |       0.59 |
| Q      |       0.17 |
| J      |       0.07 |
| X      |       0.07 |
| Z      |       0.03 |

In other words, ordered from most frequently used to least, the letters
are:

```
'E', 'T', 'A', 'O', 'H', 'N', 'R', 'I', 'S', 'D', 'L', 'W', 'U',
'G', 'F', 'B', 'M', 'Y', 'C', 'P', 'K', 'V', 'Q', 'J', 'X', 'Z'
```

`E` is the most frequent letter, `Z` is the least frequent, and `M` is
somewhere in the middle.

So if you have a large enough block of ciphertext, you can analyze the
frequency of letters in it. If `X` is the most frequent, then it's a
safe bet that the key includes this mapping:

```
E -> X
```
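A minimal sketch of that idea, assuming the frequency order above (the names are illustrative, and with short or atypical ciphertexts the alignment will get some letters wrong):

```python
from collections import Counter

# Letters ordered most to least frequent, from the table above
FREQ_ORDER = "ETAOHNRISDLWUGFBMYCPKVQJXZ"

def crack(ciphertext):
    """Guess the key by ranking ciphertext letters by frequency and
    aligning them with FREQ_ORDER, then decode."""
    counts = Counter(c for c in ciphertext if c.isalpha())
    # Ciphertext letters, most frequent first; letters that never
    # appear can't be ranked, so they're simply left unmapped
    by_freq = [letter for letter, _ in counts.most_common()]
    # Map each ciphertext letter to the plaintext letter holding the
    # same frequency rank; non-letters pass through unchanged
    reverse_key = dict(zip(by_freq, FREQ_ORDER))
    return ''.join(reverse_key.get(c, c) for c in ciphertext)
```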

## Challenge

Write a program that automatically finds the key for the ciphertext in
the file [`ciphertext.txt`](ciphertext.txt), then decodes it and shows
the plaintext.

(All non-letters should pass through the decoding as-is, i.e. spaces and
punctuation should be preserved. The input will not contain any
lowercase letters.)

No tests are provided for this one, but the result should be readable,
with at most a handful of incorrect letters.
