Commit 0fad9f1: Merge pull request #36 from paxtonfitzpatrick/main ("catching up on previous days' problems"; parents ac59ffc + dfb56b1)

4 files changed: +301 −0 lines changed

problems/1334/paxtonfitzpatrick.md

# [Problem 1334: Find the City With the Smallest Number of Neighbors at a Threshold Distance](https://leetcode.com/problems/find-the-city-with-the-smallest-number-of-neighbors-at-a-threshold-distance/description/?envType=daily-question)

## Initial thoughts (stream-of-consciousness)
- I feel like this is one of those problems that's supposed to cue you to some particular algorithm that I'm just not familiar with. I'll probably end up googling for this eventually, but I wanna take a stab at a solution first.
- okay, so to answer the problem I need to find the number of other nodes within `distanceThreshold` of each node. My first thought is that I could do a DFS (BFS?) from each node where I stop going down a particular path when I reach the threshold distance, but that seems like it'd be slow...
- I could reduce the overall number of edges upfront by dropping any whose weights are greater than the threshold distance, because I know they'll never contribute to a viable path anyway. This would require an additional $O(n)$ pass to filter the initial edge list, but if `distanceThreshold` is low, it could easily pay off. There are even a few instances of this in the examples, so I feel like it's something we're meant to notice.
- actually, I'll need to iterate through the list of edges at least once to build the graph anyway, so I could just filter out the edges that are too long as I go. So this wouldn't add a whole extra $O(n)$ pass and is definitely worth doing, I think.
- hmmm... another scenario potentially worth accounting for: if a node has no edges to any other nodes, then I know 0 is its smallest number of neighbors. I'm not even sure whether they'd include a case like this, but it feels conspicuous that we're given `n` in addition to the `edges` list -- if all `n` nodes necessarily appeared in `edges`, `n` would be redundant.
- let's switch over to a different part of the problem -- I'll need some way of representing the graph in order to traverse it. I think this format will likely follow from the traversal method I come up with, but my initial thought is to create a dict where keys are the IDs of the nodes and values are lists of (neighbor, edge weight) tuples.
- maybe the DFS approach would be manageable with some sort of caching or memoization? E.g., I could do a normal DFS from the first node I search from (up to `distanceThreshold`), and then store the distances from that node to all nodes within `distanceThreshold` of it. Then whenever I DFS from another node and encounter that one, I can check that record instead of going further down that path.
- actually, I'm not sure this would end up being faster... I'd end up checking more nodes than I would have if I'd just done a normal DFS, because some nodes within `distanceThreshold` of the node whose record I check won't be within `distanceThreshold` of the node I'm searching from. It might've helped if this were a binary tree, but since it's a graph, I'll also end up checking lots of nodes both via DFS and via other nodes' records.
- then again, if some node is within `distanceThreshold` of the node I'm currently DFSing from both via the node whose record I'm checking and via some other route, I'd have ended up checking it twice anyway via 2 DFSes instead of 1 DFS + 1 record check... so maybe the savings from being able to stop the DFS along a certain path when I hit a node with record data would outweigh the extra checks I'd do because of it? I'm not sure...
- maybe there's a more efficient way to take advantage of work already done? This idea of considering nodes as "waypoints" to compare different paths seems promising -- i.e., "is the path between node `a` and node `b` via node `c` shorter than the path between them via node `d`/the shortest path between them I've found so far?" But to get to a point where I could check that, I'd have to already know the distance between node `d` and each other node... which is the problem I'm trying to solve in the first place. So I'm not sure how to get there.
- what if I represented the minimum distance between each pair of nodes in a 2D array (the upper and lower triangles would be duplicates... maybe I can optimize that away?)? Then for each node `i` in the graph, for each of its neighbors `j`, for each node `k` in the graph, I could check whether the path from `i` to `k` via `j` is shorter than `distances_matrix[i][k]`, and update it if so.
- hmmm, though I'm not sure whether this would work for graphs with nodes separated by > 2 degrees... that doesn't show up in any of the example graphs.
- okay, I ended up googling around and it turns out this is basically the "Floyd-Warshall algorithm"! I just needed to make `j` range over all nodes in the graph instead of just the neighbors of `i`. I'll try implementing that now... though since the algorithm has $O(n^3)$ time complexity, I'm still not sure whether it'll actually be more efficient than the BFS/DFS approach...
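As a quick sketch of the adjacency-list representation mentioned above, with the too-long-edge filter folded in (a hypothetical `build_graph` helper; not part of the final solution, which ended up not needing an adjacency list at all):

```python
from collections import defaultdict

def build_graph(n, edges, distance_threshold):
    # node ID -> list of (neighbor, edge weight) tuples, skipping any
    # edge longer than the threshold since it can't be on a viable path
    graph = defaultdict(list)
    for a, b, weight in edges:
        if weight <= distance_threshold:
            graph[a].append((b, weight))
            graph[b].append((a, weight))
    return graph
```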
## Refining the problem, round 2 thoughts

Given the way the Floyd-Warshall algorithm works, I don't think it'll be worth removing edges longer than `distanceThreshold` from the initial `edges` list, because it wouldn't save us any checks during the main processing of the pairwise distances. However, since the algorithm takes $O(n^3)$ time, I *do* think it'll be worth checking upfront whether any nodes have no edges to any other nodes, since that check takes comparatively little time.
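To convince myself the waypoint-relaxation idea handles nodes more than 2 hops apart, here's a standalone sketch run on example 1's graph (a hypothetical `all_pairs_shortest` helper, separate from the solution below):

```python
INF = float('inf')

def all_pairs_shortest(n, edges):
    # Floyd-Warshall: dist[i][j] ends up as the min distance from i to j
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for a, b, w in edges:
        dist[a][b] = dist[b][a] = w
    for k in range(n):          # waypoint
        for i in range(n):      # from
            for j in range(n):  # to
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

dist = all_pairs_shortest(4, [[0, 1, 3], [1, 2, 1], [1, 3, 4], [2, 3, 1]])
# dist[1][3] relaxes to 2 (1 -> 2 -> 3), beating the direct edge of weight 4
```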
## Attempted solution(s)

```python
class Solution:
    def findTheCity(self, n: int, edges: List[List[int]], distanceThreshold: int) -> int:
        # initialize min distances matrix
        min_dists = [[float('inf')] * n for _ in range(n)]
        # set to keep track of nodes with no edges
        no_edges = set(range(n))
        # add weights from edges list, remove nodes with edges from set
        for from_node, to_node, weight in edges:
            min_dists[from_node][to_node] = weight
            min_dists[to_node][from_node] = weight
            no_edges.discard(from_node)
            no_edges.discard(to_node)
        # if any nodes have no edges, return the one with the greatest ID
        if no_edges:
            return max(no_edges)
        # # set diagonal to 0 -- actually, not needed since we skip the diagonal
        # # in the main loop anyway
        # for i in range(n):
        #     min_dists[i][i] = 0
        # run Floyd-Warshall
        for via_node in range(n):
            for from_node in range(n):
                for to_node in range(n):
                    if from_node == to_node:
                        continue
                    dist_via_intermediate = min_dists[from_node][via_node] + min_dists[via_node][to_node]
                    if dist_via_intermediate < min_dists[from_node][to_node]:
                        min_dists[from_node][to_node] = dist_via_intermediate
        # find highest-numbered node with fewest reachable nodes within distanceThreshold
        min_reachable = n
        for node_id, dists in enumerate(min_dists):
            reachable = sum(1 for dist in dists if dist <= distanceThreshold)
            if reachable <= min_reachable:
                min_reachable = reachable
                min_reachable_node = node_id
        return min_reachable_node
```

![](https://github.com/user-attachments/assets/f4fb89c1-536b-454e-a7e7-a89b8b9c4e20)

problems/2045/paxtonfitzpatrick.md

# [Problem 2045: Second Minimum Time to Reach Destination](https://leetcode.com/problems/second-minimum-time-to-reach-destination/description/?envType=daily-question)

## Initial thoughts (stream-of-consciousness)
- okay, this one looks tricky. One initial thought I have is that a path that involves revisiting some node will be the second shortest path only if there aren't two paths that *don't* involve revisiting a node. So I think I can ignore revisits outside of those specific cases.
- It sounds like we'll need to use some algorithm that finds *all* paths between a source and destination node. I know Dijkstra's algorithm can be modified to terminate early upon encountering a target node, so maybe there's a way to modify it such that it terminates when it encounters that node a second time?
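One way that second-encounter idea could look, sketched on a plain unweighted graph (this ignores the problem's traffic-signal timing entirely; `two_shortest_hop_counts` is a hypothetical helper that just tracks the two smallest *distinct* path lengths per node, which works here because every edge takes the same time):

```python
import heapq
from collections import defaultdict

def two_shortest_hop_counts(n, edges, src, dst):
    # Dijkstra-style search recording the two smallest distinct
    # path lengths (in edge count) found for each node
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)
    best = [[float('inf'), float('inf')] for _ in range(n + 1)]  # nodes are 1-indexed
    best[src][0] = 0
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        for neighbor in graph[node]:
            nd = d + 1
            if nd < best[neighbor][0]:
                # new shortest; old shortest becomes second-shortest
                best[neighbor][1] = best[neighbor][0]
                best[neighbor][0] = nd
                heapq.heappush(heap, (nd, neighbor))
            elif best[neighbor][0] < nd < best[neighbor][1]:
                best[neighbor][1] = nd
                heapq.heappush(heap, (nd, neighbor))
    return best[dst]

print(two_shortest_hop_counts(5, [[1, 2], [1, 3], [1, 4], [3, 4], [4, 5]], 1, 5))  # [2, 3]
```

Each node gets pushed at most twice (once per improved slot), so this stays close to ordinary Dijkstra's cost.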
## Refining the problem, round 2 thoughts

### Other notes

## Attempted solution(s)
```python
class Solution:
    def secondMinimum(self, n: int, edges: List[List[int]], time: int, change: int) -> int:
        pass  # not attempted yet
```

problems/2976/paxtonfitzpatrick.md

# [Problem 2976: Minimum Cost to Convert String I](https://leetcode.com/problems/minimum-cost-to-convert-string-i/description/?envType=daily-question)

## Initial thoughts (stream-of-consciousness)
- okay, so this is going to be another shortest path problem. Letters are nodes, corresponding indices in `original` and `changed` are directed edges, and those same indices in `cost` give their weights.
- I was originally thinking I'd want to find all min distances between letters using something similar to yesterday's problem (the Floyd-Warshall algorithm), but I actually think it'll be more efficient to figure out which letters we need to convert first and then search just for those. So I think this is calling for Dijkstra's algorithm.
- so I'll loop through `source` and `target`, identify differences, and store source-letter + target-letter pairs.
  - if a source letter isn't in `original` or a target letter isn't in `changed`, I can immediately `return -1`
- actually, I think I'll store the source and target letters as a dict where keys are source letters and values are lists (probably actually sets?) of target letters for that source letter. That way, if I need to convert some "a" to a "b" and some other "a" to a "c", I can save time by combining those into a single Dijkstra run.
- then I'll run Dijkstra's algorithm starting from each source letter and terminate when I've found paths to all of its target letters.
- I'll write a helper function for Dijkstra's algorithm that takes a source letter and a set of target letters, and returns a list (or some sort of container) of minimum costs to convert that source letter to each of the target letters.
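A quick sketch of that source-letter to target-letters grouping (a hypothetical `diff_pairs` helper; as it turns out below, I went a different route in the end):

```python
from collections import defaultdict

def diff_pairs(source, target):
    # source letter -> set of target letters it needs to become,
    # so repeated conversions share a single search from that letter
    need = defaultdict(set)
    for s, t in zip(source, target):
        if s != t:
            need[s].add(t)
    return need
```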
---

- after thinking through how to implement Dijkstra here a bit, I wonder if Floyd-Warshall might actually be more efficient... Floyd-Warshall's runtime scales with the number of nodes, but since nodes here are letters, we know there will always be at most 26 of them, so that's essentially fixed. Meanwhile, Dijkstra's runtime scales with the number of nodes *and* edges, and since the constraints say there can be up to 2,000 edges, we're likely to have a large number of edges relative to the number of nodes. That also means we're much more likely to duplicate operations across different runs of Dijkstra than we would be if the graph were large and sparse. So I think I'll actually try Floyd-Warshall first.
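The fixed-size intuition is easy to check with quick arithmetic (illustrative only):

```python
# Floyd-Warshall over the full alphabet does a constant amount of work,
# no matter how long original/changed/cost are
nodes = 26
relaxation_checks = nodes ** 3
print(relaxation_checks)  # 17576
```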
## Refining the problem, round 2 thoughts
- we could reduce the size of the distance matrix for the Floyd-Warshall algorithm by including only the letters in `original` and `changed` instead of all 26. But I doubt this would be worth it on average, since it'd only sometimes reduce the number of nodes in the graph and would always incur the overhead costs of converting `original` and `changed` to sets, looping over letters and converting them to indices instead of looping over indices directly, etc.
- speaking of which, I'll still have to loop over letters and convert them to indices in order to extract the conversion costs for mismatched letters, and I can think of two ways to do this:
  - store a letter-to-index mapping in a `dict`, i.e. `{let: i for i, let in enumerate('abcdefghijklmnopqrstuvwxyz')}`, and index it with each letter
  - use `ord(letter)` to get the letter's ASCII value and subtract 97 (the ASCII value of "a") to get its index in the alphabet

  Both operations take constant time, but constructing the `dict` would use a little additional memory, so I think I'll go with the latter.
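The two index mappings agree, e.g. (a tiny illustrative check; `string.ascii_lowercase` is just a stand-in for typing out the alphabet):

```python
import string

# sanity check: the dict mapping and the ord() arithmetic give the same index
letter_ix = {let: i for i, let in enumerate(string.ascii_lowercase)}
for let in "akz":
    assert letter_ix[let] == ord(let) - 97
print(letter_ix["k"])  # 10
```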
- hmmm actually, if I can just use a dict as the letter/index mapping, that might make reducing the size of the distance matrix worth it. Maybe I'll try that if my first attempt is slow.
- hmmm, the problem notes that "*there may exist indices `i`, `j` such that `original[j] == original[i]` and `changed[j] == changed[i]`*". But it's not totally clear to me whether they're (A) simply saying that nodes may appear in both the `original` and `changed` lists multiple times because they can have multiple edges, or (B) saying that ***edges*** may be duplicated, potentially with different `cost` values -- i.e., `(original[j], changed[j]) == (original[i], changed[i])` but `cost[j] != cost[i]`. My guess is that it's the latter, because the former seems like a trivial point to make note of, so I'll want to account for this when I initialize the distance matrix.
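Accounting for interpretation (B) just means keeping the cheapest cost whenever an `(original, changed)` pair repeats, e.g. (toy data, not from the problem):

```python
# duplicated edge ('a', 'b') appears with costs 5 and 2; keep the min
best = {}
for o, ch, c in zip(["a", "a", "b"], ["b", "b", "c"], [5, 2, 7]):
    if (o, ch) not in best or c < best[(o, ch)]:
        best[(o, ch)] = c
print(best)  # {('a', 'b'): 2, ('b', 'c'): 7}
```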
## Attempted solution(s)

```python
class Solution:
    def minimumCost(self, source: str, target: str, original: List[str], changed: List[str], cost: List[int]) -> int:
        # set up min distance/cost matrix
        INF = float('inf')
        min_costs = [[INF] * 26 for _ in range(26)]
        for orig_let, changed_let, c in zip(original, changed, cost):
            orig_ix, changed_ix = ord(orig_let) - 97, ord(changed_let) - 97
            if c < min_costs[orig_ix][changed_ix]:
                min_costs[orig_ix][changed_ix] = c
        # run Floyd-Warshall
        for via_ix in range(26):
            for from_ix in range(26):
                for to_ix in range(26):
                    if min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix] < min_costs[from_ix][to_ix]:
                        min_costs[from_ix][to_ix] = min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix]
        # compute total cost to convert source to target
        total_cost = 0
        for src_let, tgt_let in zip(source, target):
            if src_let != tgt_let:
                src_ix, tgt_ix = ord(src_let) - 97, ord(tgt_let) - 97
                if min_costs[src_ix][tgt_ix] == INF:
                    return -1
                total_cost += min_costs[src_ix][tgt_ix]
        return total_cost
```

![](https://github.com/user-attachments/assets/2df1bdf7-8f66-4d28-90f8-12998425b3ba)
Not bad. But I'm curious whether creating a graph from only the letters in `original` and `changed` would be faster. It's a quick edit, so I'll try it. The biggest change will be an additional `return -1` condition in the last loop to handle letters in `source` and `target` that can't be mapped to/from anything.
```python
class Solution:
    def minimumCost(self, source: str, target: str, original: List[str], changed: List[str], cost: List[int]) -> int:
        # set up min distance/cost matrix
        INF = float('inf')
        letters = set(original) | set(changed)
        letters_ixs = {let: i for i, let in enumerate(letters)}
        len_letters = len(letters)
        min_costs = [[INF] * len_letters for _ in range(len_letters)]
        for orig_let, changed_let, c in zip(original, changed, cost):
            if c < min_costs[letters_ixs[orig_let]][letters_ixs[changed_let]]:
                min_costs[letters_ixs[orig_let]][letters_ixs[changed_let]] = c
        # run Floyd-Warshall
        for via_ix in range(len_letters):
            for from_ix in range(len_letters):
                for to_ix in range(len_letters):
                    if min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix] < min_costs[from_ix][to_ix]:
                        min_costs[from_ix][to_ix] = min_costs[from_ix][via_ix] + min_costs[via_ix][to_ix]
        # compute total cost to convert source to target
        total_cost = 0
        try:
            for src_let, tgt_let in zip(source, target):
                if src_let != tgt_let:
                    if (change_cost := min_costs[letters_ixs[src_let]][letters_ixs[tgt_let]]) == INF:
                        return -1
                    total_cost += change_cost
        except KeyError:
            return -1
        return total_cost
```

![](https://github.com/user-attachments/assets/263ad81c-900d-40d1-8602-ee5012e4b47e)

Wow, that made a much bigger difference than I expected!
