WIP: Optimize Maze Generation #856

jbdyn · 2025-03-10T09:04:25Z

Hey guys,

Analogous to #852, I tried using the first version of the optimized chamber-finding algorithm in the maze generation as well.

Script execution time dropped from several seconds to consistently less than half a second.
However, the profile shows that only 5% (about 25 ms) of that is actual maze generation; the rest is spent on imports:

absolute times

  _     ._   __/__   _ _  _  _ _/_   Recorded: 09:47:34  Samples:  233
 /_//_/// /_\ / //_// / //_'/ //     Duration: 0.480     CPU time: 1.285
/   _/                      v5.0.1

Program: scripts/pelita_createlayout.py

0.479 <module>  pelita_createlayout.py:1
├─ 0.299 <module>  ../__init__.py:1
│  └─ 0.297 <module>  ../game.py:1
│     ├─ 0.183 <module>  ../team.py:1
│     │  └─ 0.181 <module>  networkx/__init__.py:1
│     │        [18 frames hidden]  networkx, importlib
│     ├─ 0.074 <module>  ../viewer.py:1
│     │  ├─ 0.062 <module>  rich/console.py:1
│     │  │     [11 frames hidden]  rich, fractions
│     │  └─ 0.010 <module>  rich/progress.py:1
│     ├─ 0.030 <module>  ../network.py:1
│     │  └─ 0.028 <module>  zmq/__init__.py:1
│     │        [10 frames hidden]  zmq, importlib, enum
│     └─ 0.006 <module>  logging/__init__.py:1
├─ 0.146 <module>  ../maze_generator.py:1
│  └─ 0.146 <module>  numpy/__init__.py:1
│        [30 frames hidden]  numpy, typing
├─ 0.024 main  pelita_createlayout.py:65
│  └─ 0.022 get_new_maze  ../maze_generator.py:381
│     ├─ 0.012 remove_all_chambers  ../maze_generator.py:321
│     │  └─ 0.006 articulation_points  networkx/algorithms/components/biconnected.py:263
│     │     └─ 0.006 _biconnected_dfs  networkx/algorithms/components/biconnected.py:338
│     └─ 0.010 remove_all_dead_ends  ../maze_generator.py:256
│        └─ 0.008 walls_to_graph  ../maze_generator.py:193
└─ 0.008 compile  <built-in>

relative times

  _     ._   __/__   _ _  _  _ _/_   Recorded: 09:47:34  Samples:  233
 /_//_/// /_\ / //_// / //_'/ //     Duration: 0.480     CPU time: 1.285
/   _/                      v5.0.1

Program: scripts/pelita_createlayout.py

100.0% <module>  pelita_createlayout.py:1
├─ 62.4% <module>  ../__init__.py:1
│  └─ 62.0% <module>  ../game.py:1
│     ├─ 38.2% <module>  ../team.py:1
│     │  └─ 37.8% <module>  networkx/__init__.py:1
│     │        [18 frames hidden]  networkx, importlib
│     ├─ 15.4% <module>  ../viewer.py:1
│     │  ├─ 12.9% <module>  rich/console.py:1
│     │  │     [11 frames hidden]  rich, fractions
│     │  └─ 2.1% <module>  rich/progress.py:1
│     ├─ 6.3% <module>  ../network.py:1
│     │  └─ 5.8% <module>  zmq/__init__.py:1
│     │        [10 frames hidden]  zmq, importlib, enum
│     └─ 1.3% <module>  logging/__init__.py:1
├─ 30.5% <module>  ../maze_generator.py:1
│  └─ 30.5% <module>  numpy/__init__.py:1
│        [30 frames hidden]  numpy, typing
├─ 5.0% main  pelita_createlayout.py:65
│  └─ 4.6% get_new_maze  ../maze_generator.py:381
│     ├─ 2.5% remove_all_chambers  ../maze_generator.py:321
│     │  └─ 1.3% articulation_points  networkx/algorithms/components/biconnected.py:263
│     │     └─ 1.3% _biconnected_dfs  networkx/algorithms/components/biconnected.py:338
│     └─ 2.1% remove_all_dead_ends  ../maze_generator.py:256
│        └─ 1.7% walls_to_graph  ../maze_generator.py:193
└─ 1.7% compile  <built-in>

Would that be fast enough to generate mazes on-the-fly?

For now, I think this would not make the layout database obsolete as one still does not have direct control over the number of dead ends and chambers.
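
For readers unfamiliar with the articulation_points call in the profile above: an articulation point is a node whose removal disconnects the graph, which in a maze plausibly corresponds to the single tile that forms the entrance of a chamber. The following is a dependency-free sketch of that idea. It is neither the code in this PR nor the networkx implementation, and the adjacency-dict format and the example graph are assumptions made for illustration.

```python
def articulation_points(adj):
    """Return the set of cut vertices of a simple undirected graph.

    `adj` maps each node to an iterable of its neighbours (assumed format).
    Uses a recursive DFS, which is fine for maze-sized graphs only.
    """
    disc = {}   # discovery index of each visited node
    low = {}    # lowest discovery index reachable from the node's subtree
    cuts = set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        children = 0
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                # Back edge: the subtree can reach an earlier node.
                low[node] = min(low[node], disc[nxt])
            else:
                children += 1
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                # The child's subtree cannot bypass `node`.
                if parent is not None and low[nxt] >= disc[node]:
                    cuts.add(node)
        # A DFS root is a cut vertex only if it has several DFS children.
        if parent is None and children > 1:
            cuts.add(node)

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return cuts


# Toy graph: a loop a-b-c plus a corridor c-d-e hanging off it.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
entrances = articulation_points(graph)
```

In the toy graph, removing c or d cuts off the corridor, so those two nodes are reported.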

jbdyn · 2025-03-10T09:15:17Z

Related discussion about layout database design: #849

otizonaizit · 2025-03-10T09:17:31Z

that is quite impressive. Did you verify that the new algorithm and the old one generate exactly the same maze if started with the same random seed?

otizonaizit · 2025-03-10T09:19:59Z

Half a second is too much to be run at every game, but if we no longer need to remove dead ends and chambers, the whole thing would be even faster, no? And in that case we could relax our requirements. We could hard code that we want to "trap" a maximum of 33% of food pellets in chambers/dead-ends, and then do our best depending on how many "trapped" tiles we have available on the fly.

otizonaizit · 2025-03-10T09:21:25Z

that is quite impressive. Did you verify that the new algorithm and the old one generate exactly the same maze if started with the same random seed?

Given the failing test it seems to me that the change also changes the generated maze. As I said, if we relax our requirements we may not need to remove chambers/dead-ends, so the test failure would be irrelevant.

jbdyn · 2025-03-10T09:27:11Z

Did you verify that the new algorithm and the old one generate exactly the same maze if started with the same random seed?

No. For that I would first need to align pelita_createlayout.py and maze_generator.get_new_maze(): the former takes a seed kwarg, while the latter takes an rng object instead.
I have not looked into this yet.
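
One way such an alignment could look is a small helper that accepts either a seed or an rng object, so both entry points can be compared with the same input. This is only a sketch: as_rng and toy_maze are hypothetical names, and toy_maze is a stand-in that has nothing to do with the real get_new_maze().

```python
import random


def as_rng(seed_or_rng=None):
    """Return a random.Random for either a seed value or an existing rng."""
    if isinstance(seed_or_rng, random.Random):
        return seed_or_rng
    return random.Random(seed_or_rng)


def toy_maze(seed_or_rng=None, width=8, height=4):
    """Stand-in generator: random rows of walls '#' and free tiles '.'."""
    rng = as_rng(seed_or_rng)
    return ["".join(rng.choice("#.") for _ in range(width)) for _ in range(height)]


# The same seed gives the same result whether passed as int or as rng object.
same = toy_maze(42) == toy_maze(random.Random(42))
```

With a helper like this, a regression test could run the old and the new algorithm on the same seed and compare the resulting layouts.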

Half a second is too much to be run at every game

Okay, but most of this time is importing pelita modules which are not required at all in the maze generation.
This is due to the root __init__.py.
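
A common way to avoid paying for unused imports is to defer them until first use. A minimal sketch of that pattern follows, with stdlib modules standing in for the heavy dependencies; timed_import and lazy_module are hypothetical helpers, and this is not how pelita is currently structured.

```python
import importlib
import time


def timed_import(name):
    """Import `name` and return (module, seconds spent in the import)."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


def lazy_module(name):
    """Return a loader that imports `name` only when first called."""
    cache = {}

    def load():
        if "module" not in cache:
            cache["module"] = importlib.import_module(name)
        return cache["module"]

    return load


# Nothing is imported here; the cost is paid on the first get_json() call.
get_json = lazy_module("json")
encoded = get_json().dumps([1, 2, 3])

# Measure what an import costs (near zero if the module is already loaded).
_, seconds = timed_import("fractions")
```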

Given the failing test it seems to me that the change also changes the generated maze.

The tests fail because I removed find_chamber and introduced a new find_chambers (note the s).

otizonaizit · 2025-03-10T10:42:28Z

Half a second is too much to be run at every game, but if we no longer need to remove dead ends and chambers, the whole thing would be even faster, no? And in that case we could relax our requirements. We could hard code that we want to "trap" a maximum of 33% of food pellets in chambers/dead-ends, and then do our best depending on how many "trapped" tiles we have available on the fly.

@Debilski : what do you think of this approach? I would try together with @jbdyn to implement it on Wednesday in a monster PR. Plan:

maze generation on the fly
configurable proportion of "trapped" food, intended as a best effort: if there are not enough trapped tiles, then less food will be trapped than requested
remove the functionality to remove chambers/dead-ends (could be kept somewhere commented out so that we don't have to re-invent the wheel in the future)
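
The best-effort rule in the plan above could be sketched roughly as follows. The function name, the tile representation as (x, y) tuples, and the choice to place the remaining food on open tiles are all assumptions for illustration, not an agreed design.

```python
import random


def place_food(trapped_tiles, open_tiles, n_food, max_trapped_percent=33, rng=None):
    """Pick food positions, trapping at most `max_trapped_percent` of them.

    Best effort: if there are fewer trapped tiles than requested, less food
    is trapped and the remainder goes on open tiles (assumed behaviour).
    """
    rng = rng or random.Random()
    wanted = n_food * max_trapped_percent // 100
    n_trapped = min(wanted, len(trapped_tiles))
    n_open = min(n_food - n_trapped, len(open_tiles))
    trapped_food = rng.sample(trapped_tiles, n_trapped)
    open_food = rng.sample(open_tiles, n_open)
    return trapped_food, open_food


# Only two trapped tiles exist, although 9 of 30 pellets were requested there.
trapped = [(1, 1), (1, 2)]
free = [(x, y) for x in range(2, 8) for y in range(1, 6)]
in_chambers, in_open = place_food(trapped, free, 30, rng=random.Random(0))
```

In this example two pellets end up trapped and the other 28 are spread over the open tiles.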

Debilski · 2025-03-10T10:48:41Z

@jbdyn Could you rebase this branch on the updated main? This will make comparisons easier.

Debilski · 2025-03-10T13:06:10Z

Let me see if I can fix #854 before Wednesday as this is probably useful for testing (although shell redirection will already do the job well).

Can we be certain though that the same maze will be generated everywhere? And even if this is the case, I see a few UX problems:

Currently, we have 1000 mazes and it is always clear which one is used.
This makes it less overwhelming for the students than having billions of mazes and they can pick any maze and easily discuss it.
They can also easily copy and paste it and maybe modify it to test something specific. A maze is always in a file (or in a string).
If we want to change the 1000 mazes, we can simply recreate new ones and ship them.

Cons in the new approach:

Live-generated mazes can only have a seed as an id. The seed must be short enough to be useful.
If students want to modify a maze, they will have to pipe it into a file or use special Python code to read and save it.
If we want to change which maze is generated from --seed 1, we have to hard-code a salt value into Pelita (which is used to generate the real seed) and change that in code.

Not unsolvable and maybe not too bad (I dislike the salt here, though. Maybe you have a better idea.) but I wanted to note these things.

jbdyn · 2025-03-10T14:36:34Z

Could you rebase this branch on the updated main? This will make comparisons easier.

@Debilski Done.

otizonaizit · 2025-03-10T17:22:31Z

Can we be certain though that the same maze will be generated everywhere? And even if this is the case, I see a few UX problems:

Currently, we have 1000 mazes and it is always clear which one is used.

This makes it less overwhelming for the students than having billions of mazes and they can pick any maze and easily discuss it.

in my experience everyone has always assumed that the mazes are generated at run time. The team that asked about it was very surprised when I said that they are indeed pre-generated and that was the moment when they chose to hardcode behavior dependent on the layout name. We decided to have 1000 mazes to make it impossible to do this. Having mazes generated at run time will not surprise anyone, as far as my experience tells me.

They can also easily copy and paste it and maybe modify it to test something specific. A maze is always in a file (or in a string).

But that is still possible:

pelita --seed XXX --dump-layout /tmp/my.layout

With --seed you replay on the same maze with the same food. With --dump-layout you save to a string, modify the layout at will and reload later with

pelita --layout /tmp/my.layout

Given that my.layout will contain the food, you will be able to play on exactly the situation you want to have. You have full control.

If we want to change the 1000 mazes, we can simply recreate new ones and ship them.

But hey, there's no need to do it if maze generation is fast.

Cons in the new approach:

Live-generated mazes can only have a seed as an id. The seed must be short enough to be useful.

You lost me here. Why does a maze need an id?

If students want to modify a maze, they will have to pipe it into a file or use special Python code to read and save it.

See the --dump-layout option above

If we want to change which maze is generated from --seed 1, we have to hard-code a salt value into Pelita (which is used to generate the real seed) and change that in code.

I am even more lost here. Why do we care about --seed 1?

Debilski · 2025-03-10T18:15:15Z

in my experience everyone has always assumed that the mazes are generated at run time. The team that asked about it was very surprised when I said that they are indeed pre-generated and that was the moment when they chose to hardcode behavior dependent on the layout name. We decided to have 1000 mazes to make it impossible to do this. Having mazes generated at run time will not surprise anyone, as far as my experience tells me.

Survivorship bias? Maybe teams that were not confused about the layouts never asked? 🙃

The problem I am describing boils down to: How do two separate groups in a team communicate which layout they should use for testing.

The first group notices a problem and tells the other group that they should compare it with their own implementation or whatever.

Currently: The first group has the UI open, sees the layout name there and communicates this to the second group.

With auto-generated layouts, they must scroll back to find the seed on the command line (buried in lines of debugging output). And then they must recite something like 20 numbers to the other group so that they can use this as a seed to have the same layout. (And this assumes that this is actually stable between different computers.)

My suggested solution here was to generate a short id from the main seed that is given to the layout generator. This id would be shown in the UI and could be used for communication. (The second group would for example use --layout 123 to access the layout that the first group sees.) The drawback here is that all layouts with nice seeds are then fixed forever (or until the maze-generating algorithm changes), hence my suggestion to add a salt. But I am not a fan of this idea either.
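
To make the salt idea concrete, here is a minimal sketch in which a short layout id plus a hard-coded salt is hashed into the real seed. SALT, real_seed, and the use of SHA-256 are assumptions for illustration; nothing like this exists in Pelita as far as this thread shows.

```python
import hashlib

# Hypothetical salt; changing it changes which maze every short id maps to.
SALT = "layouts-v1"


def real_seed(layout_id, salt=SALT):
    """Derive a 64-bit seed from a short, easy-to-communicate layout id."""
    digest = hashlib.sha256(f"{salt}:{layout_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


seed_a = real_seed(123)
seed_b = real_seed(123, salt="layouts-v2")
```

The id shown in the UI stays short, while the generator still receives a full-size seed. The drawback named above remains: the mapping is fixed until the salt or the algorithm changes.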

Obviously people can save and send around layout files, but this makes things quite a bit more involved compared to what we had before.

Some thinking later:

What we could do is suggest to the teams that they pre-generate their own set of mazes for themselves in their group repo and whenever pelita is run with pelita --layout ./folder, it will draw a random layout from that folder. Then they can decide on their own naming scheme.

Potential breakage, though: There will be this one team who does this, generates only three mazes, and then loses because they made some hard-coded assumptions about these mazes.

Debilski · 2025-03-10T18:22:20Z

I am starting to like the pregen idea. If we had a command that does this automatically, we could tell the teams to use it on the first day or so:

> pelita-gen-layout --count 1000 ./layout_folder
Generating 1000 layouts in ./layout_folder. Re-run with --seed 1244546245
.............done.

And the layout_folder then contains ./layout_folder/0000.layout or something like that.
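
A rough sketch of what such a command could do internally, together with the "draw a random layout from that folder" step. generate_layouts, pick_layout, and toy_layout are hypothetical names, and toy_layout only writes a placeholder grid, not a valid Pelita layout.

```python
import random
from pathlib import Path


def toy_layout(rng, width=8, height=4):
    """Placeholder layout: a walled box with random interior walls."""
    rows = ["#" * width]
    for _ in range(height - 2):
        inner = "".join(rng.choice("# ") for _ in range(width - 2))
        rows.append("#" + inner + "#")
    rows.append("#" * width)
    return "\n".join(rows) + "\n"


def generate_layouts(folder, count, seed):
    """Write `count` layouts as 0000.layout, 0001.layout, ... into `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    rng = random.Random(seed)  # one seed reproduces the whole folder
    paths = []
    for i in range(count):
        path = folder / f"{i:04d}.layout"
        path.write_text(toy_layout(rng))
        paths.append(path)
    return paths


def pick_layout(folder, rng=None):
    """Draw one layout file at random from `folder`."""
    rng = rng or random.Random()
    return rng.choice(sorted(Path(folder).glob("*.layout")))
```

Re-running generate_layouts with the same seed rewrites identical files, which is what the "Re-run with --seed" hint in the output above suggests.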

otizonaizit · 2025-03-10T19:31:52Z

Really, I have never ever seen anyone being obsessed about replaying on the same maze. When this happened, it was always connected to reusing the very same seed, i.e. replaying the very same game. I have seen seeds saved in files and communicated through git. Never layout names. Even in the last TU course, a group had a collection of seeds to explore because of problems with the corresponding games. Having the same layout but a different seed seems to me a very exotic debugging configuration (which is of course still achievable, just not by using an id shortcut).

If there is a very particular maze where something absurd happens, then you reuse the seed and use --dump-layout to replay with different seeds, but again, I have a hard time imagining when this may be relevant.

Debilski · 2025-03-11T16:39:55Z

I have seen seeds saved in files and communicated through git. Never layout names.

That’s my point, though. There is no need to write down a layout name or put it into git, they would just shout it over the table.

But I was just listing the differences that I see. If you think they can be neglected then I am fine with that.

Optimize finding chambers and walls_to_graph

b01cbc5

Make tests for maze generation pass

1e8f662

Debilski mentioned this pull request Mar 10, 2025

Fix maze_generator still using old seed argument to get_new_maze #857

Merged

otizonaizit mentioned this pull request Mar 10, 2025

introduce a database of mazes with descriptive metadata, for example number and length of dead-ends, etc. #849

Open

Merge branch 'main' into optimize-maze-generator

9929424
