Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel Stop Fix #468

Closed

Conversation

rakhmaevao
Copy link

Hello. I found this floating problem. If you start sessions and stop them, then the 500th error occurs (see the tracing and video below).

Screencast.from.2025-03-08.14-37-26.mp4
  File "/home/rakhmaevao/Projects/jupyverse/plugins/kernels/fps_kernels/kernel_server/server.py", line 137, in start
    await self._wait_for_ready()
  File "/home/rakhmaevao/Projects/jupyverse/plugins/kernels/fps_kernels/kernel_server/server.py", line 233, in _wait_for_ready
    msg = await receive_message(self.iopub_channel, timeout=0.2)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rakhmaevao/Projects/jupyverse/plugins/kernels/fps_kernels/kernel_driver/message.py", line 136, in receive_message
    ready = await sock.apoll(timeout).wait()
                  ^^^^^^^^^^^^^^^^^^^
  File "/home/rakhmaevao/.local/share/hatch/env/virtual/jupyverse/u_XNhuOG/dev.jupyterlab-auth/lib/python3.12/site-packages/zmq_anyio/_socket.py", line 503, in apoll
    self._check_started()
  File "/home/rakhmaevao/.local/share/hatch/env/virtual/jupyverse/u_XNhuOG/dev.jupyterlab-auth/lib/python3.12/site-packages/zmq_anyio/_socket.py", line 939, in _check_started
    self._task_group.start_soon(self._start)
  File "/home/rakhmaevao/.local/share/hatch/env/virtual/jupyverse/u_XNhuOG/dev.jupyterlab-auth/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 880, in start_soon
    self._spawn(func, args, name)
  File "/home/rakhmaevao/.local/share/hatch/env/virtual/jupyverse/u_XNhuOG/dev.jupyterlab-auth/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 832, in _spawn
    raise RuntimeError(
RuntimeError: This task group is not active; no new tasks can be started

I fixed this error (see plugins/kernels/fps_kernels/kernel_server/server.py). I guarantee that sockets will close. Now the problem is gone.

To eliminate this problem in the future, I wrote a test. This test is e2e. That's why I put it in a separate package. I want to expand the tests in the future. I have a few more cases that I found. But first I want to make sure that you like this way of fixing bugs. The method is to write tests.

I also added the CONTRIBUTING.md file. I think this will help the community.

I also deleted the .fileid.db file and moved it to .gitignore. It may have ended up in the index by accident.

@rakhmaevao
Copy link
Author

@davidbrochart, Hi, I think this is an important fix. Maybe you could take a look?

@rakhmaevao
Copy link
Author

If you don't like my tests, maybe we can at least merge the fix itself?

plugins/kernels/fps_kernels/kernel_server/server.py

@davidbrochart

@davidbrochart
Copy link
Collaborator

I didn't have time to look at this yet, please bear with me.

await tg.start(self.shell_channel.stop)
await tg.start(self.stdin_channel.stop)
await tg.start(self.control_channel.stop)
await tg.start(self.iopub_channel.stop)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this fix the bug?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. Sorry for being pushy. I don't want to seem like a bore. But I really want to contribute to this project.

The solution is to ensure that sockets are closed when the function is finished. Otherwise, we would exit the taskgroup without waiting for it to finish. This will allow them to be opened correctly when restarting.

My test passes on my commit. It fails on the main branch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your interest in the project, but I don't think your PR fixes the bug.
I'm working on a fix, I will open a PR soon.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent. Then I will close this PR. And I will wait for yours. I will be able to run my test on it, so that you can be sure of the fact of the correction.

@rakhmaevao rakhmaevao closed this Mar 25, 2025
@rakhmaevao rakhmaevao mentioned this pull request Mar 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants