Processes inside the container cannot create new threads #493

@winrunwang

Description

Error output from production:

/opt/node/v22.2.0/bin/node[78]: std::unique_ptr node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start() at ../src/node_platform.cc

Assertion failed: (0) == (uv_thread_create(t.get(), start_thread, this))

Native stack trace:

1: 0xf66937 node::Assert(node::AssertionInfo const&) [/opt/node/v22.2.0/bin/node]
2: 0xf6f3e node::WorkerThreadsTaskRunner::WorkerThreadsTaskRunner(int) [/opt/node/v22.2.0/bin/node]
3: 0xfbc7b node::NodePlatform::NodePlatform(int, v8::TracingController*, v8::PageAllocator*) [/opt/node/v22.2.0/bin/node]
4: 0xf1d21c node::Start(int, char**) [/opt/node/v22.2.0/bin/node]
5: 0x7f300064f1ca __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
6: 0x7f300064f28b _start [/lib/x86_64-linux-gnu/libc.so.6]
7: 0xe67fee _start [/opt/node/v22.2.0/bin/node]
[I 2026-03-19 02:14:53.959 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2026-03-19 02:14:53.968 ServerApp] jupyter_server_terminals | extension was successfully linked.
[W 2026-03-19 02:14:53.977 ServerApp] 'token' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before
[I 2026-03-19 02:14:53.977 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2026-03-19 02:14:53.984 ServerApp] notebook | extension was successfully linked.
[I 2026-03-19 02:14:53.986 ServerApp] notebook | extension was successfully loaded.
[I 2026-03-19 02:14:54.590 ServerApp] Writing Jupyter server cookie secret to /root/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2026-03-19 02:14:54.590 ServerApp] notebook_shim | extension was successfully linked.
[I 2026-03-19 02:14:54.523 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2026-03-19 02:14:54.525 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2026-03-19 02:14:54.530 LabApp] JupyterLab extension loaded from /opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/site-packages/jupyterlab
[I 2026-03-19 02:14:54.530 LabApp] JupyterLab application directory is /opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/share/jupyter/lab
[I 2026-03-19 02:14:54.531 LabApp] Extension Manager is 'pippi'.
[I 2026-03-19 02:14:54.597 ServerApp] jupyterlab | extension was successfully loaded.
[I 2026-03-19 02:14:54.607 ServerApp] notebook | extension was successfully loaded.
[I 2026-03-19 02:14:54.608 ServerApp] Serving notebooks from local directory: /workspace
[I 2026-03-19 02:14:54.608 ServerApp] Jupyter Server 2.17.0 is running at:
[I 2026-03-19 02:14:54.608 ServerApp] http://127.0.0.1:44771/tree?token=...
[I 2026-03-19 02:14:54.608 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[E 2026-03-19 02:14:54.617 ServerApp] Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop
<Task finished name='Task-2' coro=<initialize() done, defined at /opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/site-packages/jupyter_lsp/server
TimeoutError("can't start new thread")>
Traceback (most recent call last):
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/site-packages/tornado/ioloop.py", line 758, in _run_callback
ret = callback()
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/site-packages/tornado/ioloop.py", line 782, in _discard_future_result
future.result()
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/site-packages/jupyter_lsp/serverextension.py", line 24, in initialize
await nbapp.io_loop.run_in_executor(pool, manager.initialize)
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/site-packages/tornado/platform/asyncio.py", line 272, in run_in_executor
return self.asyncio_loop.run_in_executor(executor, func, *args)
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 867, in run_in_executor
executor.submit(func, *args), loop=self)
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/concurrent/futures/thread.py", line 180, in submit
self._adjust_thread_count()
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/concurrent/futures/thread.py", line 203, in _adjust_thread_count
t.start()
File "/opt/python/versions/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/threading.py", line 994, in start
_start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

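An AI tool's analysis of the logs reached the following conclusion:

From the Docker logs you provided, the core problem is clear: processes inside the container cannot create new threads.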

Error analysis

The logs repeatedly contain the following key error messages:

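  1. Go runtime crash
    runtime/cgo: pthread_create failed: Operation not permitted
    This indicates that when the Go program called the underlying system function pthread_create to create a thread, the operating system (Linux) refused the call and returned "Operation not permitted".
  2. Node.js crash
    Assertion failed: (0) == (uv_thread_create(t.get(), start_thread, this))
    Node.js is built on libuv and likewise failed while trying to create worker threads, so the assertion failed and the process exited.
  3. Python/Jupyter exception
    RuntimeError: can't start new thread
    Python's threading module or an async executor raised an exception while trying to start a new thread.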
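To check whether a given container is currently in this state, a small probe can try to start a single thread and report the result. This is only a sketch: `can_start_thread` is a hypothetical helper name, not part of opensandbox or Jupyter, and it relies on CPython raising `RuntimeError` when thread creation fails, as in the traceback above.

```python
import threading
from typing import Optional, Tuple


def can_start_thread(timeout: float = 5.0) -> Tuple[bool, Optional[str]]:
    """Try to start one short-lived thread and report whether it ran.

    Returns (True, None) on success, or (False, message) when CPython
    refuses to create the thread.
    """
    started = threading.Event()
    worker = threading.Thread(target=started.set, daemon=True)
    try:
        worker.start()
    except RuntimeError as exc:
        # Same failure mode as "RuntimeError: can't start new thread".
        return False, str(exc)
    worker.join(timeout)
    return started.is_set(), None


if __name__ == "__main__":
    ok, err = can_start_thread()
    print("thread creation ok" if ok else f"thread creation failed: {err}")
```

Running this inside an affected sandbox at the moment the errors appear would show whether the failure affects every new thread in the container or only specific processes.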

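Conclusion: this is not a logic error in the code. It is caused by the Docker container's security restrictions or resource limits. Your container environment lacks the permissions (capabilities) needed to create threads, or it is subject to strict cgroup limits.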
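One way to test the resource-limit half of that conclusion is to read the limits the kernel exposes to the container. The sketch below assumes a Linux container with either cgroup v2 (`/sys/fs/cgroup/pids.max`) or the cgroup v1 pids controller (`/sys/fs/cgroup/pids/pids.max`) mounted at the usual paths. Any value it cannot read is reported as `None`. The function names are hypothetical, and the script does not inspect seccomp profiles or capabilities.

```python
import resource
from pathlib import Path
from typing import Dict, Optional

# Assumed mount points: cgroup v2 (unified) first, then the cgroup v1 pids controller.
CGROUP_DIRS = (Path("/sys/fs/cgroup"), Path("/sys/fs/cgroup/pids"))


def read_text_or_none(path: Path) -> Optional[str]:
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def read_cgroup_value(name: str) -> Optional[str]:
    """Read a pids-controller file from the first cgroup location that has it."""
    for base in CGROUP_DIRS:
        value = read_text_or_none(base / name)
        if value is not None:
            return value
    return None


def format_rlimit(value: int) -> str:
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def thread_limits() -> Dict[str, Optional[str]]:
    """Collect the limits that commonly cap thread creation on Linux."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    threads = None
    status = read_text_or_none(Path("/proc/self/status"))
    if status is not None:
        for line in status.splitlines():
            if line.startswith("Threads:"):
                threads = line.split()[1]
    return {
        "rlimit_nproc_soft": format_rlimit(soft),
        "rlimit_nproc_hard": format_rlimit(hard),
        "cgroup_pids_max": read_cgroup_value("pids.max"),
        "cgroup_pids_current": read_cgroup_value("pids.current"),
        "threads_in_this_process": threads,
        "kernel_threads_max": read_text_or_none(Path("/proc/sys/kernel/threads-max")),
    }


if __name__ == "__main__":
    for key, value in thread_limits().items():
        print(f"{key}: {value}")
```

If `cgroup_pids_current` sits close to `cgroup_pids_max` while the errors occur, a pids limit is a plausible cause. If both are far apart, the security-restriction explanation deserves a closer look.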

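  1. I am using
    opensandbox-server==0.1.2
    The code-interpreter image is a custom build based on v1.0.1, with a few packages added for our business needs.
    We are not deploying on k8s yet; that is still being evaluated. For now it runs as a single Docker deployment on a cloud host. Production will move to k8s later.
    All other configuration is left at the defaults, with almost no changes.
    The problem has never appeared in the development environment. It appears occasionally in the test environment, and frequently in production, which has the highest call volume.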
