Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

undefined method superviseesfor nil #430

Open
wdiechmann opened this issue Nov 27, 2024 · 17 comments
Open

undefined method superviseesfor nil #430

wdiechmann opened this issue Nov 27, 2024 · 17 comments
Assignees

Comments

@wdiechmann
Copy link

https://github.com/rails/solid_queue/blame/c521c6d47d10abeaf076a81d9af29e80eb2b11c7/lib/solid_queue/supervisor.rb#L174

probably should look something like process&.supervisees.find_by... or

if registered_process = process.supervisees.find_by(name: terminated_fork.name) rescue false
@rosa
Copy link
Member

rosa commented Nov 28, 2024

Huh, yes, but I'm curious to know when you reach this line with process being nil 🤔 In theory, this should never be executed before the supervisor has been registered. I must be missing a scenario where this might happen, but not sure what it'd be. Do you have more context about when you hit this?

@wdiechmann
Copy link
Author

I got the background job to work - 99% thx to you - and then I let it rip for a few hours - and once I got back to my desk, the log told me that there where no supervisees hanging out with the nil process 😉

I'll make sure to try to sample better context when this happens again - if it does

@rosa rosa self-assigned this Dec 6, 2024
@HashNuke
Copy link

HashNuke commented Jan 13, 2025

Just found this issue. I only have the occassional jobs to send an email. Nothing that adds to load.

Environment information

  • Development environment
  • Procfile has this line to start jobs -jobs: bin/jobs start
  • The only config related to solidqueue in my development.rb are these lines
  config.active_job.queue_adapter = :solid_queue
  config.solid_queue.connects_to = { database: { writing: :queue } }

Log output

10:13:45 jobs.1        | /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:174:in `handle_claimed_jobs_by': undefined method `supervisees' for nil (NoMethodError)
10:13:45 jobs.1        |
10:13:45 jobs.1        |         if registered_process = process.supervisees.find_by(name: terminated_fork.name)
10:13:45 jobs.1        |                                        ^^^^^^^^^^^^
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:166:in `block in replace_fork'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/activesupport-8.0.0/lib/active_support/notifications.rb:210:in `block in instrument'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/activesupport-8.0.0/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/activesupport-8.0.0/lib/active_support/notifications.rb:210:in `instrument'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue.rb:67:in `instrument'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:163:in `replace_fork'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:143:in `block in reap_and_replace_terminated_forks'
10:13:45 jobs.1        | 	from <internal:kernel>:187:in `loop'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:139:in `reap_and_replace_terminated_forks'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:67:in `block in supervise'
10:13:45 jobs.1        | 	from <internal:kernel>:187:in `loop'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:60:in `supervise'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:36:in `start'
10:13:45 jobs.1        | 	from <internal:kernel>:90:in `tap'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:14:in `start'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/cli.rb:26:in `start'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor/command.rb:28:in `run'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor/invocation.rb:127:in `invoke_command'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor.rb:538:in `dispatch'
10:13:45 jobs.1        | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor/base.rb:584:in `start'
10:13:45 jobs.1        | 	from bin/jobs:6:in `<main>'
10:13:46 jobs.1        | exited with code 1

@HashNuke
Copy link

HashNuke commented Jan 13, 2025

I just got this error again, this time I saved some additional events from the logs immediately before the error.

Just posting it here so that I can come back to debug this later.

13:22:17 web.1         | SolidQueue-1.1.2 Shutdown Dispatcher (1.6ms)  pid: 79216, hostname: "wheeljack", process_id: nil, name: "dispatcher-a281b20f5a6841dec145", polling_interval: 1, batch_size: 500, concurrency_maintenance_interval: 600
13:22:17 web.1         |   SolidQueue::Semaphore Pluck (27.2ms)  SELECT "solid_queue_semaphores"."id" FROM "solid_queue_semaphores" WHERE "solid_queue_semaphores"."expires_at" < '2025-01-13 06:22:17.669247' ORDER BY "solid_queue_semaphores"."id" ASC LIMIT 500 /*application='ExampleApp'*/
13:22:18 web.1         | SolidQueue-1.1.2 Replaced terminated Worker (4.8ms)  pid: 79217, status: 0, pid_from_status: 79217, signaled: false, stopsig: nil, termsig: nil, hostname: "wheeljack", name: "worker-4d5487c9585de0b0676e"
13:22:18 web.1         | SolidQueue-1.1.2 Shutdown Supervisor (1.6ms)  pid: 79205, hostname: "wheeljack", process_id: nil, name: "supervisor-58a3a355f1fe2b033ab7"
13:22:18 web.1         | ** [Honeybadger] Reporting error id=55300755-99d1-47b2-a451-2a65d87e4074 level=1 pid=79205
13:22:18 web.1         | ** [Honeybadger] Success ⚡ Development mode is enabled; this error will be reported if it occurs after you deploy your app. id=55300755-99d1-47b2-a451-2a65d87e4074 level=1 pid=79205
13:22:18 web.1         | /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:174:in `handle_claimed_jobs_by': undefined method `supervisees' for nil (NoMethodError)
13:22:18 web.1         |
13:22:18 web.1         |         if registered_process = process.supervisees.find_by(name: terminated_fork.name)
13:22:18 web.1         |                                        ^^^^^^^^^^^^
13:22:18 web.1         | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:166:in `block in replace_fork'
13:22:18 web.1         | 	from /Users/akash/.asdf/installs/ruby/3.3.5/lib/ruby/gems/3.3.0/gems/activesupport-8.0.0/lib/active_support/notifications.rb:210:in `block in instrument'

... the same error as before ...

@rosa
Copy link
Member

rosa commented Jan 13, 2025

Thanks a lot for these extra logs, @HashNuke! These are very helpful as I haven't been able to reproduce this 🙏

@tahirwaseer
Copy link

@rosa I'm also running into this quite frequently, maybe three or four times daily. Posting the logs if they are helpful.

Setup:

  • solid-queue-1.1.0
  • rails 7.1.2
  • ruby 3.2.0
  • Docker container
  • env=development
  • enqued 1000s of jobs
  • I'm running 3-4 workers with this config for each of them
default: &default
  dispatchers:
    - polling_interval: 1
      batch_size: 500
  workers:
    - queues: "*"
      threads: 3 # You can manage these threads by altering the connection pool for puma, n-2 is the formula is followed for solid queue threads, where n is the no of connections for puma
      processes: <%= ENV.fetch("JOB_CONCURRENCY", 1) %>
      polling_interval: 0.1

I'll try with the latest release later.

logs

/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:174:in `handle_claimed_jobs_by': undefined method `supervisees' for nil:NilClass (NoMethodError)

        if registered_process = process.supervisees.find_by(name: terminated_fork.name)
                                       ^^^^^^^^^^^^
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:166:in `block in replace_fork'
	from /usr/local/bundle/gems/activesupport-7.1.2/lib/active_support/notifications.rb:206:in `block in instrument'
	from /usr/local/bundle/gems/activesupport-7.1.2/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
	from /usr/local/bundle/gems/activesupport-7.1.2/lib/active_support/notifications.rb:206:in `instrument'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue.rb:67:in `instrument'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:163:in `replace_fork'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:143:in `block in reap_and_replace_terminated_forks'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:139:in `loop'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:139:in `reap_and_replace_terminated_forks'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:67:in `block in supervise'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:60:in `loop'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:60:in `supervise'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:36:in `start'
	from <internal:kernel>:90:in `tap'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:14:in `start'
	from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/cli.rb:26:in `start'
	from /usr/local/bundle/gems/thor-1.3.2/lib/thor/command.rb:28:in `run'
	from /usr/local/bundle/gems/thor-1.3.2/lib/thor/invocation.rb:127:in `invoke_command'
	from /usr/local/bundle/gems/thor-1.3.2/lib/thor.rb:538:in `dispatch'
	from /usr/local/bundle/gems/thor-1.3.2/lib/thor/base.rb:584:in `start'
	from bin/jobs:6:in `<main>'

/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:81: [BUG] Segmentation fault at 0x0000000000087082
ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0015 p:---- s:0082 e:000081 CFUNC  :fork
c:0014 p:0013 s:0078 e:000077 METHOD /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:81
c:0013 p:0005 s:0071 e:000070 BLOCK  /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:56 [FINISH]
c:0012 p:---- s:0067 e:000066 CFUNC  :each
c:0011 p:0008 s:0063 e:000062 METHOD /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:56
c:0010 p:0011 s:0059 e:000058 METHOD /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:33
c:0009 p:0003 s:0055 e:000054 METHOD <internal:kernel>:90
c:0008 p:0033 s:0051 e:000050 METHOD /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:14
c:0007 p:0009 s:0045 e:000044 METHOD /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/cli.rb:26
c:0006 p:0054 s:0041 e:000040 METHOD /usr/local/bundle/gems/thor-1.3.2/lib/thor/command.rb:28
c:0005 p:0040 s:0033 e:000032 METHOD /usr/local/bundle/gems/thor-1.3.2/lib/thor/invocation.rb:127
c:0004 p:0213 s:0026 e:000025 METHOD /usr/local/bundle/gems/thor-1.3.2/lib/thor.rb:538
c:0003 p:0044 s:0013 e:000012 METHOD /usr/local/bundle/gems/thor-1.3.2/lib/thor/base.rb:584
c:0002 p:0018 s:0006 e:000005 EVAL   bin/jobs:6 [FINISH]
c:0001 p:0000 s:0003 E:0002f0 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
bin/jobs:6:in `<main>'
/usr/local/bundle/gems/thor-1.3.2/lib/thor/base.rb:584:in `start'
/usr/local/bundle/gems/thor-1.3.2/lib/thor.rb:538:in `dispatch'
/usr/local/bundle/gems/thor-1.3.2/lib/thor/invocation.rb:127:in `invoke_command'
/usr/local/bundle/gems/thor-1.3.2/lib/thor/command.rb:28:in `run'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/cli.rb:26:in `start'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:14:in `start'
<internal:kernel>:90:in `tap'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:33:in `start'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:56:in `start_processes'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:56:in `each'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:56:in `block in start_processes'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:81:in `start_process'
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:81:in `fork'

-- Machine register context ------------------------------------------------
 RIP: 0x00002aaaab5cab23 RBP: 0x000055555555dfd0 RSP: 0x00002aaaab2aad30
 RAX: 0x0000000000000000 RBX: 0x00002aaad2d82490 RCX: 0x0000000000000000
 RDX: 0x000000000008707a RDI: 0x00002aaad2d82490 RSI: 0x000055555555dfd0
  R8: 0x0000000000000db8  R9: 0x00002aab43854e50 R10: 0x00002aab43854e40
 R11: 0x0000555556979cb0 R12: 0x00002aab2c35ee20 R13: 0x00002aaaab607610
 R14: 0x00002aaaab9236a0 R15: 0x00005555555b32c0 EFL: 0x0000000000000202

-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.2(rb_print_backtrace+0x11) [0x2aaaab63258f] vm_dump.c:785
/usr/local/lib/libruby.so.3.2(rb_vm_bugreport) vm_dump.c:1080
/usr/local/lib/libruby.so.3.2(rb_bug_for_fatal_signal+0xf0) [0x2aaaab42d8c0] error.c:813
/usr/local/lib/libruby.so.3.2(sigsegv+0x49) [0x2aaaab5847d9] signal.c:964
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x2aaaabb74140]
/usr/local/lib/libruby.so.3.2(ccan_list_del_+0x0) [0x2aaaab5cab23] thread_sync.c:453
/usr/local/lib/libruby.so.3.2(ccan_list_del_init_) ccan/list/list.h:298
/usr/local/lib/libruby.so.3.2(rb_mutex_unlock_th) thread_sync.c:451
/usr/local/lib/libruby.so.3.2(rb_threadptr_unlock_all_locking_mutexes+0x30) [0x2aaaab5d03d2] thread_sync.c:443
/usr/local/lib/libruby.so.3.2(rb_thread_terminate_all) thread.c:451
/usr/local/lib/libruby.so.3.2(rb_multi_ractor_p+0x0) [0x2aaaab532329] ractor.c:1958
/usr/local/lib/libruby.so.3.2(rb_vm_lock) vm_sync.h:56
/usr/local/lib/libruby.so.3.2(rb_ractor_terminate_all) ractor.c:1960
/usr/local/lib/libruby.so.3.2(rb_ec_cleanup+0x1f9) [0x2aaaab438969] eval.c:239
/usr/local/lib/libruby.so.3.2(ruby_stop+0x9) [0x2aaaab438c39] eval.c:300
/usr/local/lib/libruby.so.3.2(rb_f_fork+0x1f) [0x2aaaab52b1b4] process.c:4446
/usr/local/lib/libruby.so.3.2(rb_f_fork) process.c:4436
/usr/local/lib/libruby.so.3.2(vm_cfp_consistent_p+0x0) [0x2aaaab60772c] vm_insnhelper.c:3252
/usr/local/lib/libruby.so.3.2(vm_call_cfunc_with_frame) vm_insnhelper.c:3254
/usr/local/lib/libruby.so.3.2(vm_sendish+0xcc) [0x2aaaab60d35c] vm_insnhelper.c:5064
/usr/local/lib/libruby.so.3.2(vm_exec_core+0x1cf) [0x2aaaab6179bf] insns.def:801
/usr/local/lib/libruby.so.3.2(rb_vm_exec+0x5aa) [0x2aaaab61d38a] vm.c:2383
/usr/local/lib/libruby.so.3.2(rb_yield+0x1e2) [0x2aaaab621a82] vm.c:1398
/usr/local/lib/libruby.so.3.2(RB_FL_TEST_RAW+0x0) [0x2aaaab39548c] array.c:2740
/usr/local/lib/libruby.so.3.2(RB_FL_ANY_RAW) ./include/ruby/internal/fl_type.h:552
/usr/local/lib/libruby.so.3.2(rb_array_len) ./include/ruby/internal/core/rarray.h:321
/usr/local/lib/libruby.so.3.2(rb_ary_each) array.c:2739
/usr/local/lib/libruby.so.3.2(vm_cfp_consistent_p+0x0) [0x2aaaab60772c] vm_insnhelper.c:3252
/usr/local/lib/libruby.so.3.2(vm_call_cfunc_with_frame) vm_insnhelper.c:3254
/usr/local/lib/libruby.so.3.2(vm_sendish+0xcc) [0x2aaaab60d35c] vm_insnhelper.c:5064
/usr/local/lib/libruby.so.3.2(vm_exec_core+0x1cf) [0x2aaaab6179bf] insns.def:801
/usr/local/lib/libruby.so.3.2(rb_vm_exec+0x5aa) [0x2aaaab61d38a] vm.c:2383
/usr/local/lib/libruby.so.3.2(rb_ec_exec_node+0xa8) [0x2aaaab432a78] eval.c:289
/usr/local/lib/libruby.so.3.2(ruby_run_node+0x93) [0x2aaaab438cd3] eval.c:330
/usr/local/bin/ruby(rb_main+0x21) [0x555555557113] ./main.c:38
/usr/local/bin/ruby(main) ./main.c:57
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x2aaaabba6d0a]
[0x55555555715a]

-- Other runtime information -----------------------------------------------

* Loaded script: solid-queue-worker(1.1.0): waiting for jobs in *

* Loaded features:

    0 enumerator.so
    1 thread.rb
    2 fiber.so
    3 rational.so
    4 complex.so
    5 ruby2_keywords.rb
    6 /usr/local/lib/ruby/3.2.0/x86_64-linux/enc/encdb.so
    7 /usr/local/lib/ruby/3.2.0/x86_64-linux/enc/trans/transdb.so
    8 /usr/local/lib/ruby/3.2.0/x86_64-linux/rbconfig.rb
    9 /usr/local/lib/ruby/3.2.0/rubygems/compatibility.rb
   10 /usr/local/lib/ruby/3.2.0/rubygems/defaults.rb
   11 /usr/local/lib/ruby/3.2.0/rubygems/deprecate.rb
   12 /usr/local/lib/ruby/3.2.0/rubygems/errors.rb
   13 /usr/local/lib/ruby/3.2.0/rubygems/unknown_command_spell_checker.rb
   14 /usr/local/lib/ruby/3.2.0/rubygems/exceptions.rb
   15 /usr/local/lib/ruby/3.2.0/rubygems/basic_specification.rb
   16 /usr/local/lib/ruby/3.2.0/rubygems/stub_specification.rb
   17 /usr/local/lib/ruby/3.2.0/rubygems/platform.rb
   18 /usr/local/lib/ruby/3.2.0/rubygems/util/list.rb
   19 /usr/local/lib/ruby/3.2.0/rubygems/version.rb
   20 /usr/local/lib/ruby/3.2.0/rubygems/requirement.rb
   21 /usr/local/lib/ruby/3.2.0/rubygems/specification.rb
   22 /usr/local/lib/ruby/3.2.0/rubygems/util.rb
   23 /usr/local/lib/ruby/3.2.0/rubygems/dependency.rb
   24 /usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_gem.rb
   25 /usr/local/lib/ruby/3.2.0/x86_64-linux/monitor.so
   26 /usr/local/lib/ruby/3.2.0/monitor.rb
   27 /usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb
   28 /usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_warn.rb
   29 /usr/local/lib/ruby/3.2.0/rubygems.rb
   30 /usr/local/lib/ruby/3.2.0/rubygems/path_support.rb
   31 /usr/local/lib/ruby/3.2.0/error_highlight/version.rb
   32 /usr/local/lib/ruby/3.2.0/error_highlight/base.rb
   33 /usr/local/lib/ruby/3.2.0/error_highlight/formatter.rb
   34 /usr/local/lib/ruby/3.2.0/error_highlight/core_ext.rb
   35 /usr/local/lib/ruby/3.2.0/error_highlight.rb
   36 /usr/local/lib/ruby/3.2.0/did_you_mean/version.rb
   37 /usr/local/lib/ruby/3.2.0/did_you_mean/core_ext/name_error.rb
   38 /usr/local/lib/ruby/3.2.0/did_you_mean/levenshtein.rb
   39 /usr/local/lib/ruby/3.2.0/did_you_mean/jaro_winkler.rb
   40 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checker.rb
   41 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/name_error_checkers/class_name_checker.rb
   42 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/name_error_checkers/variable_name_checker.rb
   43 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/name_error_checkers.rb
   44 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/method_name_checker.rb
   45 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/key_error_checker.rb
   46 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/null_checker.rb
   47 /usr/local/lib/ruby/3.2.0/did_you_mean/tree_spell_checker.rb
   48 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/require_path_checker.rb
   49 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/pattern_key_name_checker.rb
   50 /usr/local/lib/ruby/3.2.0/did_you_mean/formatter.rb
   51 /usr/local/lib/ruby/3.2.0/did_you_mean.rb
   52 /usr/local/lib/ruby/3.2.0/syntax_suggest/core_ext.rb
   53 /usr/local/lib/ruby/3.2.0/rubygems/bundler_version_finder.rb
   54 /usr/local/lib/ruby/3.2.0/x86_64-linux/pathname.so
   55 /usr/local/lib/ruby/3.2.0/pathname.rb
   56 /usr/local/lib/ruby/3.2.0/bundler/version.rb
   57 /usr/local/lib/ruby/3.2.0/bundler/constants.rb
   58 /usr/local/lib/ruby/3.2.0/bundler/rubygems_integration.rb
   59 /usr/local/lib/ruby/3.2.0/bundler/current_ruby.rb
   60 /usr/local/lib/ruby/3.2.0/bundler/shared_helpers.rb
   61 /usr/local/lib/ruby/3.2.0/bundler/vendor/fileutils/lib/fileutils.rb
   62 /usr/local/lib/ruby/3.2.0/bundler/vendored_fileutils.rb
   63 /usr/local/lib/ruby/3.2.0/bundler/errors.rb
   64 /usr/local/lib/ruby/3.2.0/bundler/environment_preserver.rb
   65 /usr/local/lib/ruby/3.2.0/bundler/plugin/api.rb
   66 /usr/local/lib/ruby/3.2.0/bundler/plugin.rb
   67 /usr/local/lib/ruby/3.2.0/rubygems/text.rb
   68 /usr/local/lib/ruby/3.2.0/rubygems/source/git.rb
   69 /usr/local/lib/ruby/3.2.0/rubygems/source/installed.rb
   70 /usr/local/lib/ruby/3.2.0/rubygems/source/specific_file.rb
   71 /usr/local/lib/ruby/3.2.0/rubygems/source/local.rb
   72 /usr/local/lib/ruby/3.2.0/rubygems/source/lock.rb
   73 /usr/local/lib/ruby/3.2.0/rubygems/source/vendor.rb
   74 /usr/local/lib/ruby/3.2.0/rubygems/source.rb
   75 /usr/local/lib/ruby/3.2.0/bundler/match_metadata.rb
   76 /usr/local/lib/ruby/3.2.0/bundler/force_platform.rb
   77 /usr/local/lib/ruby/3.2.0/bundler/gem_helpers.rb
   78 /usr/local/lib/ruby/3.2.0/bundler/match_platform.rb
   79 /usr/local/lib/ruby/3.2.0/bundler/rubygems_ext.rb
   80 /usr/local/lib/ruby/3.2.0/bundler/build_metadata.rb
   81 /usr/local/lib/ruby/3.2.0/bundler.rb
   82 /usr/local/lib/ruby/3.2.0/bundler/ui.rb
   83 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/command.rb
   84 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/core_ext/hash_with_indifferent_access.rb
   85 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/error.rb
   86 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/invocation.rb
   87 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/nested_context.rb
   88 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/parser/argument.rb
   89 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/parser/arguments.rb
   90 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/parser/option.rb
   91 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/parser/options.rb
   92 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/parser.rb
   93 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/shell.rb
   94 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/line_editor/basic.rb
   95 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/line_editor/readline.rb
   96 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/line_editor.rb
   97 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/util.rb
   98 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/base.rb
   99 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor.rb
  100 /usr/local/lib/ruby/3.2.0/bundler/vendored_thor.rb
  101 /usr/local/lib/ruby/3.2.0/bundler/ui/shell.rb
  102 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/shell/basic.rb
  103 /usr/local/lib/ruby/3.2.0/bundler/vendor/thor/lib/thor/shell/color.rb
  104 /usr/local/lib/ruby/3.2.0/rubygems/user_interaction.rb
  105 /usr/local/lib/ruby/3.2.0/rubygems/ext/builder.rb
  106 /usr/local/lib/ruby/3.2.0/bundler/ui/rg_proxy.rb
  107 /usr/local/lib/ruby/3.2.0/bundler/settings.rb
  108 /usr/local/lib/ruby/3.2.0/bundler/feature_flag.rb
  ...
<<<<<<<<  logs go all the way to the model>>>>>
  ...
 2999 /u/apps/api/app/models/color.rb
 3000 /u/apps/api/app/models/article.rb
 3001 /u/apps/api/app/models/color_group.rb

* Process memory map:

2aaaaaaab000-2aaaaaaac000 ---p 00000000 00:00 0
2aaaaaaac000-2aaaab2ac000 rw-p 00000000 00:00 0                          [stack]
2aaaab2ac000-2aaaab2ad000 r--p 00000000 00:53 470176                     /lib/x86_64-linux-gnu/ld-2.31.so
2aaaab2ad000-2aaaab2cd000 r-xp 00001000 00:53 470176                     /lib/x86_64-linux-gnu/ld-2.31.so
2aaaab2cd000-2aaaab2d5000 r--p 00021000 00:53 470176                     /lib/x86_64-linux-gnu/ld-2.31.so
2aaaab2d5000-2aaaab2d6000 ---p 00000000 00:00 0
..
..
..
2aab4bf70000-2aab4bf80000 rw-p 00000000 00:00 0
2aab4bf88000-2aab4bfbb000 r--s 00000000 00:53 2836918                    /usr/local/bin/ruby
2aab4bfbb000-2aab4d133000 r--s 00000000 00:53 2964141                    /usr/local/lib/libruby.so.3.2.0
2aab4d133000-2aab4e542000 rw-p 00000000 00:00 0
2aab4e542000-2aab4e567000 r--s 00000000 00:53 470233                     /lib/x86_64-linux-gnu/libpthread-2.31.so
2aab4e567000-2aab4e739000 r--s 00000000 00:53 470188                     /lib/x86_64-linux-gnu/libc-2.31.so
555555556000-555555557000 r--p 00000000 00:53 2836918                    /usr/local/bin/ruby
555555557000-555555558000 r-xp 00001000 00:53 2836918                    /usr/local/bin/ruby
555555558000-555555559000 r--p 00002000 00:53 2836918                    /usr/local/bin/ruby
555555559000-55555555a000 r--p 00002000 00:53 2836918                    /usr/local/bin/ruby
55555555a000-55555555b000 rw-p 00003000 00:53 2836918                    /usr/local/bin/ruby
55555555b000-555559506000 rw-p 00000000 00:00 0
555559506000-55555986e000 rw-p 00000000 00:00 0
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]


qemu: uncaught target signal 6 (Aborted) - core dumped

@rosa
Copy link
Member

rosa commented Jan 22, 2025

Oh! @tahirwaseer, that's very interesting because you're hitting a different error (besides this one), that SEGFAULT in the supervisor. Do you know where that comes from exactly? It needs to come from either Ruby or a gem that uses C extensions. Wondering if the SEGFAULT is causing the other bug... 🤔

@tahirwaseer
Copy link

These are almost all the logs. other didn't look relevant. It keeps processing the jobs and after a while i get this. Haven't figured out the exact reason.

One more thing about the setup:

I've this in my puma.rb:
max_threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }

And I'm running four workers with three threads each which exceeds puma threads. Not sure if this could cause anything like that.

Got another one:

2aad22750000-2aad22760000 rw-p 00000000 00:00 0
2aad22770000-2aad22780000 rw-p 00000000 00:00 0
2aad22790000-2aad227a0000 rw-p 00000000 00:00 0
2aad227b0000-2aad227c0000 rw-p 00000000 00:00 0
2aad227d0000-2aad227e0000 rw-p 00000000 00:00 0
2aad227e7000-2aad2281a000 r--s 00000000 00:53 2836918 /usr/local/bin/ruby
2aad2281a000-2aad23992000 r--s 00000000 00:53 2964141 /usr/local/lib/libruby.so.3.2.0
2aad23992000-2aad239b7000 r--s 00000000 00:53 470233 /lib/x86_64-linux-gnu/libpthread-2.31.so
2aad239b7000-2aad23b89000 r--s 00000000 00:53 470188 /lib/x86_64-linux-gnu/libc-2.31.so
555555556000-555555557000 r--p 00000000 00:53 2836918 /usr/local/bin/ruby
555555557000-555555558000 r-xp 00001000 00:53 2836918 /usr/local/bin/ruby
555555558000-555555559000 r--p 00002000 00:53 2836918 /usr/local/bin/ruby
555555559000-55555555a000 r--p 00002000 00:53 2836918 /usr/local/bin/ruby
55555555a000-55555555b000 rw-p 00003000 00:53 2836918 /usr/local/bin/ruby
55555555b000-5555596d0000 rw-p 00000000 00:00 0
5555596d0000-555559ec0000 rw-p 00000000 00:00 0
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]

qemu: uncaught target signal 6 (Aborted) - core dumped
Time taken to download and set image for product_image f896458d-425f-4412-aed2-aa94b3c8847a: 4.836300586 seconds
Time taken to download and set image for product_image 5bf46c41-1170-485a-a0df-1e69c82600a6: 4.656792128 seconds
Time taken to download and set image for product_image 87420dce-35d8-49d6-8a1b-51f3c67176e6: 4.735127419 seconds
/usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:174:in handle_claimed_jobs_by': undefined method supervisees' for nil:NilClass (NoMethodError)

    if registered_process = process.supervisees.find_by(name: terminated_fork.name)
                                   ^^^^^^^^^^^^
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:166:in `block in replace_fork'
from /usr/local/bundle/gems/activesupport-7.1.2/lib/active_support/notifications.rb:206:in `block in instrument'
from /usr/local/bundle/gems/activesupport-7.1.2/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
from /usr/local/bundle/gems/activesupport-7.1.2/lib/active_support/notifications.rb:206:in `instrument'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue.rb:67:in `instrument'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:163:in `replace_fork'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:143:in `block in reap_and_replace_terminated_forks'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:139:in `loop'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:139:in `reap_and_replace_terminated_forks'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:67:in `block in supervise'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:60:in `loop'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:60:in `supervise'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:36:in `start'
from <internal:kernel>:90:in `tap'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/supervisor.rb:14:in `start'
from /usr/local/bundle/gems/solid_queue-1.1.0/lib/solid_queue/cli.rb:26:in `start'
from /usr/local/bundle/gems/thor-1.3.2/lib/thor/command.rb:28:in `run'
from /usr/local/bundle/gems/thor-1.3.2/lib/thor/invocation.rb:127:in `invoke_command'
from /usr/local/bundle/gems/thor-1.3.2/lib/thor.rb:538:in `dispatch'
from /usr/local/bundle/gems/thor-1.3.2/lib/thor/base.rb:584:in `start'
from bin/jobs:6:in `<main>'

@tahirwaseer
Copy link

tahirwaseer commented Jan 22, 2025

I'm running on latest solid_queue and have simplified the configuration.

  • increased max thread count on puma to 12
  • set worker threads to 8
  • running only one worker

The docs are not super clear but I think solid_queue needs two additional threads to run scheduler etc. other than the worker process. Which should correlate with puma threads. Now puma has enough capacity to handle everything. = ??

This is the most frequent error that I'm getting. something linked to concurrency.

Time taken to download and set image for product_image cc4e633c-ea2d-450a-81ad-bd885983c099: 2.324035917 seconds
Time taken to download and set image for product_image e78bb218-8ac0-4233-9fb8-6119bc418f7c: 2.790597126 seconds
Time taken to download and set image for product_image fc0af46d-cd3f-4df4-af64-c8ab21c42d51: 2.981870043 seconds
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48: [BUG] Segmentation fault at 0x0000000000000069
ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0012 p:---- s:0063 e:000062 CFUNC  :synchronize
c:0011 p:0016 s:0059 e:000058 METHOD /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48
c:0010 p:0014 s:0055 e:000054 METHOD /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:22
c:0009 p:0028 s:0047 e:000046 METHOD /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/ivar.rb:170
c:0008 p:0009 s:0038 e:000037 BLOCK  /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/future.rb:55
c:0007 p:0008 s:0035 e:000034 METHOD /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:359
c:0006 p:0041 s:0027 e:000026 BLOCK  /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:350 [FINISH]
c:0005 p:---- s:0021 e:000020 CFUNC  :loop
c:0004 p:0004 s:0017 e:000016 BLOCK  /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341 [FINISH]
c:0003 p:---- s:0014 e:000013 CFUNC  :catch
c:0002 p:0006 s:0009 e:000008 BLOCK  /usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340 [FINISH]
c:0001 p:---- s:0003 e:000002 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `block in create_worker'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `catch'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341:in `block (2 levels) in create_worker'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341:in `loop'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:350:in `block (3 levels) in create_worker'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:359:in `run_task'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/future.rb:55:in `block in execute'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/ivar.rb:170:in `safe_execute'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:22:in `execute'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'
/usr/local/bundle/gems/concurrent-ruby-1.3.4/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'

-- Machine register context ------------------------------------------------
 RIP: 0x00002aaaab5cab78 RBP: 0x0000555558619aa0 RSP: 0x00002aaaed57cc40
 RAX: 0x0000000000000061 RBX: 0x00002aaaf0c7ff40 RCX: 0x0000555559206420
 RDX: 0x0000000000000061 RDI: 0x00002aaaf0c7ff40 RSI: 0x0000555558619aa0
  R8: 0x00002aaabc59e048  R9: 0x0000000000000001 R10: 0x0000555556ca8b80
 R11: 0x0000000000000000 R12: 0x00002aaaf0c7ff50 R13: 0x00002aab2b385148
 R14: 0x00002aaaed67d910 R15: 0x00005555555a49e0 EFL: 0x0000000000000212

-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.2(rb_print_backtrace+0x11) [0x2aaaab63258f] vm_dump.c:785
/usr/local/lib/libruby.so.3.2(rb_vm_bugreport) vm_dump.c:1080
/usr/local/lib/libruby.so.3.2(rb_bug_for_fatal_signal+0xf0) [0x2aaaab42d8c0] error.c:813
/usr/local/lib/libruby.so.3.2(sigsegv+0x49) [0x2aaaab5847d9] signal.c:964
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x2aaaabb74140]
/usr/local/lib/libruby.so.3.2(thread_mutex_remove+0x1f) [0x2aaaab5cab78] thread_sync.c:224
/usr/local/lib/libruby.so.3.2(rb_mutex_unlock_th) thread_sync.c:474
/usr/local/lib/libruby.so.3.2(rb_mutex_unlock+0x4e) [0x2aaaab5cacee] thread_sync.c:443
/usr/local/lib/libruby.so.3.2(rb_ensure+0x171) [0x2aaaab437091] eval.c:1015
/usr/local/lib/libruby.so.3.2(vm_cfp_consistent_p+0x0) [0x2aaaab60772c] vm_insnhelper.c:3252
/usr/local/lib/libruby.so.3.2(vm_call_cfunc_with_frame) vm_insnhelper.c:3254
/usr/local/lib/libruby.so.3.2(vm_sendish+0xcc) [0x2aaaab60d35c] vm_insnhelper.c:5064
/usr/local/lib/libruby.so.3.2(vm_exec_core+0x1cf) [0x2aaaab6179bf] insns.def:801
/usr/local/lib/libruby.so.3.2(rb_vm_exec+0xbb) [0x2aaaab61ce9b] vm.c:2374
/usr/local/lib/libruby.so.3.2(invoke_block_from_c_bh+0x139) [0x2aaaab61e363] vm.c:1398
/usr/local/lib/libruby.so.3.2(vm_yield_with_cref) vm.c:1509
/usr/local/lib/libruby.so.3.2(vm_yield) vm.c:1517
/usr/local/lib/libruby.so.3.2(rb_yield_0) vm_eval.c:1348
/usr/local/lib/libruby.so.3.2(loop_i) vm_eval.c:1447
/usr/local/lib/libruby.so.3.2(rb_vrescue2+0x10b) [0x2aaaab436abb] eval.c:917
/usr/local/lib/libruby.so.3.2(rb_rescue2+0x8a) [0x2aaaab436d2a] eval.c:898
/usr/local/lib/libruby.so.3.2(vm_cfp_consistent_p+0x0) [0x2aaaab60772c] vm_insnhelper.c:3252
/usr/local/lib/libruby.so.3.2(vm_call_cfunc_with_frame) vm_insnhelper.c:3254
/usr/local/lib/libruby.so.3.2(vm_sendish+0xcc) [0x2aaaab60d35c] vm_insnhelper.c:5064
/usr/local/lib/libruby.so.3.2(vm_exec_core+0x1cf) [0x2aaaab6179bf] insns.def:801
/usr/local/lib/libruby.so.3.2(rb_vm_exec+0xbb) [0x2aaaab61ce9b] vm.c:2374
/usr/local/lib/libruby.so.3.2(catch_i+0x2bb) [0x2aaaab61e9db] vm.c:1398
/usr/local/lib/libruby.so.3.2(vm_catch_protect+0xe2) [0x2aaaab6102e2] vm_eval.c:2393
/usr/local/lib/libruby.so.3.2(rb_catch_obj+0x4d) [0x2aaaab61047d] vm_eval.c:2419
/usr/local/lib/libruby.so.3.2(vm_cfp_consistent_p+0x0) [0x2aaaab60772c] vm_insnhelper.c:3252
/usr/local/lib/libruby.so.3.2(vm_call_cfunc_with_frame) vm_insnhelper.c:3254
/usr/local/lib/libruby.so.3.2(vm_sendish+0xcc) [0x2aaaab60d35c] vm_insnhelper.c:5064
/usr/local/lib/libruby.so.3.2(vm_exec_core+0x1cf) [0x2aaaab6179bf] insns.def:801
/usr/local/lib/libruby.so.3.2(rb_vm_exec+0xbb) [0x2aaaab61ce9b] vm.c:2374
/usr/local/lib/libruby.so.3.2(rb_vm_invoke_proc+0x57) [0x2aaaab622397] vm.c:1603
/usr/local/lib/libruby.so.3.2(thread_do_start_proc+0x165) [0x2aaaab5d2385] thread.c:595
/usr/local/lib/libruby.so.3.2(thread_do_start+0x14) [0x2aaaab5d29f4] thread.c:614
/usr/local/lib/libruby.so.3.2(thread_start_func_2) thread.c:689
/usr/local/lib/libruby.so.3.2(thread_start_func_1+0xe8) [0x2aaaab5d3468] thread_pthread.c:1170
/lib/x86_64-linux-gnu/libpthread.so.0(start_thread+0xd7) [0x2aaaabb68ea7]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x2aaaabc7fa2f]

-- Other runtime information -----------------------------------------------

* Loaded script: solid-queue-worker(1.1.0): waiting for jobs in *

* Loaded features:

    0 enumerator.so
    1 thread.rb
    2 fiber.so
    3 rational.so
    4 complex.so
    5 ruby2_keywords.rb
    6 /usr/local/lib/ruby/3.2.0/x86_64-linux/enc/encdb.so
    7 /usr/local/lib/ruby/3.2.0/x86_64-linux/enc/trans/transdb.so
    8 /usr/local/lib/ruby/3.2.0/x86_64-linux/rbconfig.rb
    9 /usr/local/lib/ruby/3.2.0/rubygems/compatibility.rb
   10 /usr/local/lib/ruby/3.2.0/rubygems/defaults.rb
   11 /usr/local/lib/ruby/3.2.0/rubygems/deprecate.rb
   12 /usr/local/lib/ruby/3.2.0/rubygems/errors.rb
   13 /usr/local/lib/ruby/3.2.0/rubygems/unknown_command_spell_checker.rb
   14 /usr/local/lib/ruby/3.2.0/rubygems/exceptions.rb
   15 /usr/local/lib/ruby/3.2.0/rubygems/basic_specification.rb
   16 /usr/local/lib/ruby/3.2.0/rubygems/stub_specification.rb
   17 /usr/local/lib/ruby/3.2.0/rubygems/platform.rb
   18 /usr/local/lib/ruby/3.2.0/rubygems/util/list.rb
   19 /usr/local/lib/ruby/3.2.0/rubygems/version.rb
   20 /usr/local/lib/ruby/3.2.0/rubygems/requirement.rb
   21 /usr/local/lib/ruby/3.2.0/rubygems/specification.rb
   22 /usr/local/lib/ruby/3.2.0/rubygems/util.rb
   23 /usr/local/lib/ruby/3.2.0/rubygems/dependency.rb
   24 /usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_gem.rb
   25 /usr/local/lib/ruby/3.2.0/x86_64-linux/monitor.so
   26 /usr/local/lib/ruby/3.2.0/monitor.rb
   27 /usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb
   28 /usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_warn.rb
   29 /usr/local/lib/ruby/3.2.0/rubygems.rb
   30 /usr/local/lib/ruby/3.2.0/rubygems/path_support.rb
   31 /usr/local/lib/ruby/3.2.0/error_highlight/version.rb
   32 /usr/local/lib/ruby/3.2.0/error_highlight/base.rb
   33 /usr/local/lib/ruby/3.2.0/error_highlight/formatter.rb
   34 /usr/local/lib/ruby/3.2.0/error_highlight/core_ext.rb
   35 /usr/local/lib/ruby/3.2.0/error_highlight.rb
   36 /usr/local/lib/ruby/3.2.0/did_you_mean/version.rb
   37 /usr/local/lib/ruby/3.2.0/did_you_mean/core_ext/name_error.rb
   38 /usr/local/lib/ruby/3.2.0/did_you_mean/levenshtein.rb
   39 /usr/local/lib/ruby/3.2.0/did_you_mean/jaro_winkler.rb
   40 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checker.rb
   41 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/name_error_checkers/class_name_checker.rb
   42 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/name_error_checkers/variable_name_checker.rb
   43 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/name_error_checkers.rb
   44 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/method_name_checker.rb
   45 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/key_error_checker.rb
   46 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/null_checker.rb
   47 /usr/local/lib/ruby/3.2.0/did_you_mean/tree_spell_checker.rb
   48 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/require_path_checker.rb
   49 /usr/local/lib/ruby/3.2.0/did_you_mean/spell_checkers/pattern_key_name_checker.rb
   50 /usr/local/lib/ruby/3.2.0/did_you_mean/formatter.rb
   51 /usr/local/lib/ruby/3.2.0/did_you_mean.rb
   52 /usr/local/lib/ruby/3.2.0/syntax_suggest/core_ext.rb
   53 /usr/local/lib/ruby/3.2.0/rubygems/bundler_version_finder.rb
   54 /usr/local/lib/ruby/3.2.0/x86_64-linux/pathname.so
   55 /usr/local/lib/ruby/3.2.0/pathname.rb
   56 /usr/local/lib/ruby/3.2.0/bundler/version.rb
   57 /usr/local/lib/ruby/3.2.0/bundler/constants.rb
   58 /usr/local/lib/ruby/3.2.0/bundler/rubygems_integration.rb

2aab1842f000-2aab18462000 r--s 00000000 00:53 2836918                    /usr/local/bin/ruby
2aab18462000-2aab195da000 r--s 00000000 00:53 2964141                    /usr/local/lib/libruby.so.3.2.0
2aab195da000-2aab1a9e9000 rw-p 00000000 00:00 0
2aab1a9e9000-2aab1aa0e000 r--s 00000000 00:53 470233                     /lib/x86_64-linux-gnu/libpthread-2.31.so
2aab1aa0e000-2aab1abe0000 r--s 00000000 00:53 470188                     /lib/x86_64-linux-gnu/libc-2.31.so
555555556000-555555557000 r--p 00000000 00:53 2836918                    /usr/local/bin/ruby
555555557000-555555558000 r-xp 00001000 00:53 2836918                    /usr/local/bin/ruby
555555558000-555555559000 r--p 00002000 00:53 2836918                    /usr/local/bin/ruby
555555559000-55555555a000 r--p 00002000 00:53 2836918                    /usr/local/bin/ruby
55555555a000-55555555b000 rw-p 00003000 00:53 2836918                    /usr/local/bin/ruby
55555555b000-5555595fd000 rw-p 00000000 00:00 0
5555595fd000-5555596b7000 rw-p 00000000 00:00 0
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]


qemu: uncaught target signal 6 (Aborted) - core dumped
Time taken to download and set image for product_image 5da90ba8-245a-4d96-bc30-c987b8f0dd96: 9.167706615 seconds
Time taken to download and set image for product_image 7197028f-de77-40f9-8fd8-c90674d54e64: 9.36095599 seconds
Time taken to download and set image for product_image 6ba2e751-8ff8-49da-ae0d-18e4db7fb1a9: 9.379679782 seconds

@jakeonrails
Copy link

I am getting a similar crash. It's killing overmind - so my web server shuts down too.

I am not even processing any background jobs, just clicking around doing front-end dev work.

19:11:38 worker | /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:174:in `handle_claimed_jobs_by': undefined method `supervisees' for nil (NoMethodError)
19:11:38 worker |
19:11:38 worker |         if registered_process = process.supervisees.find_by(name: terminated_fork.name)
19:11:38 worker |                                        ^^^^^^^^^^^^
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:166:in `block in replace_fork'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/activesupport-8.0.1/lib/active_support/notifications.rb:210:in `block in instrument'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/activesupport-8.0.1/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/activesupport-8.0.1/lib/active_support/notifications.rb:210:in `instrument'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue.rb:67:in `instrument'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:163:in `replace_fork'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:143:in `block in reap_and_replace_terminated_forks'
19:11:38 worker |       from <internal:kernel>:187:in `loop'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:139:in `reap_and_replace_terminated_forks'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:67:in `block in supervise'
19:11:38 worker |       from <internal:kernel>:187:in `loop'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:60:in `supervise'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:36:in `start'
19:11:38 worker |       from <internal:kernel>:90:in `tap'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/supervisor.rb:14:in `start'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/solid_queue-1.1.2/lib/solid_queue/cli.rb:26:in `start'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor/command.rb:28:in `run'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor/invocation.rb:127:in `invoke_command'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor.rb:538:in `dispatch'
19:11:38 worker |       from /Users/jakemoffatt/.asdf/installs/ruby/3.3.1/lib/ruby/gems/3.3.0/gems/thor-1.3.2/lib/thor/base.rb:584:in `start'
19:11:38 worker |       from bin/jobs:6:in `<main>'

Vaguely I feel like this has happened during database migrations or when I have reseeded my database, but it's perplexing because this is a very small, brand new app with not a lot going on. Every once in a while I find myself having to start the server back up and it's because of this error.

@kofiasare-dev
Copy link

Image

Same here

@rosa
Copy link
Member

rosa commented Feb 3, 2025

Oh, @kofiasare-dev, thanks for that! That's interesting because you're running multiple supervisors in the same machine 🤔

Image

Did this happen when you woke up your computer after it had been suspended? I think the only way this could happen is if, somehow, the processes are suspended and fail to send heartbeats, then suddenly wake up and they're dead and so their registered processes get wiped up under them in an ungrateful way. This wouldn't happen in a real production setting.

@tahirwaseer
Copy link

just to confirm @rosa. I've been running the server and job processes on the same container. maybe that could explain it?

I can spin up a separate container for processing jobs and test there; that should replicate close to production env.

@rosa
Copy link
Member

rosa commented Feb 3, 2025

@tahirwaseer, no, that should be ok, what I mean is that in production usually your processes won't go "to sleep" like if you have them running in your computer that might be suspended after some time inactive 😅

@kofiasare-dev
Copy link

Oh, @kofiasare-dev, thanks for that! That's interesting because you're running multiple supervisors in the same machine 🤔

Image Did this happen when you woke up your computer after it had been suspended? I think the only way this could happen is if, somehow, the processes are suspended and fail to send heartbeats, then suddenly wake up and they're dead and so their registered processes get wiped up under them in an ungrateful way. This wouldn't happen in a real production setting.

Thanks @rosa.

Yes, this happened right after waking my computer from suspension. I agree this may not happen in production.
Any ideas on preventing this in development?"

@tahirwaseer
Copy link

@tahirwaseer, no, that should be ok, what I mean is that in production usually your processes won't go "to sleep" like if you have them running in your computer that might be suspended after some time inactive 😅

Yes, that is a likely explanation. Will be testing in staging env soon.

@davidalejandroaguilar
Copy link

davidalejandroaguilar commented Feb 8, 2025

For me, this happens when running Playwright tests via https://github.com/shakacode/cypress-playwright-on-rails, running with 1 worker and fullyParallel set to false. Yet running 1 test runs it in 3 different browsers, so it runs it 3 times.

I'm using DatabaseCleaner with the truncation strategy. If I exclude Solid Queue tables, I no longer get this error. e.g.

  DatabaseCleaner.strategy = [:truncation, except: %w[
    solid_queue_ready_executions
    solid_queue_recurring_tasks
    solid_queue_claimed_executions
    solid_queue_scheduled_executions
    solid_queue_pauses
    solid_queue_semaphores
    solid_queue_blocked_executions
    solid_queue_failed_executions
    solid_queue_jobs
    solid_queue_processes
  ]]

Haven't dug at all yet in the solid_queue source code, but I wonder if solid_queue_processes has something to do with it. e.g. registering the process, then getting truncated, then searching for it and not finding it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants