Worker just stop randomly #510

Open
hirogeek opened this issue Feb 5, 2025 · 1 comment

hirogeek commented Feb 5, 2025

Hi,

Since I moved my app to a new server, solid_queue stops working for no apparent reason.

If I run /bin/systemctl --user status MyApp_solid_queue_production, I get:

solid_queue:status
01 /bin/systemctl --user status MyApp_solid_queue_production
01 ● MyApp_solid_queue_production.service - SolidQueue background job
01 Loaded: loaded (/home/ubuntu/.config/systemd/user/MyApp_solid_queue_production.service; enabled; preset: enabled)
01 Active: active (running) since Tue 2025-02-04 07:05:35 CET; 668ms ago
01 Main PID: 1083322 (bundle)
01 Tasks: 2 (limit: 9488)
01 Memory: 35.3M (peak: 35.3M)
01 CPU: 668ms
01 CGroup: /user.slice/user-1000.slice/[email protected]/app.slice/MyApp_solid_queue_production.service
01 └─1083322 "/home/ubuntu/MyApp/shared/bundle/ruby/3.4.0/bin/rake solid_queue:start"
01
01 Feb 04 07:05:35 ov-3d7871 systemd[1083308]: Started MyApp_solid_queue_production.service - SolidQueue background job.
✔ 01 [email protected] 0.134s

The service seems OK, but there is no supervisor, dispatcher, or worker running. Sometimes I have to start it 10 times and the worker only runs for 1 minute; sometimes the worker runs for more than 3 hours.

I get nothing in solid_queue.log or in my production log.

Any ideas? Or how can I get more logging?

My config:

default: &default
  dispatchers:
    - polling_interval: 1
      batch_size: 500
  workers:
    - queues: "*"
      threads: 3
      processes: <%= ENV.fetch("JOB_CONCURRENCY", 1) %>
      polling_interval: 0.1

My service is:

[Unit]
Description=SolidQueue background job
After=syslog.target network.target

[Service]
Type=simple
Environment='RAILS_ENV=production'
Environment='RAILS_ENV=production'
Environment='RBENV_ROOT=$HOME/.rbenv'
Environment='RBENV_VERSION=3.4.1'
WorkingDirectory=/home/ubuntu/Logytask/current
ExecStart=/home/ubuntu/.rbenv/shims/bundle exec rake solid_queue:start
ExecReload=/bin/kill -TSTP $MAINPID
ExecStop=/bin/kill -TERM $MAINPID
Environment=MALLOC_ARENA_MAX=2

RestartSec=1
Restart=on-failure

StandardOutput=append:/home/ubuntu/Logytask/shared/log/solid_queue.log
StandardError=append:/home/ubuntu/Logytask/shared/log/solid_queue.log

SyslogIdentifier=Logytask_solid_queue_production

[Install]
WantedBy=default.target

I use Ruby 3.4.1, SolidQueue 1.1.3, Rails 8.0.1
I have config.solid_queue.silence_polling = false

yoka commented Feb 6, 2025

Edit: here are some more findings after doing more backtracking on this:

TL;DR: try removing the "exec" part of the command that runs the Rails process (or of any command, like foreman, that runs the Rails process).

When running in Docker (at least in an Alpine-based container) under sh (not sure whether this also happens in bash), starting the Rails process with anything that uses the exec command makes the planets align in such a way that your whole Rails service might get shut down by a seemingly unrelated process you run in the container.
So running anything like tailwindcss watchers/builders, ferrum, or specific test runners like cucumber can trigger this shutdown scenario. It might just look Solid Queue related, because Solid Queue will definitely inform you by writing "I'm going on vacay now bye" to the logs.
It can even happen when doing docker compose exec into the Rails container service. Without knowing much about Docker's internals, I assume it goes somewhat along these lines: it forks the "main" service process, and if that was started with exec then it is the Rails process, and signals somehow end up in some of its traps.

Just check that whatever you start the rails command with does not use exec to do so. For example, the default bin/dev does this. Remove the exec part of the command and see if it helps.
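To make "remove the exec part" concrete, here is a rough before/after sketch of a launcher script. The function name start_server and the commands passed to it are placeholders I made up for illustration; your actual bin/dev (or Procfile runner) may look different.

```shell
#!/bin/sh
# Hypothetical launcher sketch; "start_server" and the commands passed to it
# are placeholders, not the contents of any real bin/dev.

# Before: the launcher shell is replaced by the command.
#   exec "$@"

# After: the command runs as a child and the launcher shell stays its parent.
start_server() {
  "$@"                 # plain invocation, no exec
  status=$?            # remember how the child exited
  echo "child exited with status $status" >&2
  return "$status"     # pass the exit status on to the caller
}

# Usage (placeholder command):
#   start_server bin/rails server
```

The exit status is passed through so that whatever supervises the launcher still sees a failure when the child fails.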

Of course, this means losing the optimization of not having an extra process running (as exec replaces the original one), but frankly, having a small shell process sitting somewhere in memory is not that big of a deal.

I hope this points people in the right direction. This was a really frustrating issue to figure out.
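If you want to see the "exec replaces the original one" part for yourself, a small probe like the following should show it. It is only a sketch: the variable names are mine, and it assumes a POSIX sh is on the PATH. Each probe prints the wrapper shell's PID followed by the PID of the command it starts.

```shell
#!/bin/sh
# Each probe prints two PIDs: the wrapper shell's, then the inner shell's.
# The trailing ":" stops shells that silently exec the last command of a
# -c string from doing so, which would hide the difference.
with_exec=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')
without_exec=$(sh -c 'echo "$$"; sh -c "echo \$\$"; :')

# Unquoted expansion collapses each two-line result onto one line.
echo "with exec:    $(echo $with_exec)"
echo "without exec: $(echo $without_exec)"
```

With exec, both PIDs on the line should be identical, because the inner command took over the wrapper's process. Without exec, they should differ, because the wrapper stays around as the parent.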
