DEV: Log a warning message when a MiniScheduler scheduled job is stuck (#28258)

This commit adds a `MiniSchedulerLongRunningJobLogger` class which will
poll every 60 seconds for mini_scheduler jobs which are stuck. When it
detects that a job is stuck, it will log a warning message with the
current backtrace of the thread that is executing the job.
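
The polling approach described above can be sketched roughly as follows. This is a minimal illustration and not the class added by this commit: `StuckJobPoller`, its `running_jobs` callable, and the job hash keys (`:name`, `:thread`, `:started_at`, `:threshold_seconds`) are hypothetical names chosen for the example. Only `Thread#backtrace` and the standard library `Logger` are real Ruby APIs here.

```ruby
require "logger"

# Hypothetical sketch; not the MiniSchedulerLongRunningJobLogger from this commit.
class StuckJobPoller
  def initialize(running_jobs:, logger:, poll_interval_seconds: 60)
    @running_jobs = running_jobs # callable returning an array of job hashes (assumed shape)
    @logger = logger
    @poll_interval_seconds = poll_interval_seconds
  end

  # Inspect the running jobs once and warn about each one past its threshold.
  # Returns the names of the jobs that were reported.
  def poll_once(now: Time.now)
    @running_jobs.call.filter_map do |job|
      elapsed = now - job[:started_at]
      next if elapsed <= job[:threshold_seconds]

      backtrace = job[:thread].backtrace || [] # backtrace is nil for a dead thread
      @logger.warn(
        "Job #{job[:name]} has been running for #{elapsed.round}s:\n#{backtrace.join("\n")}",
      )
      job[:name]
    end
  end

  # Run poll_once in a background thread until stop is called.
  def start
    @thread ||= Thread.new do
      loop do
        poll_once
        sleep @poll_interval_seconds
      end
    end
  end

  def stop
    @thread&.kill
    @thread = nil
  end
end
```

Keeping the check in `poll_once` separate from the loop in `start` means the detection logic can be exercised without waiting for a real polling interval.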

Note that for scheduled jobs which are executed at a frequency of less
than 30 minutes, we will log when the job has been executing for 30
minutes.

For scheduled jobs executed at a frequency between 30 minutes and 2
hours, we will log when the job has been executing for a duration
greater than its specified frequency.

For scheduled jobs executed at a frequency greater than 2 hours, we will
log once the job has been executing for more than 2 hours.
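
Taken together, the three rules above amount to clamping the job's frequency to the range of 30 minutes to 2 hours. A minimal sketch of that calculation, assuming frequencies are expressed in seconds; the helper and constant names are hypothetical and not taken from this commit:

```ruby
# Hypothetical helper; the names and constants are illustrative assumptions.
MIN_WARNING_SECONDS = 30 * 60      # 30 minutes
MAX_WARNING_SECONDS = 2 * 60 * 60  # 2 hours

# Returns how long (in seconds) a job may run before a warning is logged,
# given how often the job is scheduled to run (in seconds).
def warning_threshold_seconds(frequency_seconds)
  frequency_seconds.clamp(MIN_WARNING_SECONDS, MAX_WARNING_SECONDS)
end

warning_threshold_seconds(10 * 60)      # 10-minute job => 1800 (30 minutes)
warning_threshold_seconds(60 * 60)      # hourly job    => 3600 (its own frequency)
warning_threshold_seconds(24 * 60 * 60) # daily job     => 7200 (2 hours)
```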

Author: Alan Guo Xiang Tan
Date: 2024-08-08 12:20:16 +08:00
Committed by: GitHub
Parent: 814d2e6286
Commit: 4c0af24173
4 changed files with 231 additions and 1 deletion

@@ -3,6 +3,7 @@
 require "sidekiq/pausable"
 require "sidekiq_logster_reporter"
 require "sidekiq_long_running_job_logger"
+require "mini_scheduler_long_running_job_logger"
 
 Sidekiq.configure_client { |config| config.redis = Discourse.sidekiq_redis_config }
 
@@ -47,6 +48,11 @@ if Sidekiq.server?
     if !scheduler_hostname || scheduler_hostname.split(",").include?(Discourse.os_hostname)
       begin
         MiniScheduler.start(workers: GlobalSetting.mini_scheduler_workers)
+
+        MiniSchedulerLongRunningJobLogger.new(
+          poll_interval_seconds:
+            ENV["DISCOURSE_MINI_SCHEDULER_LONG_RUNNING_JOB_LOGGER_POLL_INTERVAL_SECONDS"],
+        ).start
       rescue MiniScheduler::DistributedMutex::Timeout
         sleep 5
         retry