Failed job not retry #2263
Replies: 6 comments
-
When a job is going to be retried, it is moved to the delayed set. Check there to see whether the job is indeed delayed, and for how long. If it is not in the delayed set, then something else is going on, as it has not even been considered for retrying.
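One way to run that check from code, as a minimal sketch: BullMQ's Queue exposes getDelayed(), which returns the jobs currently in the delayed set. The JobLike and QueueLike interfaces and the findDelayedJob helper are hypothetical names, declared locally so the snippet stands alone; a real Queue instance should satisfy QueueLike structurally. The sketch only checks membership, not how long the job will stay delayed.

```typescript
// Structural subset of a BullMQ job and queue (hypothetical local types,
// declared here so the sketch does not need to import bullmq).
interface JobLike {
  id?: string;
  attemptsMade: number;
}

interface QueueLike {
  getDelayed(): Promise<JobLike[]>;
}

// Returns the job if it is currently in the delayed set (waiting for a retry),
// or undefined if it was never scheduled for one.
async function findDelayedJob(
  queue: QueueLike,
  jobId: string
): Promise<JobLike | undefined> {
  const delayed = await queue.getDelayed();
  return delayed.find((job) => job.id === jobId);
}
```

If this returns undefined for a job that has failed and still has attempts left, the job was most likely never handed back for a retry.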
-
It happened again in the last few hours: failed jobs are not retried. What I notice is that if I restart my workers, those jobs are retried immediately on startup and everything works. But the next time a new job fails, it happens again. @manast do you have any idea?
-
I think the issue is this: OptimalBits/bull#1766

This is my code:

import { Worker } from "bullmq";
import logger from "../../src/lib/logger";
import { IFetchOrderTrackingJob, QUEUES } from "../../src/lib/queues/types";
import { fetchOrderTrackingAndUpsert } from "../../src/utils/backend";

const worker = new Worker<IFetchOrderTrackingJob, boolean>(
  QUEUES.FETCH_ORDER_TRACKING_INFO,
  async (job) => {
    const { trackingNumber, orderId } = job.data;
    const result = await fetchOrderTrackingAndUpsert(orderId, trackingNumber);
    if (!result) {
      // Throwing marks the job as failed so that it can be retried.
      throw new Error(
        `[NEED_RETRY] Queue ${QUEUES.FETCH_ORDER_TRACKING_INFO}. Order Id: ${job.data.orderId}, Tracking Number: ${job.data.trackingNumber}`
      );
    }
    return result;
  },
  {
    connection: {
      host: process.env.REDIS_HOST,
      port: parseInt(process.env.REDIS_PORT || "6379"),
    },
  }
);

worker.on("completed", async (job, result, prev) => {
  logger.info(
    `[COMPLETED] Queue ${QUEUES.FETCH_ORDER_TRACKING_INFO}. Order Id: ${job.data.orderId}, Tracking Number: ${job.data.trackingNumber}, Result: ${result}`
  );
});

worker.on("failed", (job, error, prev) => {
  logger.info(
    `[FAILED] Queue ${QUEUES.FETCH_ORDER_TRACKING_INFO}. Order Id: ${job?.data.orderId}, Tracking Number: ${job?.data.trackingNumber}, Error message: ${error.message}`
  );
  logger.error(error.stack);
});

When the error is thrown, the job is never retried.
-
I noticed that jobs are stuck in the active state. I have 10 workers and there are 10 active jobs, but they never make progress and never finish.
-
I checked and can see that all of those jobs hang.
-
I found the root cause. I added a timeout to the HTTP request to the external API, and it's fine now. Thanks @manast
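For anyone hitting the same symptom, here is a rough sketch of that kind of fix. It is not the poster's actual code: withTimeout and Cancellable are hypothetical names, and the approach (an AbortController plus Promise.race) is just one common way to bound a request. The point is that a processor awaiting a request that never returns keeps its job active forever, whereas a rejected promise lets the job fail and be considered for a retry.

```typescript
// A task that accepts an AbortSignal so it can be cancelled (hypothetical type).
type Cancellable<T> = (signal: AbortSignal) => Promise<T>;

// Runs the task, rejecting if it has not settled within timeoutMs.
// The signal is aborted on timeout so that a cooperative task (for example
// an HTTP request started with this signal) can release its resources.
async function withTimeout<T>(
  task: Cancellable<T>,
  timeoutMs: number
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      // Reject first so the caller sees the timeout error, then abort the task.
      reject(new Error(`Operation timed out after ${timeoutMs} ms`));
      controller.abort();
    }, timeoutMs);
  });

  try {
    return await Promise.race([task(controller.signal), timeout]);
  } finally {
    // Always clear the timer so it does not keep the process alive.
    clearTimeout(timer);
  }
}
```

Inside a processor, a call such as withTimeout((signal) => fetch(url, { signal }), 10000) would reject after ten seconds instead of hanging, so the worker is freed and the job can fail normally.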
-
We're running Bull in production, where our jobs may fail. To ensure that a failed job is re-processed, below is our config:
But we noticed that our failed jobs are not getting retried.
These are the details of a failed job:
As you can see, processedOn=1699153204847, which is 2023-11-05T03:00:04.847Z. It is now 2023-11-05T14:53:19.046Z (more than 10 hours later) and attemptsMade is still 1. I checked my server log and confirmed that the job was not retried.
I'm looking for help since this is our production system and this is really serious. I'm using bullmq@^4.11.4.
Thanks, community!
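The timestamp arithmetic above can be double-checked with a few lines, using only the two values quoted in the post. The variable names are chosen for illustration.

```typescript
// processedOn from the failed job, in epoch milliseconds.
const processedOn = 1699153204847;
// The moment the job was inspected, as quoted in the post.
const observedAt = Date.parse("2023-11-05T14:53:19.046Z");

const processedIso = new Date(processedOn).toISOString();
const elapsedHours = (observedAt - processedOn) / 3600000;

console.log(processedIso); // 2023-11-05T03:00:04.847Z
console.log(elapsedHours.toFixed(2)); // 11.89
```

So the job had been sitting for almost 12 hours with attemptsMade still at 1.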