
Node.js Cron Job Monitoring with node-cron and node-schedule

CronRadar · 8 min read

Your backup job ran fine for three weeks. Then it silently stopped, and you found out when a customer asked why their data wasn't there.

This is the fundamental problem with Node.js scheduled tasks: they fail without telling anyone. Both node-cron and node-schedule run entirely in-memory with no persistence, no alerting, and no visibility into whether jobs actually ran. When something breaks, the only signal is absence—and absence is easy to miss.

This guide covers how to build reliable cron jobs in Node.js. We'll look at how node-cron and node-schedule work, why they fail silently, and several approaches to monitoring—from simple logging to external services to building your own solution.

How node-cron and node-schedule Actually Work

Understanding the internals helps explain why these libraries fail the way they do.

node-cron runs a global timer that ticks every second. On each tick, it evaluates every registered cron expression to see if it matches the current time. If it matches, it runs your callback. The library stores tasks in an in-memory Map—nothing touches disk.

const cron = require('node-cron');

// Runs every hour at minute 0
cron.schedule('0 * * * *', () => {
  console.log('Hourly task running');
});

The API is minimal: schedule(), validate(), and task methods like start(), stop(), and destroy(). Version 4 added noOverlap to prevent concurrent executions and maxExecutions to limit total runs.
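Put together, the surface looks like this (a minimal sketch; syncInventory is a placeholder, and the two v4 options are the ones named above):

// validate() checks an expression without scheduling it
cron.validate('0 */2 * * *'); // true
cron.validate('not a cron');  // false

// v4 options as described above
const task = cron.schedule('*/5 * * * *', syncInventory, {
  noOverlap: true,    // skip a tick while the previous run is still going
  maxExecutions: 100  // stop after 100 runs
});

task.stop();    // pause
task.start();   // resume
task.destroy(); // remove the task from the internal Map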

node-schedule takes a different approach. Instead of checking every second, it calculates the next execution time and sets a single timer. This is more efficient when you have many jobs. It also supports Date objects and RecurrenceRule for complex scheduling:

const schedule = require('node-schedule');

// Run once at a specific date/time
schedule.scheduleJob(new Date('2025-12-25T10:00:00'), () => {
  console.log('Christmas notification');
});

// Use RecurrenceRule for complex patterns
const rule = new schedule.RecurrenceRule();
rule.dayOfWeek = [1, 3, 5]; // Mon, Wed, Fri
rule.hour = 14;
rule.minute = 30;
rule.tz = 'America/New_York';

schedule.scheduleJob(rule, () => {
  console.log('MWF 2:30 PM Eastern');
});

node-schedule also emits events you can hook into:

const job = schedule.scheduleJob('0 * * * *', myTask);

job.on('run', () => console.log('Started'));
job.on('success', () => console.log('Completed'));
job.on('error', (err) => console.error('Failed:', err));

Which to choose? node-cron is simpler and has slightly more downloads. node-schedule is better when you need date-based scheduling, the RecurrenceRule API, or lifecycle events. For basic recurring tasks with cron expressions, either works fine.

Why Node.js Cron Jobs Fail Silently

Both libraries share fundamental limitations that lead to silent failures.

Unhandled Async Errors

This catches most developers. Your callback throws, but you never see the error:

cron.schedule('0 9 * * *', async () => {
  await sendDailyReport(); // If this rejects, nothing is logged
});

The cron library catches synchronous exceptions, but async rejections escape into the void. Since Node.js 15, an unhandled rejection crashes the whole process; on earlier versions (or with certain flags), the rejection is swallowed with at most a console warning, and the schedule keeps ticking as if the run had succeeded.

The fix is explicit error handling in every job:

const Sentry = require('@sentry/node'); // assumes Sentry is initialized elsewhere

cron.schedule('0 9 * * *', async () => {
  try {
    await sendDailyReport();
  } catch (error) {
    console.error('Daily report failed:', error);
    // Send to error tracking service
    Sentry.captureException(error);
  }
});

Jobs Lost on Restart

When your Node.js process restarts—deploy, crash, server reboot—all scheduled jobs disappear. Jobs that should have run during downtime are skipped forever. There's no recovery mechanism because there's no persistence.

A deploy at 2:55 AM means your 3:00 AM backup job doesn't run. You won't know until someone checks.
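There's no recovery built in, but you can approximate one by persisting a last-run timestamp and checking it at startup. A minimal sketch (the state-file path and runBackup are placeholders):

const fs = require('fs');
const STATE_FILE = '/var/lib/myapp/last-backup-run.json'; // hypothetical path

async function runBackupTracked() {
  await runBackup(); // your actual job
  fs.writeFileSync(STATE_FILE, JSON.stringify({ lastRun: Date.now() }));
}

// On startup, catch up if the last run is older than the schedule interval
function catchUpIfMissed(intervalMs) {
  let lastRun = 0;
  try {
    lastRun = JSON.parse(fs.readFileSync(STATE_FILE, 'utf8')).lastRun;
  } catch {
    // no state yet - first run
  }
  if (Date.now() - lastRun > intervalMs) {
    runBackupTracked().catch(err => console.error('Catch-up run failed:', err));
  }
}

catchUpIfMissed(24 * 60 * 60 * 1000); // daily job
cron.schedule('0 3 * * *', runBackupTracked);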

Memory Leaks

node-cron has documented memory leaks in its task Map. If you're dynamically creating tasks (common in multi-tenant applications), memory grows unbounded. GitHub issues report 10MB/hour growth in affected versions.

The workaround uses named tasks that overwrite instead of accumulate:

// Each call overwrites the previous task
cron.schedule('* * * * *', task, { name: 'tenant-123-sync' });
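If tasks are created per tenant, it also helps to keep the returned handles and tear them down explicitly when a tenant goes away, rather than letting them accumulate. A sketch (syncTenant and the lifecycle hooks are placeholders):

const tenantTasks = new Map();

function startTenantSync(tenantId) {
  const task = cron.schedule('* * * * *', () => syncTenant(tenantId), {
    name: `tenant-${tenantId}-sync`
  });
  tenantTasks.set(tenantId, task);
}

function stopTenantSync(tenantId) {
  const task = tenantTasks.get(tenantId);
  if (task) {
    task.stop();
    task.destroy(); // release it from node-cron's internal Map
    tenantTasks.delete(tenantId);
  }
}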

Timezone and DST Problems

Daylight Saving Time breaks jobs scheduled between 1-3 AM:

  • Spring forward: 2:00-2:59 AM doesn't exist. Jobs scheduled then never run.
  • Fall back: 1:00-1:59 AM happens twice. Jobs may run twice or skip entirely.

Always set timezones explicitly:

cron.schedule('0 2 * * *', task, { timezone: 'UTC' });

// Or with node-schedule
const rule = new schedule.RecurrenceRule();
rule.hour = 2;
rule.tz = 'UTC';

For critical jobs, prefer UTC and times that avoid the DST transition window.

Job Overlap

A job scheduled every minute that takes 90 seconds creates overlapping executions. Resources get exhausted, database connections pile up, and race conditions emerge.

// node-cron v4+ has built-in protection
cron.schedule('* * * * *', longTask, { noOverlap: true });

// For node-schedule or older node-cron, track manually
let isRunning = false;

schedule.scheduleJob('* * * * *', async () => {
  if (isRunning) {
    console.warn('Previous execution still running, skipping');
    return;
  }
  isRunning = true;
  try {
    await longTask();
  } finally {
    isRunning = false;
  }
});

Multi-Instance Duplication

Scale to three servers, and your daily email sends three times. Neither library knows about other instances, so every instance schedules and runs every job independently.

Solutions include designating one instance as the cron runner, using distributed locks (Redis or your database), or moving to a queue system with built-in coordination; the Production Patterns section below shows the first two.

Approaches to Cron Job Monitoring

There's no single right answer. The best approach depends on your reliability requirements, infrastructure, and how much you want to build yourself.

Approach 1: Structured Logging

The simplest approach is logging job executions in a structured format you can query and alert on:

const crypto = require('crypto'); // for randomUUID()

function createLoggedJob(name, schedule, task) {
  cron.schedule(schedule, async () => {
    const startTime = Date.now();
    const executionId = crypto.randomUUID();
    
    console.log(JSON.stringify({
      event: 'cron_job_started',
      job: name,
      execution_id: executionId,
      scheduled_time: new Date().toISOString()
    }));
    
    try {
      await task();
      
      console.log(JSON.stringify({
        event: 'cron_job_completed',
        job: name,
        execution_id: executionId,
        duration_ms: Date.now() - startTime
      }));
      
    } catch (error) {
      console.error(JSON.stringify({
        event: 'cron_job_failed',
        job: name,
        execution_id: executionId,
        duration_ms: Date.now() - startTime,
        error: error.message,
        stack: error.stack
      }));
      throw error;
    }
  });
}

Then set up alerts in your logging platform (Datadog, CloudWatch, Loki) for:

  • cron_job_failed events
  • Missing cron_job_completed events within expected windows
  • Duration exceeding thresholds

Pros: Uses existing infrastructure, no external dependencies. Cons: Detecting missing events is harder than detecting failures. You need to build the "expected vs actual" logic yourself.

Approach 2: Heartbeat Monitoring Services

External monitoring services flip the model: instead of looking for failures, they expect success pings. If the ping doesn't arrive on schedule, they alert.

The pattern is the same across services:

const axios = require('axios');

cron.schedule('0 * * * *', async () => {
  try {
    await performBackup();
    
    // Ping on success
    await axios.get('https://monitoring-service.com/ping/your-job-id');
    
  } catch (error) {
    console.error('Backup failed:', error);
    // Don't ping - missing ping triggers alert
  }
});

Several services offer this model:

Healthchecks.io is open-source and self-hostable. Generous free tier (20 jobs), simple HTTP ping API. Good if you want to self-host or prefer open source.
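Healthchecks.io pings also accept /start and /fail suffixes, which lets it measure run duration and distinguish a failed run from one that never started. For example (the check UUID is a placeholder):

const HC_URL = 'https://hc-ping.com/your-check-uuid';

cron.schedule('0 * * * *', async () => {
  await axios.get(`${HC_URL}/start`); // marks the run as started
  try {
    await performBackup();
    await axios.get(HC_URL);           // success ping
  } catch (error) {
    await axios.get(`${HC_URL}/fail`); // explicit failure ping
    console.error('Backup failed:', error);
  }
});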

Cronitor offers more features: metrics, alerting integrations, a nice dashboard. Higher price point ($20+/month for meaningful usage).

Better Stack (formerly Logtail) combines uptime monitoring with heartbeats. Part of a larger observability platform.

CronRadar focuses specifically on cron monitoring with framework SDKs for Node.js, Python, Laravel, etc. Simple API, alerts via Slack/email/webhook.

Sentry Crons integrates with Sentry's error tracking. If you're already using Sentry, this is convenient:

const Sentry = require('@sentry/node');

Sentry.init({ dsn: 'your-dsn' });

cron.schedule('0 * * * *', async () => {
  await Sentry.withMonitor('hourly-backup', async () => {
    await performBackup();
  }, {
    schedule: { type: 'crontab', value: '0 * * * *' },
    timezone: 'UTC'
  });
});

Pros: Purpose-built for detecting missing executions. Quick setup, handles the hard parts. Cons: External dependency, ongoing cost, data leaving your infrastructure.

Approach 3: Build Your Own

For full control, build monitoring into your application:

const { Pool } = require('pg');
const pool = new Pool();

async function recordExecution(jobName, status, durationMs, error = null) {
  await pool.query(`
    INSERT INTO cron_executions (job_name, status, duration_ms, error, executed_at)
    VALUES ($1, $2, $3, $4, NOW())
  `, [jobName, status, durationMs, error]);

  // Keep the per-job summary current so checkMissedJobs has something to compare
  await pool.query(
    'UPDATE cron_jobs SET last_execution = NOW() WHERE job_name = $1',
    [jobName]
  );
}

async function checkMissedJobs() {
  // Find jobs that should have run but didn't
  const result = await pool.query(`
    SELECT job_name, expected_schedule, last_execution
    FROM cron_jobs
    WHERE last_execution < NOW() - (interval_minutes || ' minutes')::interval
  `);
  
  for (const job of result.rows) {
    await sendAlert(`Job ${job.job_name} missed expected execution`);
  }
}

// Run the checker itself on a schedule
cron.schedule('*/5 * * * *', checkMissedJobs);
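The snippet above assumes two tables. One possible shape for them, with column names matching the queries (a sketch, not a finished schema):

async function createMonitoringTables() {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS cron_jobs (
      job_name          text PRIMARY KEY,
      expected_schedule text NOT NULL,    -- cron expression, for display
      interval_minutes  integer NOT NULL, -- used by checkMissedJobs
      last_execution    timestamptz
    );
    CREATE TABLE IF NOT EXISTS cron_executions (
      id          bigserial PRIMARY KEY,
      job_name    text NOT NULL REFERENCES cron_jobs (job_name),
      status      text NOT NULL,          -- 'success' | 'failure'
      duration_ms integer,
      error       text,
      executed_at timestamptz NOT NULL DEFAULT NOW()
    );
  `);
}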

Pros: Full control, no external dependencies, integrates with your existing database. Cons: Significant implementation effort, you're responsible for reliability of the monitoring system itself.

Approach 4: Prometheus Metrics

If you're running Prometheus, expose cron job metrics:

const promClient = require('prom-client');

const jobDuration = new promClient.Histogram({
  name: 'cron_job_duration_seconds',
  help: 'Cron job execution duration',
  labelNames: ['job_name', 'status']
});

const jobLastSuccess = new promClient.Gauge({
  name: 'cron_job_last_success_timestamp',
  help: 'Timestamp of last successful execution',
  labelNames: ['job_name']
});

function createMetricsJob(name, schedule, task) {
  cron.schedule(schedule, async () => {
    const end = jobDuration.startTimer({ job_name: name });
    
    try {
      await task();
      end({ status: 'success' });
      jobLastSuccess.setToCurrentTime({ job_name: name });
    } catch (error) {
      end({ status: 'failure' });
      throw error;
    }
  });
}
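prom-client puts these metrics on its default registry, but they still have to be exposed over HTTP for Prometheus to scrape. A minimal sketch with Express (the port is an example):

const express = require('express');
const app = express();

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  res.end(await promClient.register.metrics());
});

app.listen(9100); // scrape target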

Then alert on time() - cron_job_last_success_timestamp > threshold:

# Prometheus alerting rule
- alert: CronJobMissed
  expr: time() - cron_job_last_success_timestamp > 7200
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Cron job {{ $labels.job_name }} hasn't run in 2+ hours"

Pros: Integrates with existing observability stack, rich alerting capabilities. Cons: Requires Prometheus infrastructure, more complex setup.

Production Patterns

Beyond monitoring, production cron jobs need proper error handling, graceful shutdown, and coordination across instances.

Graceful Shutdown

When your process receives SIGTERM, let running jobs finish before exiting:

const tasks = [];
let activeJobs = 0;
let isShuttingDown = false;

function registerJob(schedule, task) {
  const wrapped = async () => {
    if (isShuttingDown) return;
    activeJobs++;
    try {
      await task();
    } finally {
      activeJobs--;
    }
  };
  
  const scheduled = cron.schedule(schedule, wrapped);
  tasks.push(scheduled);
  return scheduled;
}

async function shutdown(signal) {
  console.log(`${signal} received, shutting down`);
  isShuttingDown = true;
  tasks.forEach(t => t.stop());
  
  // Wait up to 30s for active jobs
  const deadline = Date.now() + 30000;
  while (activeJobs > 0 && Date.now() < deadline) {
    await new Promise(r => setTimeout(r, 1000));
  }
  
  process.exit(activeJobs > 0 ? 1 : 0);
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));

node-schedule has this built in:

process.on('SIGTERM', async () => {
  await schedule.gracefulShutdown();
  process.exit(0);
});

Single-Instance Cron in Scaled Deployments

Designate one instance as the cron runner:

// Environment-based
if (process.env.RUN_CRON === 'true') {
  require('./cron-jobs');
}

// Or PM2 ecosystem.config.js
module.exports = {
  apps: [{
    name: 'app-cron',
    script: './server.js',
    instances: 1,
    env: { RUN_CRON: 'true' }
  }, {
    name: 'app-web',
    script: './server.js',
    instances: 'max',
    env: { RUN_CRON: 'false' }
  }]
};

For dynamic leader election, use Redis or your database:

const Redis = require('ioredis');
const redis = new Redis();

async function acquireLock(jobName, ttlSeconds) {
  const lockKey = `cron-lock:${jobName}`;
  const result = await redis.set(lockKey, process.pid, 'EX', ttlSeconds, 'NX');
  return result === 'OK';
}

cron.schedule('0 * * * *', async () => {
  const acquired = await acquireLock('hourly-job', 3600);
  if (!acquired) {
    console.log('Another instance has the lock, skipping');
    return;
  }
  
  await performJob();
});
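If you'd rather not run Redis, Postgres advisory locks provide the same coordination through the database you already have. A sketch using pg_try_advisory_lock and the pool from Approach 3 (performJob as above):

async function withPgLock(jobName, fn) {
  // Advisory locks are held per session, so keep one client for the whole job
  const client = await pool.connect();
  try {
    const { rows } = await client.query(
      'SELECT pg_try_advisory_lock(hashtext($1)) AS acquired',
      [jobName]
    );
    if (!rows[0].acquired) return; // another instance holds the lock
    try {
      await fn();
    } finally {
      await client.query('SELECT pg_advisory_unlock(hashtext($1))', [jobName]);
    }
  } finally {
    client.release();
  }
}

cron.schedule('0 * * * *', () => withPgLock('hourly-job', performJob));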

Retry with Backoff

For transient failures, retry before giving up:

async function withRetry(fn, maxAttempts = 3, baseDelayMs = 1000) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt === maxAttempts) throw error;
      
      const delay = baseDelayMs * Math.pow(2, attempt - 1);
      console.log(`Attempt ${attempt} failed, retrying in ${delay}ms`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}

cron.schedule('0 * * * *', async () => {
  await withRetry(async () => {
    await syncExternalData();
  });
});

When to Move Beyond Simple Cron

node-cron and node-schedule work well for lightweight, fire-and-forget tasks. Consider queue systems like BullMQ or Agenda when you need:

Persistence: Jobs survive restarts. Failed jobs can be retried later.

Distributed processing: Multiple workers coordinate automatically. No duplicate executions.

Visibility: Query job status, inspect queues, see failure reasons.

Backpressure: Control concurrency, implement rate limiting, prioritize urgent work.

// BullMQ example
const { Queue, Worker } = require('bullmq');
const IORedis = require('ioredis');

// BullMQ workers require a connection with maxRetriesPerRequest: null
const redis = new IORedis({ maxRetriesPerRequest: null });
const queue = new Queue('tasks', { connection: redis });

// Schedule recurring job
await queue.upsertJobScheduler('daily-report',
  { pattern: '0 9 * * *' },
  { name: 'generate-report' }
);

// Process jobs
const worker = new Worker('tasks', async (job) => {
  if (job.name === 'generate-report') {
    await generateReport();
  }
}, { connection: redis });

The tradeoff is complexity: you need Redis (BullMQ) or MongoDB (Agenda), plus the operational overhead of maintaining that infrastructure.

Testing Scheduled Tasks

Time-dependent code is hard to test. Two patterns help:

Separate Scheduling from Logic

// tasks/cleanup.js - pure logic, easy to test
const db = require('../db'); // whatever database client your app uses

async function cleanupOldRecords(olderThanDays = 30) {
  const cutoff = new Date();
  cutoff.setDate(cutoff.getDate() - olderThanDays);
  
  const result = await db.query(
    'DELETE FROM logs WHERE created_at < $1',
    [cutoff]
  );
  return result.rowCount;
}

module.exports = { cleanupOldRecords };

// cron.js - just wiring
const { cleanupOldRecords } = require('./tasks/cleanup');
cron.schedule('0 3 * * *', cleanupOldRecords);

Now test cleanupOldRecords directly without involving cron.
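For example, a plain unit test can stub the database module and assert on the query (a sketch using Jest; the module paths and mock shape are assumptions):

jest.mock('../db', () => ({
  query: jest.fn().mockResolvedValue({ rowCount: 42 })
}));

const db = require('../db');
const { cleanupOldRecords } = require('../tasks/cleanup');

test('deletes logs older than the cutoff', async () => {
  const deleted = await cleanupOldRecords(30);

  expect(deleted).toBe(42);
  expect(db.query).toHaveBeenCalledWith(
    'DELETE FROM logs WHERE created_at < $1',
    [expect.any(Date)]
  );
});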

Fake Timers for Schedule Testing

const sinon = require('sinon');
const cron = require('node-cron');

describe('Scheduled cleanup', () => {
  let clock;
  
  beforeEach(() => {
    clock = sinon.useFakeTimers(new Date('2025-01-01T00:00:00Z'));
  });
  
  afterEach(() => {
    clock.restore();
  });
  
  it('runs at 3 AM', async () => {
    let executed = false;
    cron.schedule('0 3 * * *', () => { executed = true; });
    
    // Advance to 3:00 AM
    await clock.tickAsync(3 * 60 * 60 * 1000);
    
    expect(executed).toBe(true);
  });
});

Summary

Node.js cron jobs fail silently because the underlying libraries are in-memory schedulers with no built-in monitoring. Unhandled async errors, process restarts, and timezone edge cases all cause jobs to stop running without any alert.

The fix is adding visibility through one of several approaches:

  • Structured logging with alerts on missing events works if you have good log infrastructure
  • Heartbeat monitoring services (Healthchecks.io, CronRadar, Sentry Crons, etc.) handle the hard parts of detecting missing executions
  • Prometheus metrics integrate with existing observability stacks
  • Custom solutions give full control at the cost of implementation effort

For production deployments, also implement graceful shutdown, single-instance coordination in scaled environments, and consider whether your reliability requirements warrant moving to a queue system.

The goal isn't perfection—it's knowing when something breaks before your users tell you.

Start monitoring your cron jobs

Get started in minutes. No credit card required.