@michaely520
Contributor

What changed?

Scheduler Metrics Summary

Metrics Added

| Metric Name | Type | Description |
|---|---|---|
| scheduler_active_workers | Gauge | Number of active workers processing tasks in the scheduler |
| scheduler_assigned_workers | Gauge | Number of workers currently assigned to a task queue |
| scheduler_queue_depth | Gauge | Number of tasks waiting in the scheduler queue |
| scheduler_active_queues | Gauge | Number of active task queues |
| scheduler_queue_latency | Timer | Time a task spent waiting in the scheduler queue before execution |
| scheduler_tasks_submitted | Counter | Total number of tasks submitted to the scheduler |
| scheduler_tasks_completed | Counter | Total number of tasks completed by the scheduler |

Tags

All metrics are emitted with the following base tags:

| Tag | Values | Description |
|---|---|---|
| queue_type | replication | The type of queue (only replication is enabled) |
| scheduler_type | sequential_scheduler, iwrr | The scheduler implementation |
| priority | high, low | Task priority level |

Additional tags for IWRR scheduler metrics:

| Tag | Values | Description |
|---|---|---|
| channel_key | <cluster_name> | The source cluster name (for per-channel metrics) |
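
To make the tag scheme concrete, here is a minimal sketch of how the base tags and the optional channel_key tag could be assembled. The tag names and values come from the tables above; the helpers `schedulerTags` and `formatTags` are hypothetical and are not taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Tag names as listed in the PR description. Everything else in this file
// is an illustrative assumption, not the PR's actual code.
const (
	tagQueueType     = "queue_type"
	tagSchedulerType = "scheduler_type"
	tagPriority      = "priority"
	tagChannelKey    = "channel_key"
)

// schedulerTags returns the base tags, adding channel_key only when a
// source cluster name is supplied (the IWRR per-channel case).
func schedulerTags(schedulerType, priority, sourceCluster string) map[string]string {
	tags := map[string]string{
		tagQueueType:     "replication", // only replication is enabled
		tagSchedulerType: schedulerType,
		tagPriority:      priority,
	}
	if sourceCluster != "" {
		tags[tagChannelKey] = sourceCluster
	}
	return tags
}

// formatTags renders tags in sorted key order so the output is deterministic.
func formatTags(tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+tags[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(formatTags(schedulerTags("sequential_scheduler", "high", "")))
	fmt.Println(formatTags(schedulerTags("iwrr", "low", "cluster-a")))
}
```

Keeping channel_key optional means the sequential scheduler's series carry only the three base tags, while per-channel IWRR series add one extra dimension per source cluster.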

Scheduler Architecture

            ┌─────────────────────────┐
            │    IWRR Scheduler       │  ← queue_depth, active_queues, queue_latency
            │  (round-robin across    │    (per channel_key)
            │   source clusters)      │
            └───────────┬─────────────┘
                        │
                        ▼
            ┌─────────────────────────┐
            │  Sequential Scheduler   │  ← active_workers, assigned_workers,
            │  (per workflow ordering)│    queue_depth, active_queues, queue_latency,
            └─────────────────────────┘    tasks_submitted, tasks_completed

Metrics by Scheduler Type

SequentialScheduler emits:

  • scheduler_active_workers - total workers in the pool
  • scheduler_assigned_workers - workers currently processing a queue
  • scheduler_queue_depth - tasks in the dispatch channel
  • scheduler_active_queues - number of workflow-specific queues
  • scheduler_queue_latency - time from submit to execution start
  • scheduler_tasks_submitted - counter incremented on Submit()
  • scheduler_tasks_completed - counter incremented on Ack()/Nack()

InterleavedWeightedRoundRobinScheduler (IWRR) emits:

  • scheduler_queue_depth - per-channel (per source cluster)
  • scheduler_active_queues - total weighted channels
  • scheduler_queue_latency - time spent in IWRR channel before dispatch
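
As a rough illustration of the per-channel breakdown, the sketch below turns a map of per-source-cluster backlogs into gauge samples: one scheduler_queue_depth per channel_key plus a single scheduler_active_queues. The `gaugeSample` type and `channelSnapshot` function are hypothetical, and the sketch does not model the IWRR scheduling algorithm or channel weights.

```go
package main

import (
	"fmt"
	"sort"
)

// gaugeSample is one emitted gauge value with an optional channel_key tag.
// This type is an illustrative assumption, not part of the PR.
type gaugeSample struct {
	Name       string
	ChannelKey string // empty when the gauge is not per-channel
	Value      int
}

// channelSnapshot emits one scheduler_queue_depth sample per source cluster,
// followed by a single scheduler_active_queues sample for the channel count.
func channelSnapshot(backlog map[string]int) []gaugeSample {
	keys := make([]string, 0, len(backlog))
	for k := range backlog {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable order for readable output

	samples := make([]gaugeSample, 0, len(keys)+1)
	for _, k := range keys {
		samples = append(samples, gaugeSample{
			Name:       "scheduler_queue_depth",
			ChannelKey: k,
			Value:      backlog[k],
		})
	}
	samples = append(samples, gaugeSample{
		Name:  "scheduler_active_queues",
		Value: len(keys),
	})
	return samples
}

func main() {
	backlog := map[string]int{"cluster-a": 3, "cluster-b": 0}
	for _, s := range channelSnapshot(backlog) {
		fmt.Printf("%s channel_key=%q value=%d\n", s.Name, s.ChannelKey, s.Value)
	}
}
```

With depth tagged by channel_key, a backlog from one source cluster shows up as its own series instead of being averaged into a single queue depth.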

Why?

Tell your future self why you made these changes.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

Any change is risky. Identify all risks you are aware of. If none, remove this section.

@michaely520 changed the title from "Metrics wire q" to "Wire metrics into schedulers for replication" on Jan 21, 2026