
ORK

Internals of the @tetherto/mdk-ork kernel: modules, the pull-only model, and system recovery

@tetherto/mdk-ork is the trusted coordination layer of the stack. It splits internal responsibilities across single-purpose modules, each with its own state machine, so that each domain can evolve independently of the others.

The design is inspired by Kubernetes: a pull-only model bounds the pace of execution so the kernel cannot be overwhelmed by upstream pressure.
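To illustrate why pulling bounds the pace of execution, here is a minimal sketch. It is not taken from the @tetherto/mdk-ork source: `UpstreamQueue`, `PullingKernel`, `batchSize`, and `tick` are hypothetical names chosen for the example. The point it shows is that producers can enqueue any amount of work, but the consumer only ever takes a fixed-size batch when it chooses to.

```typescript
// Hypothetical sketch of a pull-only consumer; none of these names
// come from @tetherto/mdk-ork.
type Task = { id: number };

class UpstreamQueue {
  private items: Task[] = [];

  // Producers may enqueue as fast as they like; nothing is pushed to the consumer.
  enqueue(task: Task): void {
    this.items.push(task);
  }

  // The consumer decides when to pull and how many items to take.
  pull(max: number): Task[] {
    return this.items.splice(0, max);
  }

  get backlog(): number {
    return this.items.length;
  }
}

class PullingKernel {
  readonly processed: number[] = [];
  private source: UpstreamQueue;
  private batchSize: number;

  constructor(source: UpstreamQueue, batchSize: number) {
    this.source = source;
    this.batchSize = batchSize;
  }

  // One unit of work: take at most batchSize tasks, however large the backlog is.
  tick(): number {
    const batch = this.source.pull(this.batchSize);
    for (const task of batch) {
      this.processed.push(task.id);
    }
    return batch.length;
  }
}

const queue = new UpstreamQueue();
for (let i = 0; i < 10; i++) {
  queue.enqueue({ id: i }); // a burst of upstream pressure
}

const kernel = new PullingKernel(queue, 3);
const handled = kernel.tick();
console.log(handled, queue.backlog); // 3 7
```

The burst of ten tasks stays in the queue until the consumer asks for it, so the work done per tick is capped by `batchSize` rather than by the arrival rate.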

Module overview

Module catalogue

@tetherto/mdk-ork's coordination splits across single-purpose modules. Each owns its own state machine, persistence boundary, and scaling characteristics. Six modules ship in v0.0.1; two more are deferred to a later release.

| Module | Role |
| --- | --- |
| Command Dispatcher | Validates and resolves the destination Worker for incoming commands. |
| Command State Machine | Tracks command lifecycle from QUEUED to SUCCESS or FAILED. |
| Worker Registry | Authoritative lookup of active Workers, their RPC keys, and managed devices. |
| Telemetry Collector | Stateless proxy between callers and Worker-local telemetry stores. |
| Scheduler | System metronome; drives all interval-based pulls (telemetry, state, health). |
| Health Monitor | Liveness probes against Workers; reports status to the Registry. |
| Fault Supervisor (deferred) | Circuit-breaker patterns to contain cascading failures. |
| Concurrency Manager (deferred) | Per-device locking and queue-depth limits. |
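As a rough illustration of the lifecycle the Command State Machine tracks, the sketch below models a command moving from QUEUED to a terminal state. The state names QUEUED, EXECUTING, SUCCESS, and FAILED appear on this page; the transition table, the `ALLOWED` map, and the `transition` function are assumptions made for the example, and the actual rules live in the ORK modules reference.

```typescript
// Hypothetical transition table. The state names come from this page;
// the allowed edges are an assumption, not the documented rules.
type CommandState = "QUEUED" | "EXECUTING" | "SUCCESS" | "FAILED";

const ALLOWED: Record<CommandState, CommandState[]> = {
  QUEUED: ["EXECUTING", "FAILED"],
  EXECUTING: ["SUCCESS", "FAILED"],
  SUCCESS: [], // terminal
  FAILED: [], // terminal
};

// Returns the new state, or throws if the edge is not in the table.
function transition(from: CommandState, to: CommandState): CommandState {
  if (ALLOWED[from].indexOf(to) === -1) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}

let state: CommandState = "QUEUED";
state = transition(state, "EXECUTING");
state = transition(state, "SUCCESS");
console.log(state); // SUCCESS
```

Keeping the allowed edges in a single table makes illegal moves, such as leaving a terminal state, fail loudly instead of silently corrupting the lifecycle.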

For the full state machines, transition rules, interface signatures, and recovery details, see the ORK modules reference.

System recovery

On a full system crash and restart, @tetherto/mdk-ork modules orchestrate recovery without user intervention:

  1. Worker Registry loads last known Worker and device states from Hyperbee.
  2. Command State Machine sweeps the WAL for stranded EXECUTING tasks and forces them to time out or retry.
  3. Health Monitor immediately pings Workers to verify which are still active.
  4. Connections: the network layer awaits incoming HRPC reconnect storms from persistent Workers.
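The WAL sweep in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the kernel's implementation: the `WalEntry` shape, the `attempts` counter, the `maxAttempts` limit, and the choice to re-queue a command for retry or mark it FAILED on timeout are all hypothetical.

```typescript
// Hypothetical WAL record; the real entry format is not described on this page.
interface WalEntry {
  id: string;
  state: "QUEUED" | "EXECUTING" | "SUCCESS" | "FAILED";
  attempts: number; // executions started so far (assumed field)
}

// After a restart, any entry still marked EXECUTING is stranded: the process
// that was running it is gone. Re-queue it if attempts remain, else fail it.
function sweepStranded(entries: WalEntry[], maxAttempts: number): WalEntry[] {
  return entries.map((entry): WalEntry => {
    if (entry.state !== "EXECUTING") {
      return entry; // settled or not yet started: leave untouched
    }
    return entry.attempts < maxAttempts
      ? { ...entry, state: "QUEUED" } // retry
      : { ...entry, state: "FAILED" }; // treat as timed out
  });
}

const wal: WalEntry[] = [
  { id: "a", state: "EXECUTING", attempts: 1 },
  { id: "b", state: "EXECUTING", attempts: 3 },
  { id: "c", state: "SUCCESS", attempts: 1 },
  { id: "d", state: "QUEUED", attempts: 0 },
];

const swept = sweepStranded(wal, 3);
console.log(swept.map((e) => `${e.id}:${e.state}`).join(" "));
// a:QUEUED b:FAILED c:SUCCESS d:QUEUED
```

The sweep returns new records rather than mutating the input, so the pre-crash view of the log stays available for comparison or auditing.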

Recovery is local and predictable. Worker crashes do not bring down the runtime; supervisors (PM2, Docker, Kubernetes) handle process restarts in multi-process deployments, and Workers rejoin the system after recovery.

