inference() is where your model produces output. It is a Python generator that the runtime calls automatically: you write the loop and yield frames, and the runtime handles timing and delivery to the client.

Basic pattern

def inference(self):
    while True:
        frame = self.pipe.forward(prompt=self.state.prompt)
        yield MyOutput(main_video=frame)
Each iteration:
  1. Read self.state for current parameters.
  2. Run your forward pass.
  3. yield an Output instance to send it to the client.
The runtime drives the generator, measures the compute time of each iteration, and emits frames at the target rate.
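To make "the runtime drives the generator" concrete, here is a minimal sketch of a driver loop in plain Python. It is not the actual runtime: `ToyModel`, `MyOutput`, and `drive` are hypothetical stand-ins, and the forward pass is replaced by a counter. It only illustrates that each `next()` call runs your code up to the next `yield`, which is also the natural place to time an iteration.

```python
import time
from dataclasses import dataclass
from typing import Any, Iterator, List, Tuple


@dataclass
class MyOutput:
    """Hypothetical stand-in for an Output subclass with one video track."""
    main_video: Any


class ToyModel:
    """Hypothetical model: the forward pass is replaced by a counter."""

    def __init__(self) -> None:
        self.prompt = "a red fox"
        self.calls = 0

    def inference(self) -> Iterator[MyOutput]:
        while True:
            self.calls += 1
            yield MyOutput(main_video=f"{self.prompt} #{self.calls}")


def drive(model: ToyModel, n_frames: int) -> List[Tuple[Any, float]]:
    """Pull n_frames from the generator and time each iteration."""
    gen = model.inference()
    results = []
    for _ in range(n_frames):
        start = time.perf_counter()
        output = next(gen)  # runs the model code up to its next yield
        elapsed = time.perf_counter() - start
        results.append((output.main_video, elapsed))
    gen.close()  # a driver would do something similar on disconnect
    return results


results = drive(ToyModel(), 3)
```

A real driver would also pace emission and deliver frames to the client; the sketch leaves both out.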

Sync vs async

inference() can be a regular generator or an async generator. Use an async generator when you need to await inside the loop, for example to send messages to the client.
async def inference(self):
    # Assumes self._step was initialized elsewhere (e.g. to 0 in __init__)
    while True:
        frame = self.pipe.forward(prompt=self.state.prompt)
        # ✅ Await to send messages to the client
        await self.send(Progress(step=self._step))
        self._step += 1
        yield MyOutput(main_video=frame)
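The difference between the two forms can be sketched with a toy driver that accepts either kind of generator. Everything here is hypothetical (`SyncModel`, `AsyncModel`, `take`, and the `send` stub are illustrative names, not runtime APIs); only `inspect.isasyncgen` and `asyncio` are standard library. The point is that an async generator is advanced with `await`, which is what allows `await self.send(...)` inside the loop.

```python
import asyncio
import inspect


class SyncModel:
    def inference(self):
        n = 0
        while True:
            n += 1
            yield f"sync-{n}"


class AsyncModel:
    def __init__(self):
        self.sent = []

    async def send(self, message):
        # Hypothetical stub for sending a message to the client.
        await asyncio.sleep(0)
        self.sent.append(message)

    async def inference(self):
        step = 0
        while True:
            await self.send({"step": step})  # awaiting is only legal in the async form
            step += 1
            yield f"async-{step}"


async def take(model, n):
    """Collect n items from either a sync or an async inference() generator."""
    gen = model.inference()
    out = []
    if inspect.isasyncgen(gen):
        for _ in range(n):
            out.append(await gen.__anext__())
        await gen.aclose()
    else:
        for _ in range(n):
            out.append(next(gen))
        gen.close()
    return out


sync_frames = asyncio.run(take(SyncModel(), 2))
async_model = AsyncModel()
async_frames = asyncio.run(take(async_model, 2))
```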

What to yield

Yield an Output instance with data for each track:
yield MyOutput(main_video=frame)
For single-track models, you can yield the raw frame directly:
yield frame

Batch yields

If your model generates frames in batches, yield the entire batch, shaped (N, H, W, 3), in a single yield. The runtime splits it into N frames and emits each one at the target rate.
def inference(self):
    while True:
        frames = self.pipe.forward(prompt=self.state.prompt)
        print(frames.shape)  # (N, H, W, 3)
        # ✅ Yield the full batch — the runtime splits it
        yield MyOutput(main_video=frames)
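As a rough illustration of what splitting a batch and emitting "at the target rate" could mean, the sketch below assigns each frame in a batch an emit time spaced 1/fps apart. This is an assumption about the general idea, not the runtime's actual scheduling logic; `schedule_batch`, `start_time`, and `fps` are hypothetical names. Plain strings stand in for frames; with a NumPy array of shape (N, H, W, 3), iterating over the first axis would yield the N individual (H, W, 3) frames in the same way.

```python
from typing import Any, List, Sequence, Tuple


def schedule_batch(
    frames: Sequence[Any], start_time: float, fps: float
) -> List[Tuple[float, Any]]:
    """Pair each frame of a batch with an emit time, spaced 1/fps apart."""
    interval = 1.0 / fps
    return [(start_time + i * interval, frame) for i, frame in enumerate(frames)]


# A batch of N = 4 placeholder frames, emitted at 4 frames per second.
batch = ["frame0", "frame1", "frame2", "frame3"]
schedule = schedule_batch(batch, start_time=10.0, fps=4.0)
```

Yielding the whole batch keeps this pacing in one place, so the model code does not need its own timers.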

Idling

Yield Idle or None to skip an iteration without emitting anything:
from reactor_runtime.interface import Idle

def inference(self):
    # Wait for the user to set a prompt
    while not self.state.prompt:
        # ✅ Skip this iteration — no frame emitted
        yield Idle
    while True:
        frame = self.pipe.forward(prompt=self.state.prompt)
        yield MyOutput(main_video=frame)
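The idling pattern can be tried out without the runtime using a toy driver. In this sketch, `Idle` is a local stand-in for `reactor_runtime.interface.Idle`, and `State`, `ToyModel`, and `step` are hypothetical; the driver simply treats `Idle` or `None` as "nothing to emit this iteration".

```python
class Idle:
    """Local stand-in for reactor_runtime.interface.Idle (used as a sentinel)."""


class State:
    def __init__(self):
        self.prompt = ""


class ToyModel:
    def __init__(self):
        self.state = State()

    def inference(self):
        # Idle until a prompt has been set.
        while not self.state.prompt:
            yield Idle
        while True:
            yield f"frame for {self.state.prompt!r}"


def step(gen):
    """Advance one iteration; return the frame, or None if the model idled."""
    item = next(gen)
    if item is Idle or item is None:
        return None
    return item


model = ToyModel()
gen = model.inference()
emitted = [step(gen), step(gen)]  # no prompt yet: both iterations idle
model.state.prompt = "sunset"     # simulate the user setting a prompt
emitted.append(step(gen))
```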

Lifecycle

The runtime manages the generator’s lifetime automatically:
  • Created when a client connects, after @connected fires.
  • Closed when the client disconnects. State is reset and buffers are flushed.
  • Restarted if it returns early (e.g. a finite loop ends). The runtime creates a new generator on the same model instance as long as the client is still connected.
A fresh self.state is created for each connection, so every client starts with clean defaults.
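The restart rule can be sketched with a toy session loop. This is a simplified model of the behavior described above, not the runtime's implementation: `ToyModel` and `run_session` are hypothetical, and the `max_frames` limit stands in for "the client is still connected".

```python
class ToyModel:
    def __init__(self):
        self.starts = 0

    def inference(self):
        self.starts += 1
        # A finite loop: this generator returns after two frames.
        for i in range(2):
            yield f"run{self.starts}-frame{i}"


def run_session(model, max_frames):
    """Collect max_frames, recreating the generator whenever it returns."""
    frames = []
    gen = model.inference()
    while len(frames) < max_frames:  # stands in for "client still connected"
        try:
            frames.append(next(gen))
        except StopIteration:
            gen = model.inference()  # restart on the same model instance
    gen.close()  # stands in for closing the generator on disconnect
    return frames


model = ToyModel()
frames = run_session(model, 5)
```

Because the restart reuses the same model instance, attributes set on `self` (such as `starts` here) persist across restarts within the sketch.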

State consistency

Events are dispatched only between yield points. While your generator is running an iteration (reading state, doing a forward pass, awaiting I/O), no event handler can mutate self.state. You can safely read state multiple times within an iteration without worrying about concurrent changes.
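One way to picture this guarantee is a driver that queues incoming events and applies them only while the generator is suspended at a yield. The sketch below is a single-threaded toy under that assumption; `State`, `ToyModel`, `advance`, and the `(name, value)` event format are all hypothetical and do not reflect how the runtime represents events.

```python
from collections import deque


class State:
    def __init__(self):
        self.prompt = "cat"


class ToyModel:
    def __init__(self):
        self.state = State()
        self.seen = []

    def inference(self):
        while True:
            first = self.state.prompt   # read state at the start of the iteration
            # ... a forward pass would run here ...
            second = self.state.prompt  # read it again at the end
            self.seen.append((first, second))
            yield f"frame:{second}"


def advance(model, gen, pending):
    """Apply queued events while the generator is suspended, then run one iteration."""
    while pending:
        name, value = pending.popleft()
        setattr(model.state, name, value)
    return next(gen)


model = ToyModel()
pending = deque()
gen = model.inference()
frames = [advance(model, gen, pending)]
pending.append(("prompt", "dog"))  # event arrives; applied before the next iteration
frames.append(advance(model, gen, pending))
```

Within each iteration both reads of `self.state.prompt` agree, because the queued event is applied only between iterations.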

Next

Interactive State

Add richer parameters with validation and custom logic.

Events & Messages

Custom events, lifecycle hooks, and outbound messages.