The starting point
A typical research pipeline looks like this: load weights once, take all inputs upfront, run inference, and return the result.

The conversion
You don’t need to modify your pipeline code. Wrap it in a ReactorPipeline.
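As a framework-free sketch of the idea: a toy batch pipeline is wrapped by a generator that streams its frames one at a time. ReactorPipeline performs this wrapping for you (plus pacing and transport); the pipeline and frame values below are illustrative assumptions, not the real API.

```python
# Illustrative stand-ins, not the ReactorPipeline API.

def batch_pipeline(prompt, num_frames):
    # Stands in for a research pipeline: all inputs upfront, one batch out.
    return [f"{prompt}-frame-{i}" for i in range(num_frames)]

def stream_frames(pipeline, prompt, num_frames):
    # The wrapping idea: call the unmodified pipeline, then yield each
    # frame individually instead of returning the whole batch at once.
    for frame in pipeline(prompt, num_frames):
        yield frame

frames = list(stream_frames(batch_pipeline, "sunset", 3))
```

The pipeline function itself is untouched; only the delivery of its output changes.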
Write the model class
Load your pipeline in load() and drive it from inference(). Instead of collecting all frames and returning them at the end, yield each frame as it’s produced. If your model generates frames in batches, yield the entire batch in a single yield; the runtime will split it and emit each frame smoothly at the target rate.
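A minimal sketch of this pattern, with a toy model and a stand-in for the runtime's batch splitting. The method names load and inference follow the text; everything else is an illustrative assumption.

```python
class ToyModel:
    # Illustrative model, not the real base class.
    def load(self):
        # Load weights once; a placeholder here.
        self.weights = "loaded"

    def inference(self):
        # Yield frames as they are produced. A multi-frame step can
        # yield its whole batch in a single yield.
        yield ["frame-0", "frame-1"]   # one batch of two frames
        yield ["frame-2"]              # a later, smaller batch

def emit_frames(model):
    # Stand-in for the runtime: split each yielded batch and emit
    # frames one at a time (the real runtime also paces them).
    for batch in model.inference():
        for frame in batch:
            yield frame

m = ToyModel()
m.load()
frames = list(emit_frames(m))
```

Whether inference() yields single frames or batches, the consumer sees one steady per-frame stream.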
What changed
The original pipeline’s forward() call is unchanged. You’re just calling it from a different loop:
- Prompt is no longer passed once upfront. It comes from self.state.prompt, updated live by the client.
- Frames are no longer collected in memory. Each yield MyOutput(main_video=frame) streams the frame to the client immediately.
- No batch return. There’s no np.stack(frames) at the end. Frames are delivered as they’re generated.

If the generator finishes, the runtime starts it over.
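The live-state and restart behavior can be sketched as a loop that reads mutable state on every frame and recreates the generator when it finishes. All names below are illustrative stand-ins, not the runtime's internals.

```python
class State:
    # Stand-in for live client state such as self.state.prompt.
    def __init__(self, prompt):
        self.prompt = prompt

def inference(state):
    # Reads the prompt fresh on every frame, so client updates
    # take effect mid-stream.
    for i in range(2):
        yield f"{state.prompt}-{i}"

def run(state, total_frames):
    # Stand-in runtime loop: when the generator finishes, start it over.
    emitted = []
    gen = inference(state)
    while len(emitted) < total_frames:
        try:
            emitted.append(next(gen))
        except StopIteration:
            gen = inference(state)  # restart the finished generator
    return emitted

state = State("cat")
first = run(state, 2)    # ["cat-0", "cat-1"]
state.prompt = "dog"     # client updates the prompt live
second = run(state, 2)   # ["dog-0", "dog-1"]
```

Because the prompt is read through shared state rather than captured once, a mid-stream update changes the very next frame.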
Going further
The generator pattern covers most real-time models. For cases where you need full control over the execution loop, ReactorModel lets you handle connections, state, and frame emission directly while keeping the same tracks, events, and messaging.

Next
Overview
Back to the Runtime overview.
Quickstart
Revisit the quickstart for reference.