The starting point
A typical research pipeline looks like this: load weights once, take all inputs upfront, run inference, and return the result.

The conversion
You don’t need to modify your pipeline code. Wrap it in a ReactorPipeline.
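As a framework-free sketch of the idea: a toy batch pipeline is wrapped by a generator that streams its frames one at a time. ReactorPipeline performs this wrapping for you (plus pacing and transport); the pipeline and frame values below are illustrative assumptions, not the real API.

```python
# Illustrative stand-ins, not the ReactorPipeline API.

def batch_pipeline(prompt, num_frames):
    # Stands in for a research pipeline: all inputs upfront, one batch out.
    return [f"{prompt}-frame-{i}" for i in range(num_frames)]

def stream_frames(pipeline, prompt, num_frames):
    # The wrapping idea: call the unmodified pipeline, then yield each
    # frame individually instead of returning the whole batch at once.
    for frame in pipeline(prompt, num_frames):
        yield frame

frames = list(stream_frames(batch_pipeline, "sunset", 3))
```

The pipeline function itself is untouched; only the delivery of its output changes.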
Write the model class
Load your pipeline in load() and drive it from inference(). Instead of collecting all frames and returning them at the end, yield each frame as it’s produced. If your model generates frames in batches, yield the entire batch in a single yield; the runtime will split it and emit each frame smoothly at the target rate.
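A minimal sketch of this pattern, with a toy model and a stand-in for the runtime's batch splitting. The method names load and inference follow the text; everything else is an illustrative assumption.

```python
class ToyModel:
    # Illustrative model, not the real base class.
    def load(self):
        # Load weights once; a placeholder here.
        self.weights = "loaded"

    def inference(self):
        # Yield frames as they are produced. A multi-frame step can
        # yield its whole batch in a single yield.
        yield ["frame-0", "frame-1"]   # one batch of two frames
        yield ["frame-2"]              # a later, smaller batch

def emit_frames(model):
    # Stand-in for the runtime: split each yielded batch and emit
    # frames one at a time (the real runtime also paces them).
    for batch in model.inference():
        for frame in batch:
            yield frame

m = ToyModel()
m.load()
frames = list(emit_frames(m))
```

Whether inference() yields single frames or batches, the consumer sees one steady per-frame stream.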
What changed
The original pipeline’s forward() call is unchanged. You’re just calling it from a different loop:
- Prompt is no longer passed once upfront. It comes from self.state.prompt, updated live by the client.
- Frames are no longer collected in memory. Each yield MyOutput(main_video=frame) streams the frame to the client immediately.
- No batch return. There’s no np.stack(frames) at the end. Frames are delivered as they’re generated.

If the generator finishes, the runtime starts it over.
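The live-state and restart behavior can be sketched as a loop that reads mutable state on every frame and recreates the generator when it finishes. All names below are illustrative stand-ins, not the runtime's internals.

```python
class State:
    # Stand-in for live client state such as self.state.prompt.
    def __init__(self, prompt):
        self.prompt = prompt

def inference(state):
    # Reads the prompt fresh on every frame, so client updates
    # take effect mid-stream.
    for i in range(2):
        yield f"{state.prompt}-{i}"

def run(state, total_frames):
    # Stand-in runtime loop: when the generator finishes, start it over.
    emitted = []
    gen = inference(state)
    while len(emitted) < total_frames:
        try:
            emitted.append(next(gen))
        except StopIteration:
            gen = inference(state)  # restart the finished generator
    return emitted

state = State("cat")
first = run(state, 2)    # ["cat-0", "cat-1"]
state.prompt = "dog"     # client updates the prompt live
second = run(state, 2)   # ["dog-0", "dog-1"]
```

Because the prompt is read through shared state rather than captured once, a mid-stream update changes the very next frame.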
Going further
The generator pattern covers most real-time models. For cases where you need full control over the execution loop, ReactorModel lets you handle connections, state, and frame emission directly while keeping the same tracks, events, and messaging.

Next
Overview
Back to the Runtime overview.
Quickstart
Revisit the quickstart for reference.