How it works
Build your model
Wrap your inference pipeline with the Reactor Runtime. Yield frames from a Python generator, and the runtime streams them to clients over WebRTC.
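A minimal sketch of what this can look like, assuming a hypothetical `reactor` package; the `model` decorator, `Runtime`, `serve`, and the `session` methods below are illustrative names, not the documented API:

```python
# Sketch only: every `reactor` name here is an assumption standing in
# for the real runtime API. Frames are random noise to keep it self-contained.
import numpy as np
import reactor  # hypothetical runtime package


@reactor.model  # assumed decorator that registers the generator as a model
def generate_frames(session):
    """Yield frames; the runtime encodes and streams each one over WebRTC."""
    rng = np.random.default_rng()
    while session.is_connected:          # assumed runtime-managed connection flag
        _ = session.latest_input()        # assumed accessor for live client input
        frame = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
        yield frame                       # one RGB frame per iteration


if __name__ == "__main__":
    reactor.Runtime(generate_frames).serve()  # assumed local test entry point
```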
Deploy
Authenticate, upload weights, push your Docker image, and publish. Your model is live on production GPUs in under 3 minutes.
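Programmatically, the flow might look like the sketch below; the `reactor.Client` object and every method and argument on it are assumptions for illustration, since the actual CLI/SDK surface is not shown here:

```python
# Hypothetical deploy flow; all names below are illustrative stand-ins
# for whatever the real Reactor CLI/SDK exposes.
import reactor

client = reactor.Client(api_key="rk_...")                  # authenticate
client.upload_weights("checkpoints/model.safetensors")     # upload weights
client.push_image("registry.example.com/my-model:latest")  # push Docker image
deployment = client.publish(name="my-model")               # publish
print(deployment.url)                                      # live shortly after
```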
Why Reactor
Sub-50ms streaming
Frames are delivered over WebRTC as they are generated, and client inputs arrive in real time.
Stateful sessions
Each client gets isolated state, managed from connection to cleanup (see the sketch below this list).
Global GPU network
Nodes in every major region. A client in Tokyo connects to a GPU in Tokyo.
No transport code
You never touch WebRTC, WebSockets, or video encoding. Reactor handles it.
Live in minutes
Publish, and your model is running on production GPUs in under 3 minutes.
You own your model
Your weights, your inference logic. Reactor never accesses or trains on your data.
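As a rough illustration of the stateful-session model above, here is a sketch assuming hypothetical lifecycle hooks; `reactor.Session`, `on_connect`, `on_input`, and `on_disconnect` are assumed names, not the documented API:

```python
# Sketch of per-client session state with assumed lifecycle hooks.
import reactor  # hypothetical package, as in the earlier sketch


class DrawingSession(reactor.Session):  # assumed base class
    def on_connect(self):
        # Called once per client; state set here is isolated to this session.
        self.history = []

    def on_input(self, message):
        # Live client input arrives here between frames.
        self.history.append(message)

    def on_disconnect(self):
        # Cleanup runs when the client leaves; nothing leaks across sessions.
        self.history.clear()
```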
Get started
Development
Install the runtime, build your model, and test locally.
Deployment
Authenticate, upload weights, and go live on Reactor’s GPUs.