idle intelligence

sts-web

NVIDIA PersonaPlex-7B, pruned to 24 layers, recovered with QLoRA, and quantized to Q4_K. Inference runs client-side on WebGPU via Burn and Rust/WASM.

⚠ Work in progress — runs slower than realtime without a powerful GPU. Expect delays during audio generation.

status


Desktop only. Requires discrete GPU or Apple Silicon with 16GB+ RAM.

VOICE

No voice prompt — speaker identity sampled from prior distribution

Audio stays on this device. All inference runs locally in your browser.

Live metrics: ms/frame, frames/sec, RTF (real-time factor), and per-stage time in ms for Temporal, Depth, and Mimi.
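The metric labels above (ms/frame, Frames/sec, RTF, and the per-stage times) relate to each other in a simple way. The Rust sketch below shows one plausible way to derive them; it is not taken from the sts-web source. It assumes that ms/frame is the sum of the three stage times, that each frame represents 80 ms of audio (the 12.5 Hz frame rate of the Mimi codec), and that RTF is processing time divided by audio duration, so values above 1.0 mean slower than realtime. The names `FrameTiming` and `FRAME_AUDIO_MS` are hypothetical.

```rust
/// Audio duration covered by one generated frame, in milliseconds.
/// Assumption: 80 ms per frame (12.5 frames/sec), as in the Mimi codec.
const FRAME_AUDIO_MS: f64 = 80.0;

/// Hypothetical per-frame stage timings, mirroring the dashboard labels.
#[derive(Debug, Clone, Copy)]
struct FrameTiming {
    temporal_ms: f64,
    depth_ms: f64,
    mimi_ms: f64,
}

impl FrameTiming {
    /// Total wall-clock time per frame.
    /// Assumption: the stages run back to back, so their times add up.
    fn ms_per_frame(&self) -> f64 {
        self.temporal_ms + self.depth_ms + self.mimi_ms
    }

    /// Frames generated per second of wall-clock time.
    fn frames_per_sec(&self) -> f64 {
        1000.0 / self.ms_per_frame()
    }

    /// Real-time factor: processing time divided by audio duration.
    /// Below 1.0 keeps up with realtime; above 1.0 falls behind.
    fn rtf(&self) -> f64 {
        self.ms_per_frame() / FRAME_AUDIO_MS
    }
}

fn main() {
    // Illustrative numbers only, not measurements.
    let t = FrameTiming { temporal_ms: 90.0, depth_ms: 50.0, mimi_ms: 20.0 };
    println!(
        "{:.2} ms/frame, {:.2} frames/sec, RTF {:.2}",
        t.ms_per_frame(),
        t.frames_per_sec(),
        t.rtf()
    );
}
```

Under these assumptions, realtime playback needs at least 12.5 frames/sec, which means a total of 80 ms/frame or less. Some tools report the inverse ratio (audio duration over processing time), so check which convention a given dashboard uses before comparing numbers.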