🏆 9.5 Arena: No Docker Required

VoxBar Pro Native

S-tier accuracy without Docker. One-click install, real-time transcription.

Powered by Voxtral F16 4B (Mistral AI)

$59 one-time · 🔥 LAUNCH: $29.50 with code EARLYBIRD

How It Works

VoxBar Pro Native runs the same Voxtral 4B model as VoxBar Pro Docker, but natively on Windows, with no Docker, WSL, or container overhead. It uses a lightweight Python inference server powered by Hugging Face Transformers.

Here's what happens:

  1. One-click launch: double-click the VoxBar shortcut, and the F16 model loads directly onto your GPU via CUDA
  2. Audio capture: your microphone is recorded at 16kHz and the audio is processed in efficient chunks
  3. Native inference: the Voxtral F16 4B model processes audio directly on your GPU, with no containerisation overhead
  4. Real-time transcription: words show up as you speak, with sub-200ms latency
  5. No background services: when you close VoxBar, everything stops. No Docker daemon, no lingering containers
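Step 2 above can be pictured as a fixed-size chunker running over a 16 kHz sample stream. The following is a minimal sketch, not VoxBar's actual code: the function name `iter_chunks` and the 0.5-second chunk length are assumptions chosen for illustration. Only the 16 kHz sample rate comes from the description above.

```python
from typing import Iterable, Iterator, List

SAMPLE_RATE_HZ = 16_000  # from the description above
CHUNK_SECONDS = 0.5      # assumption: the real chunk length is not documented


def iter_chunks(samples: Iterable[float],
                chunk_seconds: float = CHUNK_SECONDS,
                sample_rate: int = SAMPLE_RATE_HZ) -> Iterator[List[float]]:
    """Yield fixed-size chunks of audio samples; the final chunk may be shorter."""
    chunk_size = int(chunk_seconds * sample_rate)
    if chunk_size <= 0:
        raise ValueError("a chunk must contain at least one sample")
    buffer: List[float] = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == chunk_size:
            yield buffer
            buffer = []
    if buffer:  # flush the trailing partial chunk
        yield buffer


# 1.25 seconds of silence splits into two full chunks and one partial chunk.
chunk_lengths = [len(chunk) for chunk in iter_chunks([0.0] * 20_000)]
```

With these assumed settings each full chunk holds 8,000 samples, so the input size handed to the model stays constant from chunk to chunk.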

This is the same S-tier model as Pro Docker, but running natively. You get a 9.5 combined Arena score with about 40% less VRAM (8.5GB vs 14GB).
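The 40% figure follows directly from the two VRAM numbers quoted above. As a quick check, here is the arithmetic in Python; the function name is illustrative only.

```python
NATIVE_VRAM_GB = 8.5   # Pro Native, from the text above
DOCKER_VRAM_GB = 14.0  # Pro Docker, from the text above


def vram_saving_percent(native_gb: float, docker_gb: float) -> float:
    """Relative VRAM reduction of the native build versus the Docker build."""
    return (docker_gb - native_gb) / docker_gb * 100


saving = vram_saving_percent(NATIVE_VRAM_GB, DOCKER_VRAM_GB)
print(f"{saving:.1f}% less VRAM")  # prints: 39.3% less VRAM
```

The exact value is 39.3%, which rounds to the quoted 40%.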

Why Native?

Docker adds overhead. With Pro Native:

  • No Docker Desktop required: saves ~2GB RAM and eliminates WSL2 complexity
  • 40% less VRAM: F16 precision uses 8.5GB vs Docker's 14GB allocation
  • Instant startup: model loads in seconds, not minutes
  • Rock-solid sessions: no WebSocket drops, no reconnection delays
  • Cleaner system: nothing runs in the background when VoxBar is closed

Recording Limits

VoxBar Pro Native Has No Practical Recording Limit

Because Pro Native runs natively on your GPU with no Docker, no WebSocket, and no container overhead, it can record continuously for as long as you need: hours, or all day if required.

Why It Runs Forever

  • Each audio chunk is processed independently, so no state carries over between chunks
  • GPU memory is fixed at ~8.5GB, because the same model processes the same input size every time
  • No WebSocket connection to drop, no Docker container to restart
  • No context window that fills up or degrades over time

Auto-Stop Behaviour

  • Silence timeout: 5 minutes of no detected speech triggers auto-stop
  • This keeps your GPU free when you're not actively dictating
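The auto-stop rule above amounts to a small watchdog that remembers when speech was last detected. This is a hedged sketch of that idea, not the shipped implementation: the class name `SilenceWatchdog`, the `on_chunk` method, and the per-chunk `has_speech` flag (which would come from some voice-activity detector) are all hypothetical. Only the 5-minute timeout comes from the text.

```python
SILENCE_TIMEOUT_SECONDS = 5 * 60  # 5 minutes, from the text above


class SilenceWatchdog:
    """Tracks the time of the last detected speech and reports when to auto-stop."""

    def __init__(self, started_at: float,
                 timeout: float = SILENCE_TIMEOUT_SECONDS) -> None:
        self.timeout = timeout
        self.last_speech_at = started_at

    def on_chunk(self, now: float, has_speech: bool) -> bool:
        """Record one audio chunk; return True when recording should stop."""
        if has_speech:
            self.last_speech_at = now
        return (now - self.last_speech_at) >= self.timeout
```

In a real recording loop, `now` would typically be `time.monotonic()` so that the timer is unaffected by system clock changes.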

Real-World Testing

During live testing, VoxBar Pro Native ran continuously for over 50 minutes of natural dictation with zero interruptions, zero restarts, and zero degradation. The text quality at minute 50 was identical to minute 1.

Memory & Resource Footprint

Resource | Usage | Behaviour Over Time
GPU VRAM | ~8.5GB (Voxtral F16 4B model) | Stable, no KV cache accumulation
RAM | ~1-2GB (Python process) | Stable
Disk | Zero temp files | Audio processed in memory, never written to disk
Network | None | Fully offline: no localhost server, no internet
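The ~8.5GB VRAM figure is roughly what F16 weights alone would suggest. The back-of-envelope estimate below assumes a nominal 4 billion parameters (read off the "4B" in the model name; the exact count may differ) at 2 bytes each. What the remaining ~0.5GB covers is not stated here; activations and CUDA runtime overhead are a plausible guess, nothing more.

```python
BYTES_PER_F16_PARAM = 2  # F16 stores each parameter in 16 bits


def f16_weight_gb(num_params: float) -> float:
    """Approximate size of F16 model weights in decimal gigabytes."""
    return num_params * BYTES_PER_F16_PARAM / 1e9


# Assumption: "4B" means about 4 billion parameters.
weights_gb = f16_weight_gb(4e9)
print(f"{weights_gb:.1f} GB of weights")  # prints: 8.0 GB of weights
```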

Accuracy & Speed

Metric | Value
Arena Score | 9.5 combined, tied S-tier
Delivery | Real-time: words appear as you speak
Latency | <200ms from speech to text on screen
Multilingual | Yes, 13 languages supported
Punctuation | Context-aware, appears naturally
Capitalisation | Automatic, intelligent

Hardware Requirements

Requirement | Minimum | Recommended
GPU | NVIDIA with 10GB VRAM | NVIDIA with 12GB+ VRAM
RAM | 8GB | 16GB
Disk | ~8.5GB for model | SSD recommended
OS | Windows 10/11 | Windows 11
Software | NVIDIA drivers + CUDA | Latest NVIDIA drivers
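To check a machine against the GPU row above, one option is to read the total VRAM from `nvidia-smi` and compare it with the two thresholds. This is an illustrative sketch rather than a tool shipped with VoxBar: the function names are hypothetical, and treating the table's "GB" as GiB (so a 10GB card reports 10240 MiB) is an assumption.

```python
import subprocess
from typing import List

MIN_VRAM_GIB = 10          # minimum, from the table above
RECOMMENDED_VRAM_GIB = 12  # recommended, from the table above


def classify_vram(total_mib: int) -> str:
    """Classify a GPU's total memory (in MiB) against the requirements."""
    total_gib = total_mib / 1024
    if total_gib >= RECOMMENDED_VRAM_GIB:
        return "recommended"
    if total_gib >= MIN_VRAM_GIB:
        return "minimum"
    return "insufficient"


def query_vram_mib() -> List[int]:
    """Return total memory in MiB for each NVIDIA GPU, as reported by nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return [int(line) for line in result.stdout.splitlines() if line.strip()]
```

On a machine with NVIDIA drivers installed, `[classify_vram(mib) for mib in query_vram_mib()]` gives one verdict per GPU. `query_vram_mib` raises an exception if `nvidia-smi` is missing or fails.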

License & Attribution

Detail | Value
Model | Voxtral-Mini-4B-Realtime-2602 (F16)
Creator | Mistral AI
License | Apache 2.0 (fully commercial)
Attribution | Not required (but appreciated)
Distribution | Can be bundled and sold commercially

Pro Native vs Pro Docker

Both run the same Voxtral 4B model. The difference is how it's deployed:

Feature | Pro Native | Pro Docker
Arena Score | 9.5 combined | 9.6 combined
Docker required | No | Yes
VRAM usage | ~8.5GB | ~14GB
Install | One-click | Docker Desktop + pull image
Session stability | Rock solid | Stable
AMD GPU | Not supported | Not supported
macOS | See Mac Models: native Apple Metal build available separately
Best for | Windows users who want simplicity and lower VRAM | Power users who want the highest arena score

VoxBar Pro Native is for Windows users who want S-tier accuracy with the simplest possible setup and 40% less VRAM. VoxBar Pro Docker achieves a slightly higher arena score (9.6 vs 9.5) but requires Docker Desktop and 14GB VRAM.