Tether’s QVAC pushes multi‑billion‑parameter AI models onto phones and consumer GPUs

Tether has launched QVAC Fabric with BitNet LoRA, a framework for training and running multi-billion-parameter AI models directly on consumer GPUs and flagship smartphones. It marks a significant step for on-device AI, moving the field beyond small demo models toward more powerful, practical workloads.

The framework works across AMD and Intel GPUs, Apple’s Metal stack, and high-end mobile GPUs. Tether says it achieves 2–11× speed improvements over CPU baselines and up to 90% lower memory usage, letting devices handle larger models or multiple sessions at once.
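Tether has not published implementation details in this announcement, but a back-of-the-envelope sketch shows why pairing BitNet-style low-bit weights with LoRA adapters can cut memory so sharply: the frozen base model is stored at very low precision, and only small low-rank adapter matrices are trained. All model shapes, precisions, and numbers below are illustrative assumptions, not Tether's figures:

```python
# Illustrative sketch (not Tether's implementation): memory math behind
# combining a BitNet-style ternary base model with LoRA fine-tuning.
# Model dimensions and precisions are assumptions for illustration only.

def base_model_bytes(n_params: int, bits_per_weight: float) -> float:
    """Memory for the frozen base weights at a given precision."""
    return n_params * bits_per_weight / 8

def lora_adapter_params(n_layers: int, d_model: int, rank: int,
                        matrices_per_layer: int = 4) -> int:
    """Trainable LoRA parameters: each adapted weight matrix W (d x d)
    gets two low-rank factors A (d x r) and B (r x d)."""
    return n_layers * matrices_per_layer * 2 * d_model * rank

# Hypothetical 3.8B-parameter model roughly shaped like a small LLM.
N_PARAMS = 3_800_000_000
N_LAYERS, D_MODEL, RANK = 32, 3072, 16

fp16_gb = base_model_bytes(N_PARAMS, 16) / 1e9       # full-precision baseline
ternary_gb = base_model_bytes(N_PARAMS, 1.58) / 1e9  # BitNet-style ternary weights
adapter = lora_adapter_params(N_LAYERS, D_MODEL, RANK)

print(f"fp16 base:    {fp16_gb:.1f} GB")
print(f"ternary base: {ternary_gb:.2f} GB")
print(f"LoRA trainable params: {adapter / 1e6:.1f}M "
      f"({100 * adapter / N_PARAMS:.2f}% of the model)")
```

Under these assumptions the frozen base drops to roughly a tenth of its fp16 footprint, in line with the claimed memory reduction, and only the small adapter matrices need gradients and optimizer state during fine-tuning.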

Tether claims it has fine-tuned models of up to 3.8 billion parameters on devices including the Pixel 9, Galaxy S25, and iPhone 16, and a model of up to 13 billion parameters on the iPhone 16 alone. If accurate, this would allow serious AI work, such as personalization and domain-specific training, to happen locally on the device without sending user data to the cloud.

This release fits Tether’s strategy of moving beyond stablecoins into broader infrastructure, complementing previous projects like the 41-billion-token Genesis I dataset and the local AI Workbench. The QVAC and BitNet LoRA code is open-sourced on GitHub, letting developers experiment and build on Tether’s tools.

The company is signaling that as AI moves to the edge, control over toolchains and device-level AI infrastructure becomes strategically important. Open technical questions remain: how real-world speed, energy use, and thermal limits compare with existing solutions, and how permissive the licensing is for commercial use.

If Tether’s claims hold up, QVAC Fabric could make high-end smartphones viable for training and running mid-sized AI models, bringing AI closer to the edge and giving Tether a foothold in critical digital infrastructure.