Beyond TinyML: Balance inference accuracy and latency on MCUs

Abstract

Can an ESP32-based MCU run (tiny)ML models accurately and efficiently? This talk showcases how a tiny microcontroller can transparently leverage neighboring nodes to run inference on full, unquantized torchvision models in under 100 ms. We build on vAccel, an open abstraction layer for interoperable hardware acceleration, to let devices like the ESP32 transparently offload ML inference and signal-processing tasks to nearby edge or cloud nodes. Through a lightweight agent and a unified API, vAccel bridges heterogeneous devices, enabling seamless offload without modifying application logic. This session presents our IoT port of vAccel (client and lightweight agent) and demonstrates a real deployment in which an ESP32 delegates inference to a GPU-backed Kubernetes node, reducing latency by three orders of magnitude while preserving Kubernetes-native control and observability. Attendees will see how open acceleration can unify the Cloud–Edge–IoT stack through standard interfaces and reusable runtimes.

When: Jan 31, 2026, 11:50 AM — 12:10 PM
Where: UD2.120 (Chavanne), Brussels


Anastasia Mallikopoulou
Junior Systems Engineer - Observability & Applied ML

Systems engineer with expertise in observability stacks, runtime benchmarking, distributed instrumentation, and applied machine learning for system reliability and performance optimization.

Charalampos Mainas
Systems Researcher

PhD candidate focusing on low-level systems programming, Linux kernel development, hypervisors (KVM, Xen), and unikernel runtime ecosystems.

Anastassios Nanos
Systems Researcher

Research interests include systems software, virtualization, operating systems, containers, and unikernels.