Enzo Ruedas received his MSc in Computer Science from INSA Lyon, France. He has been working at NXP Semiconductors for the past two years as part of the AI Software team, where he develops, fine-tunes, and optimizes machine learning models for embedded devices. His work focuses on generative models that enable conversational and physical AI directly at the edge.
What does it take to move robots beyond programmed behaviors into truly natural interaction? As conversational and physical AI matures, the missing piece is not intelligence alone, but orchestration. In this talk, we introduce a modular, end-to-end framework that enables robots to perceive, reason, and act through a unified pipeline.
We demonstrate how robots can become context-aware collaborators when the system is structured into four stages: attention, perception, reflection, and action. We explore how to leverage multimodal signals from the first moment of user intent, how perception extends beyond speech into a richer understanding of the environment, and how emerging agentic AI paradigms enable more adaptive decision-making. Rather than diving into implementation specifics, this session highlights the architectural principles that make scalable, real-time conversational robotics feasible on embedded platforms.