Adaptive Voice (in submission to CHI 2024; more details will be released after publication)


In an era where technology seamlessly intertwines with daily life, the role of voice assistants has become increasingly pivotal. However, a critical oversight has persisted in their design: their inability to adapt to the varying mental states of users. Current voice assistants, while efficient, often present messages in a rigid, predefined format, neglecting the dynamic nature of human cognitive load. This static approach can lead to significant discrepancies in user experience: individuals under heavy mental load may find these messages overwhelming, while those with spare cognitive capacity may want more nuanced and detailed information.

This one-size-fits-all model of voice communication overlooks a fundamental aspect of human interaction: the ability to perceive and adapt to the listener’s mental state. Just as a skilled storyteller gauges their audience’s engagement and tailors their narrative accordingly, voice interfaces, too, require this level of adaptability to be truly effective.

Recognizing this gap, our study proposes a novel optimization-based approach to voice interface design. This approach is not just about tweaking the level of detail or the speed of speech; it's about a paradigm shift towards a more empathetic and responsive technology. We envision voice assistants that dynamically alter their message presentation in response to the user's cognitive load. Furthermore, this adaptation isn't sporadic: it maintains a consistent flow, ensuring that the temporal presentation of messages remains coherent and contextually relevant.

Such an advancement has profound implications. It’s not merely a technical enhancement; it’s a step towards more humane and intuitive AI. By adapting to the user’s mental state, voice interfaces can transform from mere tools into empathetic companions, capable of offering assistance that’s tailored not just to the task at hand, but to the psychological state of the user. This study, therefore, not only addresses a technological challenge but also embarks on a journey towards redefining the interaction between humans and machines.

The data from both studies were analyzed to assess the effectiveness of the Linear Integrated Optimization algorithm in adapting voice messages to users' cognitive loads. Insights from this analysis were used to refine the algorithm, improving its responsiveness, accuracy, and overall user satisfaction.
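As a rough illustration of how such an optimization might be framed (a minimal sketch, not the study's actual implementation), the Python snippet below treats message adaptation as a small linear program: given an estimated cognitive load, it selects which message segments to speak so that information value is maximized while the added load stays within the listener's remaining capacity. The segment names, value and cost weights, and the load budget are hypothetical placeholders, and the load estimate itself is assumed to come from an external sensing step.

```python
# Illustrative sketch only: message adaptation framed as a small linear program.
# Segment names, weights, and the capacity budget are hypothetical placeholders.
import numpy as np
from scipy.optimize import linprog

# Candidate message segments with an assumed information value and cognitive cost.
segments = ["core instruction", "context detail", "rationale", "follow-up tip"]
info_value = np.array([1.0, 0.6, 0.4, 0.3])  # benefit of including each segment
cog_cost = np.array([0.2, 0.3, 0.4, 0.3])    # load each segment adds on the listener

def select_segments(estimated_load: float, capacity: float = 1.0):
    """Pick segments maximizing information while keeping added load within budget.

    estimated_load: current cognitive load estimate in [0, 1]; how it is sensed
    is outside the scope of this sketch.
    """
    budget = max(capacity - estimated_load, 0.0)
    # linprog minimizes, so negate the information values; x_i in [0, 1] is the
    # (relaxed) inclusion of segment i, constrained by the remaining load budget.
    res = linprog(c=-info_value,
                  A_ub=cog_cost.reshape(1, -1), b_ub=[budget],
                  bounds=[(0, 1)] * len(segments),
                  method="highs")
    return [s for s, x in zip(segments, res.x) if x > 0.5]

# Under heavy load only the essential segment survives; with spare capacity,
# more detail is added back in.
print(select_segments(estimated_load=0.8))
print(select_segments(estimated_load=0.2))
```

Framed this way, the trade-off becomes explicit: a heavily loaded listener hears only the core instruction, while spare capacity lets the assistant reintroduce context and detail without exceeding what the listener can comfortably process.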

Songming Ping
Imperial College London | MRes Medical Robotics and Image-Guided Intervention

My research interests include human-computer interaction devices, reinforcement learning algorithms, deep learning algorithms, interactive technologies, digital twins, and automated systems.