Predicting Perceived Semantic Expression of Functional Sounds Using Unsupervised Feature Extraction and Ensemble Learning

Our Intention

We set out to explore whether the meaning listeners perceive in UX sounds can be systematically described and predicted using methods from music information retrieval (MIR). In many contexts, from medical devices to cars, these sounds need to communicate critical information within seconds, often without any visual support. Yet despite their importance, their communicative effect is still rarely assessed in a systematic, data-driven way.

To change this, we conducted large-scale listening experiments with thousands of participants worldwide, evaluating thousands of real-world sounds from more than 100 manufacturers. Based on this unique dataset, we developed a data-driven framework that uses musically informed audio features to model the semantic impressions people typically associate with sounds from electronic products across industries such as automotive, health tech, consumer electronics, apps and smartphones.

What We Found

Using a three-stage pipeline, we transformed functional sounds into high-level representations of timbre, chroma and loudness, and used these to train multi-output regression models predicting 19 perceptual dimensions from the FBMUX framework. The strongest results were achieved with a random forest regressor.
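The modelling step described above can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the real audio features and listener ratings; the feature count, sample size and hyperparameters are placeholders for illustration, not values from the study. Only the choice of a random forest regressor and the 19 output dimensions come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

N_SOUNDS = 200        # assumption: illustrative sample size
N_FEATURES = 30       # assumption: size of the timbre/chroma/loudness feature vector
N_DIMENSIONS = 19     # perceptual dimensions, as stated in the text

# Synthetic stand-ins: one feature vector per sound, one rating per dimension.
X = rng.normal(size=(N_SOUNDS, N_FEATURES))
W = rng.normal(size=(N_FEATURES, N_DIMENSIONS))
Y = X @ W + 0.1 * rng.normal(size=(N_SOUNDS, N_DIMENSIONS))

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# RandomForestRegressor accepts a 2-D target, so a single model
# predicts all 19 dimensions at once (multi-output regression).
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, Y_train)

Y_pred = model.predict(X_test)
print(Y_pred.shape)  # one row per held-out sound, one column per dimension
```

Fitting one multi-output forest keeps the example short; training a separate model per dimension would be an equally plausible design and makes per-dimension analysis easier.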

A subsequent listening experiment showed strong agreement between model predictions and user perception. Additional interpretability analyses also revealed which audio features were most influential in shaping specific semantic impressions. In other words, we can now estimate which information and brand values a sound is likely to communicate clearly and consistently to users.
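Both ideas in this paragraph, agreement with listeners and feature influence, can be sketched for a single perceptual dimension. The text does not say which agreement metric or interpretability method was used, so the Pearson correlation and the impurity-based feature importances below are assumptions chosen for illustration. The feature names and the synthetic ratings are hypothetical as well.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Hypothetical feature names; the real descriptors are not listed in the text.
FEATURE_NAMES = ["timbre_0", "timbre_1", "chroma_0", "chroma_1", "loudness_0"]

n_sounds = 300
X = rng.normal(size=(n_sounds, len(FEATURE_NAMES)))

# Synthetic ratings for one dimension, constructed to depend on loudness_0 only.
y = 2.0 * X[:, 4] + 0.1 * rng.normal(size=n_sounds)

train, test = slice(0, 200), slice(200, 300)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], y[train])

# Agreement: correlate predictions with the held-out "listener" ratings.
y_pred = model.predict(X[test])
pearson_r = np.corrcoef(y_pred, y[test])[0, 1]

# Influence: rank features by the forest's impurity-based importance.
ranking = sorted(
    zip(FEATURE_NAMES, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top_feature = ranking[0][0]

print(f"Pearson r between predictions and ratings: {pearson_r:.2f}")
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances are quick to compute but can be biased when features are correlated; permutation importance or SHAP values are common alternatives for the same question.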

This opens up new possibilities for sound design. Especially in high-pressure environments, sound must be immediately understandable, support safe workflows and still feel humane. Predicting user perception enables earlier iteration, reduces uncertainty and supports more confident design decisions before deployment.

Outlook

This project lays the foundation for AI-supported tools that help designers and manufacturers evaluate whether their sounds communicate the intended meaning and brand expression.

At the same time, the study shows that MIR can be successfully extended beyond music into the field of functional sound. Musically informed audio descriptors provide a powerful and transferable basis for perceptual modelling, opening new opportunities for innovation in sound design, human–machine interaction and applied audio research.