SMART SPEAKER TESTING
Speech and audio testing of smart speakers
Smart speakers have taken households by storm, and consumer satisfaction is closely linked to how well the device understands voice commands. Speech intelligibility and sound quality are crucial to owners of devices built on platforms like Amazon Alexa and Google Assistant. So how do you ensure your product will meet the exacting demands of consumers?
A common mistake is to measure the response accuracy rate (RAR) with an ordinary loudspeaker, replaying recorded voice commands and counting how often each command is correctly perceived and responded to. This can give a false indication of performance because a loudspeaker does not accurately reproduce the directivity and frequency response of a human voice.
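As a rough illustration of the metric itself, RAR can be computed as the share of played commands that the device handled correctly. The sketch below is only a minimal example: the `Trial` record, the intent labels and the sample data are hypothetical and do not come from any platform's test specification.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One played voice command and the device's reaction (hypothetical record)."""
    command: str          # the voice command that was played
    expected_intent: str  # what the device should have done
    observed_intent: str  # what the device actually did ("" if it did not respond)


def response_accuracy_rate(trials):
    """Return the fraction of trials in which the device responded as expected."""
    if not trials:
        raise ValueError("at least one trial is required")
    correct = sum(1 for t in trials if t.observed_intent == t.expected_intent)
    return correct / len(trials)


# Illustrative data only: two correct responses, one miss, one wrong action.
trials = [
    Trial("what's the weather", "weather", "weather"),
    Trial("play some jazz", "play_music", "play_music"),
    Trial("set a timer", "set_timer", ""),
    Trial("turn off the lights", "lights_off", "lights_on"),
]

rar = response_accuracy_rate(trials)
print(f"RAR = {rar:.0%}")  # prints "RAR = 50%"
```

The number is only as meaningful as the way the commands were played: the same calculation applied to loudspeaker playback and to playback through a mouth simulator can yield different results for the same device.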
Smart speaker testing standards
To ensure quality and protect their brands, leading smart speaker platform providers such as Amazon, Google, Alibaba and Baidu set performance requirements for speech reproduction and recognition, to be tested with devices that accurately match human voice characteristics. This makes it crucial for manufacturers wanting to integrate ‘smart’ functionality into their products to adhere to equipment recommendations from the platform suppliers.
Reproducing human speech and hearing
Brüel & Kjær offers market-leading products that facilitate more realistic speech synthesis and listening by faithfully reproducing human-like performance.
High-frequency Head and Torso Simulator (HATS)
The High-frequency HATS Type 5218 family by Brüel & Kjær is the new standard in the field of product audio evaluation. Equipped with ear and mouth simulators, High-frequency HATS allows accurate measurement over a full frequency range up to 20 kHz. Because it can both issue voice commands and measure the quality of the smart speaker's response, it makes fully automated testing of smart speakers and other voice-operated devices possible.
Brüel & Kjær’s High-frequency Head and Torso Simulator complies with ITU-T P.58 standards for Objective Measurement Apparatus.
Mouth Simulator Type 4227 by Brüel & Kjær is a high-performance artificial mouth that simulates human speech dispersion patterns.
The mouth simulator’s compact packaging and rugged construction make it well suited to R&D laboratories and production test stands. Its high-quality construction provides reliable and repeatable measurements over extended periods of time, and the simulator complies with ITU-T P.51.
Acoustic dispersion in speech
To reproduce the human voice and get a realistic test set-up for smart speaker testing, it is necessary to take acoustic dispersion into consideration.
Speech dispersion describes how the speech level is attenuated with angle and distance. The ITU standard defines attenuation values referenced to the Mouth Reference Point (MRP), a position 25 mm in front of the Lip Reference Plane (LRP). The values are given in dB of attenuation relative to 65.3 dB SPL at a distance of 500 mm in front of the MRP, or 89.3 dB SPL at the MRP. Measurement points are located on a circle centred at the MRP and distributed in the horizontal plane at 0°, ±15°, ±30° and ±90°, as well as in the vertical plane at ±15° and ±30°.
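To make the reference levels concrete, the short sketch below expresses a measured level as attenuation relative to the MRP level and lists the measurement angles named above. It is an illustration under stated assumptions, not a reproduction of the standard: the function names are hypothetical, the circle radius is left as a parameter because the text does not state it, and no tolerance values from the standard are included.

```python
import math

# Reference levels quoted in the text.
MRP_LEVEL_DB = 89.3         # dB SPL at the Mouth Reference Point
LEVEL_AT_500_MM_DB = 65.3   # dB SPL 500 mm in front of the MRP

# Measurement angles quoted in the text, in degrees.
HORIZONTAL_ANGLES_DEG = [0, 15, -15, 30, -30, 90, -90]
VERTICAL_ANGLES_DEG = [15, -15, 30, -30]


def attenuation_db(measured_db, reference_db=MRP_LEVEL_DB):
    """Attenuation of a measured level relative to a reference level, in dB."""
    return reference_db - measured_db


def point_on_circle(radius_mm, angle_deg):
    """Position of a measurement point on a circle centred at the MRP.

    Returns (along_axis_mm, off_axis_mm); 0 degrees lies on the mouth axis.
    The radius is an assumption supplied by the caller.
    """
    angle = math.radians(angle_deg)
    return (radius_mm * math.cos(angle), radius_mm * math.sin(angle))


# On-axis attenuation between the MRP and the 500 mm point.
print(round(attenuation_db(LEVEL_AT_500_MM_DB), 1))  # prints 24.0

# Example: horizontal-plane positions for an assumed 500 mm radius.
for angle in HORIZONTAL_ANGLES_DEG:
    x, y = point_on_circle(500.0, angle)
    print(f"{angle:+4d} deg -> ({x:7.1f} mm, {y:7.1f} mm)")
```

The two quoted reference levels differ by 24 dB, so a level measured at any other point can be reported against either reference as long as the chosen reference is stated.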