Microsoft, Amazon, Baidu and other large tech firms are investing billions in gathering terabytes of authentic human conversation data, the raw material needed to boost the performance of AI voice-recognition software.
With artificial intelligence-powered personal assistants such as the Amazon Echo now breaking into the market, the race is on to dramatically improve voice-recognition software and drive its adoption into the mainstream.
Before voice-recognition apps can conduct authentically human-style conversations, neural networks need more data to improve how they process voices. This need has triggered a flurry of investment aimed at growing the database of recorded human speech from which the systems can learn.
To that end, Microsoft, Amazon, and Baidu are investing heavily to expand their catalogues of recorded human speech, enabling machines to recognise a wider variety of human accents, dialects, and languages.
Microsoft, for example, is recording the voices of volunteers living in simulated apartment settings, while Alexa, the Echo's voice-recognition interface, uploads its voice interactions with users. Baidu, meanwhile, has set itself the ambitious target of adding every Chinese dialect to its database.