Microsoft has acquired conversational AI specialist Semantic Machines, in a move that reinforces its strategy to be a major force in natural language processing alongside Google, Amazon, IBM, and Apple.
The terms of the deal have not been announced.
The move is intended to boost the capabilities of Microsoft’s Cortana digital assistant, along with its chatbot programmes, such as XiaoIce in Asia – which was notable for not falling victim to the kind of social media trolling that its counterpart Tay did in the West two years ago.
In a blog post this morning, David Ku, CVP and chief technology officer of Microsoft AI & Research, said:
With XiaoIce and Cortana, we’ve made breakthroughs in speech recognition, and more recently become the first to add full-duplex voice sense to a conversational AI system, allowing people to carry on a conversation naturally.
Ku praised Semantic Machines’ achievements to date, saying that the company has developed “a revolutionary new approach to building conversational AI”.
“Their work uses the power of machine learning to enable users to discover, access, and interact with information and services in a much more natural way, and with significantly less effort,” he said.
The California-based Semantic Machines team includes a number of industry veterans with experience of building core AI technologies for Apple and Google. These include UC Berkeley professor Dan Klein and Stanford University professor Percy Liang, as well as former Apple chief speech scientist, Larry Gillick.
Microsoft now plans to build a conversational AI centre of excellence in Berkeley to “push forward the boundaries of what is possible in language interfaces”.
Alongside its progress to date with Cortana and XiaoIce, Microsoft Cognitive Services boasts over one million developers, with more than 300,000 clients using the Azure Bot Service.
Taking the fight to Google
Microsoft was no doubt eager to respond to Google’s recent Duplex announcement, which appeared to demonstrate impressive progress in natural language processing, with the AI making a phone call on behalf of a user.
However, rumours have persisted that Google may have faked or edited the demo for its I/O conference, with the only certainty being that it has so far ignored journalists’ direct questions on the matter.
Amazon has recently announced upgrades to its Alexa digital assistant, which are designed to help the AI recognise users, learn their preferences, and sustain a conversation without listening for the ‘Alexa’ prompt each time.
Microsoft CEO Satya Nadella recently accused Amazon and Google of rigging markets by benefiting from both sides of AI-based transactions.
- Read more: Amazon: Alexa upgrades for memory, context, meaning = antitrust risk?
Meanwhile, in April Apple poached John Giannandrea, Google’s former head of search and artificial intelligence, as its new head of machine learning and AI strategy, and in March IBM launched an Assistant version of its Watson natural language AI system, which is available as an enterprise cloud service.
Internet of Business says
When Microsoft’s Tay chatbot debuted on Twitter in 2016, headlines such as ‘Microsoft deletes teen girl AI after it became a Hitler-loving sex robot in 24 hours’ (and that was from the Telegraph!) did little to bolster either Microsoft’s reputation, or that of conversational AI.
Since then, Microsoft and others have made good progress. Yet while there have been important strides in conversational AI in recent years, these are still early days for the industry. The speech capabilities of the AIs in films such as Ex Machina and Her remain some way off.
In many ways, the idioms and idiosyncrasies of human language are at odds with how AIs work best – as demonstrated by the AI pair that invented their own, more efficient language, during a Facebook research programme. Part of the challenge is that a lot of human communication is unspoken, implied, or heavily based on context and emotion. It also varies from culture to culture – and across hundreds of different languages.
Yet, thanks to machine learning, natural language processing has advanced to the degree that it is beginning to comprehend the nuances of human speech, or at least imitate them in conversation.
Until now, AIs such as Apple’s Siri, Google’s Assistant, and Amazon’s Alexa have processed simple commands through NLP, but haven’t been capable of true conversation.
Alongside IBM’s Watson – which has helped humanoid robots carry out conversations, while linked to industry-specific data sets in the cloud – Google seems to be closest to unlocking this next step, though it comes with popular fears about the ability of AI to deceive or replace human beings.
Both Microsoft and IBM have stressed, however, that their cognitive services are designed to augment and complement human skills and ingenuity, not replace them.
While Google has clarified that its Duplex system will make its AI nature known during interactions with humans, many have concerns about technology systems apparently being designed to replace human beings. After all, why can’t companies simply employ people? The answer, of course, tends to be enterprises’ ongoing quest to take costs out of the business.
Nevertheless, the task ahead is monumental. A convincing AI – one that will hold up in sustained, nuanced interaction – must successfully combine speech recognition, speech synthesis, deep learning, semantic understanding, machine learning, and linguistics.
Additional reporting: Chris Middleton.