Why voice search is just the beginning

Author
Craig Pugsley
Director & Creative Lead | StudioFlow
13th June 2018

There’s a war raging. It’s being fought in living rooms, bedrooms and kitchens across the world, and there’s so much at stake that neither side can afford to lose an inch. Battles have been won already, each side’s resources are nearly limitless, and battle scars get treated as learning opportunities - repaired stronger, with better understanding. The war is for your attention. For your interaction. For, as it always is, your money. This is the war of the Conversation Titans.

These Titans are some of the world’s biggest companies, and they’re deeply invested in having the conversational platform everyone uses. If you own the platform, you command the route to consumers. 

The latest market figures put Amazon at nearly 10 million smart speakers shipped in Q4 2017, with Google shipping nearly 7 million. Analysts report that around 24 million smart speakers shipped globally in 2017 alone - a 300% increase over 2016 - with 56 million predicted to ship in 2018. As of Q2 2018, Amazon is live in 6 countries, with plans to roll out to 80 additional countries soon. Google just announced equally aggressive plans at their recent developer conference - 80 countries and 30 languages by the end of 2018.

With this scale in mind, how do you prepare your business for the paradigm shift?

Tips to optimise for voice search

The truth of the matter is that you’re probably already optimising for voice search – if you have a good SEO strategy and are trying to provide actionable, useful content to get into the top organic search results on mobile, then you’re most of the way there with voice too. However, it’s not quite that simple, so here are a few top tips you can employ to optimise for voice.

The first is to make sure your content is structured in such a way that Google can use it to answer a question. Voice search users won’t be using traditional keyword searching – they’ll be phrasing their searches as questions (‘how do I bake a soufflé?’), so you need to make sure you’re structuring your content as answers to those questions. Lead with a direct answer to the point you’re trying to make (journalists call this the ‘inverted pyramid’), and then follow up with deeper explanation. Google will reward this and it may help get your content ranked higher. Also, make sure your content is unique and interesting. It seems redundant to say, but if Google can already provide an answer (‘what’s the capital of Norway?’), trying to get your content to rank higher than Google’s is almost certainly a lost cause.
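
One way to make question-and-answer content explicit to a search engine is structured data. The advice above doesn’t name a markup format, so what follows is only a sketch, assuming schema.org’s FAQPage vocabulary written out as JSON-LD. The helper name build_faq_jsonld and the sample answer text are hypothetical, and there is no guarantee that any search engine will use this markup when choosing a voice result.

```python
import json


def build_faq_jsonld(questions_and_answers):
    """Return a schema.org FAQPage document (JSON-LD) as a string.

    questions_and_answers: list of (question, answer) tuples, where each
    answer leads with the direct response (the 'inverted pyramid').
    """
    entities = []
    for question, answer in questions_and_answers:
        entities.append({
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        })
    page = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": entities,
    }
    return json.dumps(page, indent=2, ensure_ascii=False)


# Hypothetical example content: direct answer first, detail after.
faq = [
    (
        "How do I bake a soufflé?",
        "Fold whisked egg whites into a flavoured base and bake in a hot "
        "oven until risen. The detail that follows can cover timings, "
        "temperatures and common mistakes.",
    ),
]

print(build_faq_jsonld(faq))
```

The printed JSON would typically be placed in a script tag of type application/ld+json on the page that carries the same question and answer in visible text.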

The second top tip is an obvious one, but it’s worth being explicit about. Getting noticed on mobile searches is harder, as there are fewer results shown on the small screens of mobile devices. For voice, unless you rank #1, your content doesn’t stand a chance of being presented to users. Google doesn’t offer up the top three results – it picks the best and speaks that to the user. If you’re choosing where to spend your SEO time and money, spend it on voice search ranking only if you think your content is compelling enough to take the top spot.

For both these reasons (ranking for voice search is hard!), you may be better off thinking about other ways to have a presence on the voice and chat platforms. That is why the focus of this article is going to be beyond voice search – building apps that can respond conversationally to engage your users in ways voice search never will.

Going beyond voice search

Building conversational experiences that allow your customers to literally talk to your brand to buy your products, ask questions about your services or engage directly with your marketing should be a key part of your business vision. 

Conversational apps represent the fifth wave of computer interaction: all the way from IBM punch-cards and the first point-and-click Mac interfaces, through smartphones and touchscreens, to voice.

Interacting with computers conversationally (whether directly with voice, or via typed chat responses) will become the dominant way we get stuff done with technology in the future. 

So to understand the opportunities of building conversational apps, we need to understand where the tech is now, and where it’s heading in the future. The truth is voice and chatbots are already being used in anger. You’ll be deciding when, rather than if, you’ll be jumping on the conversation train. Spending too long deliberating (like you did on the move from desktop to mobile?) will simply mean the tech gets away from you, and your competitors steal a march. Now’s the time to board. The train’s leaving, and you need to be on it.

Why are the tech giants investing so aggressively in conversation?

OK, so the installed base is huge. But why are these big companies investing so aggressively in conversation? What’s the motivation? The business objectives haven’t changed, but these players realised the opportunity for brand new routes to market. 

Amazon want to own the shopping experience - they want to ensure you’re never more than a few feet from a purchase opportunity. Google want to own search, and weave advertising between results. These two goliaths are duking it out for a place in your living room with devices like the Echo Plus and Google Home, but they both realise hardware is a bottleneck. 

Both companies have, very smartly, decoupled the brains inside these devices from the hardware itself. Amazon positively encourage other manufacturers to embed Alexa in whatever they’re building. Amazon isn’t making money selling smart speakers; they’re playing the long game, and know that having Alexa in your home, your car, your watch and your office means you’ll become more and more reliant on its ease and convenience.

Google have done the same with Google Assistant – also available on pretty much every modern Android phone, Android smartwatch, Android Auto, etc. By separating the brains from the hardware, both companies can use other vendors to increase their reach. It’s all very savvy.

So why has it taken so long for conversational platforms to become a thing?

Good question. The answer is easy to give and complex to explain. The easy answer is that humans have been talking to computer systems for years. The Turing Test (a thought experiment devised by Alan Turing to measure the sophistication of so-called ‘artificial intelligence’ systems) was posed in the 50s. The change is that the platforms on which these AIs ran have become so much more powerful, efficient and affordable in recent years. 

The movement towards cheap cloud computing with the ability to automatically scale based on demand, the proliferation of fast internet, wifi speeds, and step-changes in how data is stored and accessed have all contributed to delivering a more tolerable consumer experience than ever before.

Speaking to Alexa seems easy and straightforward. You ask for a weather update and, after a tiny pause, Alexa tells you. Simple, right? What you don’t see is the layers of technology decoding your voice’s waveform into data, transmitting it to Amazon’s cloud servers, deciphering that waveform into a sentence of spoken words, breaking up that sentence into nouns, verbs and adjectives, trying to make sense of what you said (did you want the weather forecast or the traffic? And where did you want it for? Home or work?), doing the actual computation to answer your question, then reversing the process: constructing a meaningful sentence most humans would understand, converting those words into syllable sounds that don’t sound artificial, creating the sound file and finally sending the audio back down the internet connection for your speaker to play.

And all this has to happen in less than half a second to feel natural and conversational. Almost mind-blowingly complex computing happening there, and only now possible due to advances in cloud computing. 
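
The stages described above can be pictured as a chain of small functions, each handing its output to the next. The sketch below is a toy illustration only: every function is a hypothetical stand-in (the ‘audio’ is plain text, the forecast is hard-coded) and none of it reflects how Alexa is actually built. The only figure taken from the text is the half-second budget.

```python
import time

LATENCY_BUDGET_SECONDS = 0.5  # "less than half a second", per the text


def speech_to_text(audio):
    # Stand-in for real speech recognition: the "audio" is already text here.
    return audio.lower().strip()


def parse_intent(sentence):
    # Toy intent detection by keyword; real systems use trained language models.
    intent = "weather" if "weather" in sentence else "unknown"
    place = "work" if "work" in sentence else "home"
    return {"intent": intent, "place": place}


def fulfil(request):
    # Stand-in for the actual computation (e.g. calling a forecast service).
    if request["intent"] == "weather":
        return {"place": request["place"], "forecast": "light rain"}
    return None


def build_reply(result):
    # Turn the structured result back into a sentence a human would say.
    if result is None:
        return "Sorry, I didn't catch that."
    return f"The forecast for {result['place']} is {result['forecast']}."


def text_to_speech(sentence):
    # Stand-in for speech synthesis: return bytes as if they were audio.
    return sentence.encode("utf-8")


def handle_utterance(audio):
    """Run every stage in order and report whether the budget was met."""
    started = time.perf_counter()
    sentence = speech_to_text(audio)
    request = parse_intent(sentence)
    result = fulfil(request)
    reply = build_reply(result)
    audio_out = text_to_speech(reply)
    elapsed = time.perf_counter() - started
    return audio_out, elapsed <= LATENCY_BUDGET_SECONDS


audio_out, within_budget = handle_utterance("What's the weather at work?")
print(audio_out.decode("utf-8"), within_budget)
```

In a real system each stand-in would be a network call to a cloud service, which is why the time spent across all stages, not just one, has to fit inside the budget.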

So, let’s take a closer look at the big players. Amazon’s focus is Alexa (the brains, remember!) but Amazon also sell a range of devices in multiple form factors. Their most popular devices are smart speakers: the Echo and Echo Dot. Amazon are by their nature an ultra-experimental company and will put significant resources into bringing something to market as a way to validate product feasibility. Could a screen make up for some of voice’s limitations with item selection (allowing customers to make selections from lists is key to e-commerce!)? Could a camera give us another sense, so that we can apply deep learning and sell more stuff? Amazon’s conversational experiences are voice-first - even the devices with screens use them mainly to present information for users to choose from.

Google’s reach is staggering. They too have smart speakers, with the air-freshener-like Google Home and Google Home Mini. Their brain is called the ‘Google Assistant’ and differs from Amazon’s in that it is a hybrid voice, touch and type interface that seamlessly supports whatever input you give it. In the car? Use voice only. On your phone? Start with voice, make selections via touch, finish with typing. On your watch? Assistant proactively sends you a notification, you interact with touch, then use voice to send a reply.

Facebook’s focus is bots. Rumours have them releasing their own smart speaker soon - inevitably powered by voice. The Facebook Messenger chatbot platform has a rich set of interactions (buttons, product carousels, video, email, ability to hand-off to human), all with the same common look and feel. This means Facebook’s chatbots are super efficient, low-cost and result in high customer-engagement. 

Facebook have reported case studies in which Messenger chatbots return consumer satisfaction scores that are orders of magnitude better than traditional channels. Not only that, but they provide built-in opportunities for CRM and lookalike-audience based marketing. And they’re easy for your customers to engage with too - simply replace the ‘Send a message’ button on your Facebook business Page, or embed the bot in your website. In terms of bang-for-buck, building a Facebook Messenger chatbot is probably one of the most powerful marketing and sales tools for small to medium businesses, and a great way to scale customer service. Just think: you’re available 24/7 to answer 90% of your product and care requests (and how many times do you get asked about the same top five issues?).
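
The idea of answering the handful of questions you get asked over and over, and passing everything else to a person, can be sketched in a few lines. This is a deliberately simple illustration using keyword matching; it does not use the Messenger platform’s API, and the topics, answers and the function name answer are all hypothetical.

```python
# Hypothetical canned answers for a business's most common questions.
FAQ_ANSWERS = {
    "opening hours": "We're open 9am to 5pm, Monday to Friday.",
    "delivery": "Standard delivery takes three to five working days.",
    "returns": "You can return any item within 30 days.",
}

HANDOFF_REPLY = "Let me put you through to a member of the team."


def answer(message):
    """Return (reply, handed_off) for an incoming chat message."""
    text = message.lower()
    for topic, reply in FAQ_ANSWERS.items():
        if topic in text:
            return reply, False
    # Nothing matched: hand the conversation to a human.
    return HANDOFF_REPLY, True


for incoming in ["What are your opening hours?", "My parcel arrived damaged"]:
    print(answer(incoming))
```

A production bot would usually replace the keyword lookup with the platform’s own language understanding, but the shape stays the same: answer what you can, and hand off what you can’t.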

This isn’t some future vision. This isn’t simply a picture of technology’s path in the coming years. These are consumer experiences that are live, installed in millions of homes around the world, gaining massive traction (especially in the UK, where adoption appears to be outpacing the USA?) and available to you right now. 

As a real-world example, two years ago the first Alexa and Google Assistant apps for the takeaway food company Just Eat launched. These were two of the first experiences in the UK to offer e-commerce via voice, and they are live right now, making money and serving customers wherever and whenever they’re hungry, 24 hours a day.

How can I ensure my business has a conversational presence?

Building for these platforms isn’t simply a case of porting your app or website. Voice and chatbot apps are very different beasts. Your time with the user is limited to how long it takes to say something, so you have to be to the point and get the job done efficiently. 

Choosing which features to offer via voice or chat is super-critical. Not everything your current app does will suit voice, so treat this as an opportunity to focus on your core value proposition.

Why do your consumers come to you? If you can nail that, and give them what they want in an interaction that’s quicker and more efficient, they’ll keep coming back time and time again. Finding that sweet-spot isn’t magic - it’s part science, part art. 

Understanding what your customers really want to do, working out how voice can help them do it, and reducing friction to the point where the experience is seamless all come from taking a strategic and skilful approach.

So, now’s the time to board the train. Don’t stand on the platform, thinking that you can wait for the next one to come along (remember missing the desktop to mobile transition train?). You don’t know who’s already on it, and in this tech-forward world, you don’t want to be left at the station… 

About the author

Craig is the Director and Creative Lead at StudioFlow, and has been obsessed with technology since his first Spectrum in the 80s. Since studying AI and machine learning at university, he has helped found a tech startup, worked for two of the biggest tech companies in the world, and was Creative Lead in a FTSE 100 company’s tech innovation product research lab.

 

