Crux Labs: Amazon Echo Dot + Voice-Based UX
Ever since science fiction movies of the 50s, we’ve imagined controlling our technology simply by speaking our desires out loud. We’re standing on the threshold of that reality.
With the developments in Natural Language Processing (NLP) and the introduction of more and more internet-connected devices, we’re getting close to a time when we’ll be able to carry on a conversation with our technology, and ask it to answer our questions and complete tasks for us as if we had a live personal assistant.
And that’s the goal of the ‘home assistant’ products currently on the market: to give you a concierge-like personal assistant experience without the expense of a real live Mr. Carson (if you didn’t watch Downton Abbey, then insert Jeeves, Alfred, or another fictional super butler here).
How well does it work? We recently embarked on some exploration of voice-based user interfaces by conducting internal research using the Amazon Echo Dot and Alexa. In this report you’ll learn about voice user experience (UX) and we’ll provide things to keep in mind when you’re considering offering your product or service via a home assistant or other voice-controlled interface.
The Echo Dot is easy to set up. We started with familiar tasks and information, but the novelty wore off quickly if we didn’t find a new and meaningful use for it. Hands-free is great. Status is hard. Your voice needs to be at the right volume and accent-free. Start simple and with something valuable. Voice is a tactic, not an objective.
What did we learn?
We each spent a little time getting our Echo Dot set up and figuring out how it works. Everyone reported that the initial setup was pretty straightforward. Once the device was plugged in, paired to the Wi-Fi network, and the Alexa app was installed, the system was ready to go.
Initially, the device was a novelty. Many of us got started by asking Alexa to answer simple questions. Similar to when we got our first smartphones and explored the app store to see what this new device had to offer, many of us downloaded skills for services we currently used (e.g., Nest Thermostat, Spotify, Hue Lights), entertainment or media we already followed (e.g., sports teams, news outlets), or things that interested or amused us (e.g., box of cats).
As time passed, we used the device for a variety of things with varying degrees of success, but a few of us stopped using it altogether. Those of us who integrated it into our lives and homes in a meaningful way found it useful, and those of us who used it primarily as a novelty eventually tired of it.
Most useful: when our hands were unavailable or occupied but our voice was free.
Least useful: needing ideal conditions (the right volume, tone of voice, or accent) to understand and respond to requests.
What did we like?
Nearly all of us tried using Alexa to set timers in the kitchen. The idea of being able to request a timer be set by speaking was really helpful when multi-tasking in the kitchen, particularly if your hands are occupied or dirty. Being able to set multiple timers via voice and the ability to label them was very helpful as long as it worked well, and you realized that was an option.
Annette talking about timers and baking.
“Alexa, set the timer for 3 minutes.”
For one coworker, the Echo Dot solved a specific problem with streaming music by serving as a wireless hub for a Bluetooth speaker. Previously, connecting to the speaker via his phone meant that he had to leave his phone close to the speaker or he might lose connectivity as he moved through the house. Launching Spotify from the Echo Dot and asking it to play via the Bluetooth speaker solved this problem.
Tony on how it solved the Sonos speaker issue.
“The Echo Dot started working like a gateway.”
Facts & Jokes
Many of us liked being able to ask for details like weather, commute times, sports scores, news, and other facts and definitions. Most of us were initially amused by the device and used it to amuse ourselves by playing games, asking it to tell jokes, and terrifying our cats. (For the record, several of us have reported that our cats are officially desensitized.)
Mike on using it for entertainment.
“That skill being enabled got my boys interested in exploring the Dot further.”
As we worked on this research as a team, many of us noticed how friends and family used Echo devices in their homes, and we discovered that those who use it to control aspects of their home (e.g., lights, thermostat) use it regularly and find it very useful.
Katherine talking about automation.
“I use the Echo Dot at home for home automation.”
What didn't we like?
Timer Malfunctions and Lack of Timer Status
Timers, while useful, posed some usability challenges. One issue was the inability to see how much time was left on a timer ‘at-a-glance’. It’s great to be able to set the timer without using your hands, especially when multitasking in the kitchen. However, people also use the visual feedback of the timer counting its way down to understand when they need to be prepared to complete the next task or to know if they have time to complete another task before the timer goes off.
Katherine talking about timers.
“I realized something I was missing was some visual indicators.”
If the Wi-Fi connection is lost, the timer is lost along with it. Some of us who experienced this found ourselves setting backup timers and eventually defaulted to using the approaches we relied on prior to the Echo Dot.
Rebecca talking about timers.
“I lose my WiFi connection, and the timer dies.”
In some cases, people didn’t realize that they could set multiple timers and that the timers could be labeled.
Too Many Voices
In a few cases, Alexa caused some sibling rivalry. In a household with three kids, one kid would ask Alexa to play a song, and then another kid would ask for a different song. You can imagine how this predicament ends.
It was also difficult to interact with Alexa if there were many people talking in the room. It feels awkward to ask your guests to be quiet so you can change the music or the room temperature.
Tony talking about the girls.
“When we first got the Alexa, my girls were really excited about it.”
Some of us found that Alexa regularly tried to sell us the thing we had just asked her about. We don’t want our home assistant to constantly try to part us from our money. Even if that is what Amazon would like her to do.
Mahtab on the cart.
“We could not get out of the loop.”
Remembering and Launching Skills
It’s easy to forget the trigger phrase to launch skills that you’ve downloaded. It’s also easy to forget the skills that you’ve downloaded, and without a visible interface, there is little findability (or ‘remindability’) of what you have. We’ve all downloaded apps on our phones that we later forgot about. On a phone, though, when you’re browsing for an app that you know you’ve downloaded, you inevitably come across an app or two that you’d forgotten about.
The Echo has a finite range. If you’re too far away from it, it won’t be able to hear you. Conversely, you need to have enough space between devices to ensure that both won’t hear you and respond.
In one case, one of us had one device in the kitchen and another in the living room and if both heard a request, they would both respond. Fun fact: they did not provide the same response.
John talking about Oscar yelling at Alexa.
“Getting more and more frustrated...”
Volume, Tone, and Accents
Alexa doesn’t respond to everyone equally. The higher voices of children, the accents of non-native English speakers, and soft-spoken family members all had a hard time getting a response, let alone the correct response to their request or inquiry.
It was not uncommon for us to see someone in our family, usually a child, yell something to Alexa from across the room and get frustrated that she didn’t hear and respond. For one staff member, the Echo Dot never seemed to hear a family member who normally speaks very softly.
On occasion Alexa will speak without being triggered. You probably heard about the recent reports of Alexa laughing. Many of us experienced situations where Alexa would say something although no one used her trigger word.
The always-on, always-listening functionality made some of us a little uncomfortable. Even those of us who weren’t uncomfortable were clearly more aware of an always-on, always-listening device in our midst. The same functionality on our phones isn’t as conspicuous.
Annette on the creepy factor.
“Shh, she’s listening!”
For a few of us, the device just doesn’t get used very often. Two of us actually unplugged and put away our devices to see if anyone in the household would notice. In both cases, it took quite a while for anyone to ask about it and when they did, no one was initially concerned that it was not available.
John on unplugging the device.
“It took 2 weeks for anyone to notice I unplugged it.”
Key considerations for voice-based UX
Voice as Input + Output
Overall, there are several challenges inherent in only having voice as an input and an output, which is the case with the Echo products that do not include a screen.
As an input device, there’s the issue of whether or not the device has correctly understood what you have requested.
If Alexa mishears or misunderstands what you’ve requested, you can’t really course correct. If Alexa has started down the wrong path, you have to ask her to stop and then make your request again. There doesn’t seem to be a way to say ‘sorry, I meant…’, like you would in natural conversation, or to go back a single step, like you would if you were filling out a form and had made a mistake in a single field entry.
As an output device, voice has a variety of limitations. If you ask for a recipe, Alexa will read a list of options, which you have to remember and choose from. It’s much easier for sighted users to be able to see a list of options that they can scan before making a decision. It’s hard for a typical person to absorb and remember a full list of options. What is a digestible number of choices to provide when someone is only able to understand them by having the options spoken?
There are obvious accessibility challenges when you offer limited ways to interact with a device. A voice-only interaction is a no-go for someone who is hearing impaired or has any kind of speech impediment. Conversely, a voice-only interaction forces designers to think carefully about how to make a voice-only output effective for those who are visually impaired.
It can be tricky to understand status using the Echo Dot. The blue light that spins around the top of the device when it first hears you is the only ‘I’m working, standby’ indicator available. If the device doesn’t hear you, the only way to know is that it doesn’t respond.
In some cases, the light would spin and spin and then stop and Alexa would not respond, but you were given no information or feedback as to why the request failed. The visual-only approach for communicating status makes the device difficult for those who are visually impaired and requires you to look at the device in order to know that it has heard and is responding to your request.
When and how should you use voice-based UX?
The always-on, always-listening home assistant using voice interaction is a technology that is still in its infancy. As the technology evolves and NLP improves, there is tremendous potential for how it can be used. It’s one of many tools in a broader ecosystem, not a stand-alone thing.
Before you launch a new skill for your product, be sure you understand how that skill fits into the broader ecosystem of access to your product or service. How will it add to or enhance something people are familiar with? What will be the benefit to the end-user of buying and learning something new?
In Concert with Other Input + Output Supports
As Amazon expands the Echo line, Alexa provides additional utility when paired with a screen for sighted users to get additional input and ‘at-a-glance’ references for things like the remaining time on a timer and a list of search results.
Allowing for other input mechanisms, such as using the device via your smartphone, will allow someone to use the device when they have guests over and the device cannot pick out a single voice from among a group of voices.
Health Care + Elder Care
There are a variety of ways that a home assistant can support proactive health care with daily reminders to take vitamins or medications, to exercise, or to monitor chronic conditions.
In conjunction with other connected devices, it can allow people to live independently longer by encouraging and tracking medication adherence and providing easy or automated access to emergency services.
If a patient with dementia asks repetitive questions about past events or a specific topic, a human caregiver may become frustrated by the repetition. Alexa can respond repeatedly without frustration and take some of the burden off a human caregiver.
When integrated with home automation systems or other connected devices, Alexa and the Echo Dot provide a lot of utility. Adding voice interaction to devices that someone already uses in their home is one way to encourage stickiness. Finding meaningful ways to add voice interaction to daily actions like turning on the lights or changing the temperature allows people to get accustomed to interacting with devices via voice as well as touch or click.
Where should you begin?
Are you thinking about how your company should offer its products or services through Amazon’s Alexa, Google Home, or other voice assistant services, or simply considering how voice UX affects your business? There are a few things to keep in mind as you consider how to address voice-based interactions for your products and services. First and foremost, remember that voice-based interactions are a tactic to accomplish an objective and not an objective in and of themselves.
Focus on a Single Meaningful Thing
As you consider how to integrate your product or service with devices that either entirely or primarily rely on voice as an input and output, try to focus on a single meaningful interaction that makes sense in that context and expand from there.
Reminiscent of the early days of mobile apps, it seems that many brands are putting out Alexa skills because they feel like they need to have something in this new space—but they haven’t put out anything thoughtful that keeps users coming back.
Find a key meaningful way that your customers can interact with you via their home assistant or mobile assistant and focus on making that interaction as useful as possible. You can add more from there.
Don't Try to Replicate Screen-based Information
The act of searching for information will need to change in a primarily or entirely voice-based interaction. Search needs to be considered differently when voice is the only output option. Reading a list of results that would normally display on a screen doesn’t allow users to scan options, determine if the search reached the target suite of options, and choose one of the options.
In a spoken context, initial search results must be focused and finite, so the user can understand the options and explore in more depth if the initial suite doesn’t match their expectations.
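One way to picture “focused and finite” spoken results is to read out only a small batch at a time and let the listener ask for more. The following is a minimal sketch under stated assumptions, not an Alexa Skills Kit API: the function name `spoken_page`, the batch size of three, and the prompt wording are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def spoken_page(results: List[str], start: int = 0, page_size: int = 3) -> Tuple[str, int]:
    """Build a short spoken prompt for one batch of results.

    Returns the prompt text and the index where the next batch would start.
    The default page_size of 3 is an assumption, not a researched limit.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")

    batch = results[start:start + page_size]
    if not batch:
        return "I didn't find any more options.", start

    # Number the options so the listener can answer with a number
    # instead of having to repeat a full title back.
    numbered = [f"{i}, {name}" for i, name in enumerate(batch, start=1)]
    prompt = "I found " + "; ".join(numbered) + "."

    remaining = len(results) - (start + len(batch))
    if remaining > 0:
        more = "1 more option" if remaining == 1 else f"{remaining} more options"
        prompt += f" I have {more}. Say a number, or say 'more'."
    else:
        prompt += " Say a number to choose."
    return prompt, start + len(batch)
```

For example, calling `spoken_page(["chicken soup", "lentil soup", "tomato soup", "miso soup"])` reads out the first three recipes and offers the fourth only if the listener asks for more, rather than reciting the whole list at once.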
Don't Forget: Confirmation Is Key
Confirming users’ inputs seems to be one of the missing components in the overall interaction. Adding a confirmation when making a request gives the user the opportunity to adjust if Alexa did not hear the request correctly. Doing this well would require that the confirmation follows the natural cadence of speech and allows the user to course correct if the result doesn’t match their expectations.
As you consider if and how voice interaction fits into your product or service ecosystem, focus on how voice interaction provides utility and value over novelty.
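A confirm-and-correct exchange can be sketched as a small piece of dialog logic: the assistant repeats back what it heard, and a reply such as “sorry, I meant…” replaces only the misheard detail instead of forcing the user to start over. This is an illustrative sketch only. The names `PendingRequest` and `handle_reply`, the accepted phrases, and the response wording are assumptions, and nothing here reflects how Alexa actually handles requests.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Hypothetical phrase lists; a real assistant would need far broader coverage.
YES = {"yes", "yeah", "correct", "that's right"}
NO = {"no", "cancel", "stop"}
CORRECTION_PREFIXES = ("no, i meant ", "sorry, i meant ", "i meant ")


@dataclass(frozen=True)
class PendingRequest:
    template: str  # e.g. "setting a timer for {value}"
    value: str     # the detail the assistant believes it heard

    def confirmation(self) -> str:
        return f"OK, {self.template.format(value=self.value)}. Is that right?"


def handle_reply(req: PendingRequest, reply: str) -> Tuple[str, PendingRequest, str]:
    """Return (status, request, speech) for the user's reply to a confirmation."""
    text = reply.strip().rstrip(".!")
    lowered = text.lower()

    if lowered in YES:
        return "confirmed", req, "Done."

    # A correction swaps in the new detail and asks again, so the user
    # never has to repeat the whole request from the beginning.
    for prefix in CORRECTION_PREFIXES:
        if lowered.startswith(prefix):
            corrected = replace(req, value=text[len(prefix):].strip())
            return "pending", corrected, corrected.confirmation()

    if lowered in NO:
        return "cancelled", req, "OK, cancelled."

    return "pending", req, "Sorry, I didn't catch that. " + req.confirmation()
```

If the assistant heard “13 minutes” when the user said “30 minutes”, replying “Sorry, I meant 30 minutes” to the confirmation produces a new confirmation for 30 minutes, and a following “yes” completes the request.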
Just like the smartphone, voice interaction is evolving rapidly, and it will be used in ways that we haven’t yet imagined.
Nerdy research details
- Crux Collaborative’s seven staff members were each given an Amazon Echo Dot and a workbook with weekly feedback surveys and an incident log and asked to set up the device in their home.
- For three weeks, staff were asked to record their impressions of using the device by completing each weekly feedback survey and recording any details about specific interactions with the device that were unexpected or frustrating.
- Each staff member was interviewed one-on-one about their experience of interacting with the device.
- The Crux Collaborative staff met as a group and discussed opportunities to improve the experience of using a voice assistant and the best applications for voice-based interactions.
Our objectives: What did we set out to learn?
- In what circumstances and for which types of tasks is it easy and intuitive to interact with a device via voice?
- How and when is voice interaction successful?
- How and when does voice interaction fail?
- How does the device or Alexa skill (a skill is similar to an app, but for the Alexa platform) support the user as they interact when there are no visual cues available (e.g., progress bars)?
- How do people of different ages interact with the device?
- What are the primary requests people make of the device? (e.g., set timer, read flash briefing, look up fact, control smart home device)
- What are the most effective approaches to help users learn to interact with information via voice commands?
- How does our role as User Experience professionals change when we are designing voice-activated interfaces?