On this episode of Build, host Maggie Crowley dives deep into conversation design with Google’s Head of Conversation Design Outreach, Cathy Pearl.
So what is conversation design exactly? Think teaching computers to communicate like humans (not robots) across voice interfaces and via typing, swiping and tapping. Cathy shares why now is the time to invest in conversation design and where and how to get started. Tune in for more from Maggie and Cathy.
In This Episode
0:12 – Cathy Pearl, Head of Conversation Design Outreach at Google
0:30 – How Cathy got into this field
2:12 – Smart speakers and a boost of energy
3:23 – What is conversation design?
3:42 – Teaching computers to communicate like people, not the other way around
5:46 – Problem of brushing past design
6:07 – Grice’s Maxims
8:11 – Customer frustration in conversations
8:51 – Proper communication with the user and setting the stage
9:37 – On feeling understood and the problem of loops
10:40 – How should people use conversation design?
12:28 – What makes a good conversation
13:35 – Don’t be fooled by the number of turns
14:19 – No matches or no input
15:14 – Dropout funnel and retention
16:13 – What does the future look like?
17:34 – Silent speech
18:44 – Personalization
19:21 – Career lessons Cathy has learned
21:41 – The value of openness in one’s career
Maggie Crowley: So we’re back. Welcome to Build. This is Maggie. Today I’m really excited, selfishly, because I have Cathy Pearl, the Head of Conversation Design Outreach at Google. Cathy has a deep background in user experience. She was even a software engineer at NASA, and I’m super excited to get her take on the emerging space of Conversation Design as someone who works in that space as well. So Cathy, welcome.
Cathy Pearl: Thank you.
Maggie: So I just wanted to start by, take us through how you got into Conversation Design and how you figured out that that was even a career that you could have.
Cathy: Conversation Design was definitely not a thing when I was young or even when I was in college. I was very interested in computers when I was young and we got our first family computer and I learned how to program. And I was always really interested in trying to get the computer to talk back to me. In college, I majored in cognitive science, which I didn’t know at the time, but that turned out to be a great background for a Voice User Interface Designer because fundamentally, cognitive science is about studying how humans work. It had psychology, neuroscience, linguistics, artificial intelligence, and all those things really play into being a good Conversation Designer. In graduate school, I majored in computer science and I took my first HCI class, Human Computer Interaction, which really opened my eyes to the idea that rather than just focusing on what cool technology can we build, and then, well, if the person has to read a 50 page manual to use it, so be it, it was more about how can we create technologies that already work in a way that somebody might naturally expect?
So something like Voice User Interface fits in really well with that because we already all know how to talk. And then my first job in the industry was at a place called Nuance Communications, where they were the first company to create IVRs, phone systems where you could call the computer and get stock quotes. I spent about eight years there really learning the fundamentals of how to create these conversational systems, but I really grew kind of disillusioned with the technology because so often we were told by our clients, human operators are very expensive, so your role is to keep people away from speaking to our operators. And I was thinking of really leaving the space entirely, but I was really re-energized when smart speakers came out and I realized there was this very fundamentally different use case in how we could interact with this technology. And I’ve gotten excited about it again-
Maggie: Smart speakers being Amazon Alexa?
Cathy: That was the first one. Right. Yeah.
Maggie: One thing that I think is really interesting is that when I think about the phone-based ones, we think about those awful airline things that you call into and you just find yourself saying operator 100 times until someone finally picks up the phone, or I always just press zero until someone actually talks to me because I feel like they’re so frustrating to work with.
Cathy: Yeah. And that was absolutely true for a lot of them. That being said, there were a few that I worked on that I still feel really proud of. My favorite was one in the San Francisco Bay Area called 511 where you could call 511 to get traffic information and public transit information. And this came out before smartphones. So if you were on the road, you had no way of looking at traffic and things like that. So it was offering something that you couldn’t do in another way. It was more of a value add and that’s something that still gets millions of calls and that I’m very proud of.
Maggie: Right. So then things like Amazon Alexa came around and that sort of reinvigorated your interest in this space and what we could do with it?
Maggie: Okay. So then what is Conversation Design and sort of what does that industry even really look like? I think I’m familiar with the side of making bots and making conversation flows in that manner, but when you think about Conversation Design in general, sort of what are the principles of that?
Cathy: Conversation Design, I mean, the key component is really teaching computers how to communicate like humans and not the other way around. And communication, by the way, we often talk about Voice User Interfaces, but it includes typing and even includes potentially swiping and tapping because all those things are part of this back and forth turn-taking you might be having with the computer, and Conversation Design is the design practice for this. Just like at Google, we have Material Design for building visual technology. And I like to think about when websites first came out and anyone could create a GeoCities website and have flashing construction signs and all that kind of stuff. Nowadays, if you’re a company and you want to have a great website, you’re going to hire a professional web designer. And just like that, in this relatively new space of building conversational experiences, people really need to invest in hiring a conversation designer or training someone to have that skillset because it really is a skill. What’s happening right now is that everyone’s very excited about this technology and kind of going crazy with building things, but a lot of times, you just throw a couple of developers at it, try and get something out the door, and people don’t use it or it’s frustrating.
And often this is because best practices for design were not followed, and it becomes a frustrating or unsatisfying experience for the user.
Maggie: Right. Because I think it sounds like one of those things that, because like you mentioned earlier, we all know how to talk, most people do, you think that you would be able to do it. If you sat down and you said to yourself, “Okay, I need to design this conversation,” you might not even think about it from a design perspective. You might just say, “Oh, I need to figure out what my bot’s going to say from our side.” And then of course you should be able to do that because you’re a human and you talk to other humans all the time. So how hard could it be?
Cathy: Exactly. And there’s this great myth that’s always been the case with designing these conversational experiences where you think, “Well, I know how to have a conversation, so how hard can it be?” But just the other day, I was playing a game on my Google Home, an adventure game, and it’s a pretty popular one. And it asked me a yes/no question. And I said, “Sure.” And it didn’t understand me. And it surprises me that we’re still encountering things like that. And again, I think it’s because of this fundamental issue where design, as in many cases, is often brushed past, like, “Oh, we’ll fix the words later.” But Conversation Design isn’t just about sprinkling in some words, it’s about the whole flow and construction and how do we build it so that people know what to say.
For example, you were talking about human communication. We think a lot about how humans communicate when we build these things. We think about things like Grice’s maxims. So Paul Grice came up with this cooperative principle. This was in the ’70s, and this was about human conversations: typically, in general, when I’m having a conversation with somebody, I’m trying to be cooperative. That is, I’m trying to understand what they’re saying and I’m trying to speak in a way that makes sense to them. He has maxims like the maxim of quantity, meaning don’t say too much, don’t say too little. But the thing about human conversation is that we have so many nuances that the computer doesn’t have. For example, we have body language, eye gaze, pauses, and in the current technology, we have a much narrower channel of signal from the person who’s speaking.
Think about something as simple as pauses. So we’re having a conversation. The pause between our turns is super short. It’s like the blink of an eye, like 200 milliseconds. If I ask you a question and you pause for longer than a second, that’s a signal to me. So for example, if I said, “Hey, can you give me a ride tomorrow?” and more than a second ticks by, which isn’t that much time, I’m like, I don’t think they want to give me a ride. That’s the kind of signal that we can’t necessarily capture in today’s current technology. So therefore, we have to be even more careful about how we guide it. For example, every time the computer’s turn finishes, we need to have a call to action, by giving the user either an instruction or a question. So saying, “Tell me your favorite color,” or “What’s your favorite color?”
And then it’s really clear to the user, oh, it’s my turn to talk. And one of the things I see people do all the time on Voice User Interfaces is they’ll ask a question and then give some examples. So they might say, “What’s your favorite color? You can say yellow, red, or green.” And that messes people up because they start to answer the question and then the computer keeps talking and they stop talking and it’s this messy back and forth. So again, these are the kinds of things that a Conversation Designer thinks about and those are going to be issues that they have to worry about when creating these designs.
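The turn-taking point above (end the computer’s turn on the call to action, and don’t keep talking after the question) can be checked mechanically. What follows is a minimal sketch, not anything from Google’s tooling: the function names are hypothetical and the sentence splitter is deliberately naive.

```python
import re


def split_sentences(prompt: str) -> list[str]:
    # Naive splitter: break on whitespace that follows ., ! or ?
    parts = re.split(r"(?<=[.!?])\s+", prompt.strip())
    return [p for p in parts if p]


def trailing_text_after_question(prompt: str) -> list[str]:
    """Return any sentences the system keeps saying after its last question.

    An empty list means the question is the final thing the user hears,
    so it is clear that it is their turn to talk.
    """
    sentences = split_sentences(prompt)
    last_question = max(
        (i for i, s in enumerate(sentences) if s.endswith("?")),
        default=None,
    )
    if last_question is None:
        return []  # no question in this prompt, nothing to flag
    return sentences[last_question + 1:]


risky = "What's your favorite color? You can say yellow, red, or green."
safer = "You can say yellow, red, or green. What's your favorite color?"

print(trailing_text_after_question(risky))
print(trailing_text_after_question(safer))
```

A check like this only catches the ordering problem. It says nothing about whether the examples themselves are worth including, which is still a design decision.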
Maggie: Right. And I think one of the things that I’ve come across that I was really surprised by when I first started seeing this industry was how frustrated customers and just people in general can be when their expectations aren’t properly set at the start of one of these conversations. Right? So if you don’t know that you’re talking to a bot or you expect that whatever it is that you’re talking to works in a certain way and it doesn’t, especially if there’s no visual component, I’m sure it’s probably maddening to try to communicate.
Cathy: Yeah. That is one of the most challenging things as well. So as you said, one of the things that’s really important, whether you’re doing a text bot or a Voice Interface, is to set the stage, to set the expectations, because if you’re building an action or a chatbot or something, it will only be able to do a limited set of things, and that’s okay as long as you’re able to communicate that to the user. So if you start with some open-ended prompt like, “I’m the such and such bot, how can I help you?” you’re in for a world of hurt because people have no idea what they can and can’t say. So you have to really set the stage and kind of outline, these are the types of things that I can help you with.
And another thing I always tell people is that one of the most important things for people when we’re talking to one another is to feel understood and acknowledged. So let’s say you built a hotel chatbot for booking hotel rooms. And a lot of people are saying, “Well, I want to book a car.” If you just say, “Well, I don’t understand,” that’s super frustrating to people, but you could say, “Oh sorry, I can’t book cars yet. I can book hotels.” People like that a lot better than just hearing “I don’t understand, I don’t understand.” You really want to feel like you’re at least being understood even if you can’t have your problem solved right at that moment.
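The hotel example can be sketched as a fallback that tells apart “I recognize that request but can’t do it yet” from “I didn’t catch that.” This is a minimal illustration using simple keyword matching. The keyword tables, the `respond` function, and the exact reply wording are all assumptions made for the sketch, not part of any real bot framework.

```python
import re

# Hypothetical keyword tables for a hotel-booking bot.
CAN_BOOK = {"hotel", "hotels", "room", "rooms"}
CANNOT_BOOK_YET = {"car": "cars", "cars": "cars", "flight": "flights", "flights": "flights"}


def respond(utterance: str) -> str:
    # Lowercase and keep alphabetic words only, so "car." matches "car".
    words = re.findall(r"[a-z]+", utterance.lower())

    if any(word in CAN_BOOK for word in words):
        return "Sure, let's book a hotel. What city are you staying in?"

    for word in words:
        if word in CANNOT_BOOK_YET:
            # Acknowledge what the user asked for, then restate what the bot can do.
            return f"Sorry, I can't book {CANNOT_BOOK_YET[word]} yet. I can book hotels."

    # True no match: still restate the scope instead of a bare "I don't understand."
    return "Sorry, I didn't catch that. I can book hotels. What city are you staying in?"


print(respond("Well, I want to book a car."))
```

In a real system the recognized-but-unsupported requests would come from looking at what users actually say, which is the same data that drives the no-match analysis discussed later in the episode.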
Maggie: Right. And I’ve also seen people get frustrated when they don’t know how to get out of it. So they get kind of stuck in a bot flow, or someone will try to build a flow and they’ll put in loops because they just want the customer to kind of get stuck in there, doing whatever they want the bot to do, qualify or whatever it does, and then the customer’s like, “But I don’t want to do that. I want to do something else, but I can’t. I can’t find that edge of the bot where it tells me how to get out of it.”
Cathy: Exactly. Yeah. I was at the gas station yesterday. This was not a chatbot, but I accidentally hit the wrong button and it wanted me to enter an amount and there was no way I could cancel or get out of it and I was so frustrated and just like you said, with chatbots, the same thing. You get into this place and you don’t let the user extract themselves gracefully to move on with things.
Maggie: Right. So another question I want to ask is, I think this probably applies to lots of these sort of new shiny technologies that are coming out and everyone kind of like AI, they want to use AI, they want to put it on their website that they have it. How do you think people should use Conversation Design and where they should put conversations versus where it’s not helpful?
Cathy: You bring up a really good point. I think right now AI is this buzzword and everyone thinks you have to have AI to have a successful conversational system, which is certainly something to strive for, and if we want to have truly generalized chat situations, of course you’ll need that. But I think some people forget that you can have very effective, important conversational systems without a lot of AI. And the way you do that is you just decide on your goals very carefully and you pick a particular domain or a particular task and you just make sure that that particular task is done really well. And one way we like to advise people is, if you’re thinking about, should I build a conversational system or voice system for what I’m trying to do? One thing to think about is, is this something people naturally have conversations about already? Like do I call a concierge or a bank clerk, or could I have a conversation about it with another person? That’s probably a good fit.
Think about efficiency. If I do this through voice, is it actually faster than, say, picking up my phone and tapping on things? If it’s not faster, why would people want to do it? Another thing I think people often forget about is the complexities surrounding getting to the use case. By that I mean, let’s say you want to help them with banking or with their car or something like that, but they have to log in somehow. You have to spend time thinking about how can we make that a seamless experience for the user, because if the logging in portion is so complicated, it doesn’t matter how great your action or whatever is because it’ll be too hard for people to get to. So a lot of different aspects like that go into it.
Maggie: Right. So then, okay, we’ve figured out the right place to put this conversation or use conversations in a product, and this is something we were kind of talking about before. How do you know what makes a good conversation? How do you sort of measure that? I think when we first started talking about it, the way was to think, well, you reached the end, that’s a good conversation because you’ve got to the end of the conversation, but the more I learn about it, the more nuanced I think this is. And so what’s your take on what the characteristics of a good conversation are?
Cathy: I think that’s a great question. That’s not been 100% decided, but there are certainly things we can think about specifically when looking at successes and failures of conversational systems. One of course is, did the user achieve their intended goal? And the goal can be such a variety of things. It could be a one-off like turn on the light, play a song. It could be a more complex back and forth about looking for a hotel room in New Orleans or something. It could even be chitchat. That could be a goal. Like I want to have a short little pleasant conversation with the system. So truly understanding the goal and not being too vague about the goals is very important. But of course, in addition to whether they achieved their goal, was the process pleasant and efficient? So maybe I booked my hotel, but boy, I was pulling my hair out. That’s not great.
Another thing I want to caution people about is not to be fooled by number of turns. Back in the phone system world we often looked at, what was the amount of time someone spent in the system? But number of turns does not necessarily reflect a good or a bad experience. A lot of people I think are focused on, we have to keep it as short as possible and that’s the best. And that’s not necessarily true. Sometimes breaking down a complicated question into two questions is actually a better experience. And what we’ve found is that as long as people feel like they’re getting somewhere towards their goal and that the questions are not irrelevant or repetitive, people are happy to have longer conversations. And so again, just looking at number of turns is not necessarily the right way to go, but in terms of really specific metrics, two things we absolutely look at are no matches and no inputs.
So no matches are when we get unexpected user responses, something we couldn’t handle, and it’s really important to see, well, how many of those are you having and where in the conversation are those occurring? It could be because your grammar coverage isn’t good enough. Like the example I gave earlier when I said “sure” instead of “yes,” that’s a very obvious one, but it could be just even thinking about how many different ways people can set an alarm. I might say, set a timer for 8:00 AM tomorrow morning. I didn’t even say alarm, right? But it should be able to understand that. So really spending time looking at all those no matches and figuring out, is it grammar coverage, is the prompt confusing, is the flow confusing, all those kinds of things. And similarly no inputs, which is when someone doesn’t say anything. If you have a particular place where there’s a lot of no inputs, you’re probably asking maybe a confusing question, like maybe you’re asking them about information they don’t have easy access to or something like that. So it’s a good place to look at.
There’s also the dropout funnel. Like let’s say you’re building a conversational experience like a symptom checker. You expect it to be 15 questions and people are often dropping out at like question eight or nine. Obviously you’ve got to dig in there and figure out what’s going on. Why is there a dropout rate? And then retention, although again, I caution people about retention. Is retention relevant to your particular thing? It could be something that someone only does once and that was a success, or are you building something where you expect people to come back? And so that could be another thing to look at.
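The dropout funnel for something like the 15-question symptom checker can be computed from very little data. The sketch below assumes you have logged, for each session, the number of the last question the user reached. That log format and both function names are assumptions for the example.

```python
def dropout_funnel(last_question_reached: list[int], total_questions: int) -> list[int]:
    """Return how many sessions reached each question, from question 1 to the last."""
    reached = [0] * total_questions
    for last in last_question_reached:
        # A session that got to question `last` also reached every earlier question.
        for question in range(1, min(last, total_questions) + 1):
            reached[question - 1] += 1
    return reached


def biggest_drop(reached: list[int]) -> tuple[int, int]:
    """Return (question_number, sessions_lost) for the largest loss between
    one question and the next. Ties go to the later question."""
    drops = [(reached[i] - reached[i + 1], i + 1) for i in range(len(reached) - 1)]
    lost, question = max(drops)
    return question, lost


# Six hypothetical sessions: two finished, the rest quit at questions 3, 8, 8 and 9.
sessions = [15, 15, 8, 8, 9, 3]
reached = dropout_funnel(sessions, total_questions=15)

print(reached)
print(biggest_drop(reached))
```

The question number it returns is where to start digging, not an explanation. The cause could be the wording of that question, the one right after it, or simple fatigue.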
Maggie: Yeah, I think there’s really interesting parallels between what you’re mentioning and the way that people have started to approach onboarding flows and activation flows into products because I think there was that original hypothesis that was the faster, the shorter the better, but then similar to the conversation, if you break it down into smaller pieces and you show progress, I’ve seen those examples where you actually get more people through your flow because you’ve made it simple. They sort of get invested and they move further down that flow.
Cathy: Yeah, that makes sense.
Maggie: Yeah, so then what does the future look like as someone who’s in the center of Conversation and Conversation Design? Obviously there’s lots of new interfaces coming out. There’s the Google Home, there’s Amazon Alexa, so where are we going? What’s going to be the next big thing that comes out in Conversations?
Cathy: Of course, this is just my personal speculation, but I just read this really great report that came out from AnswerLab. They did a diary study of people who have smart speakers. And one of the things they mentioned is that as users have these digital assistants over time, they want more fun and engaging conversation. So I think we started out very utilitarian, like set a timer and play music. That’s all we did, and more and more things have been added, but I think as time goes on, people are going to be looking to have longer and more engaging conversations about a variety of topics. Look at something like Xiaoice, which is a very popular chatbot in China, and some people have 20 minute conversations with that thing. So I think that speaks to a need, that we want to have a little bit more fun and engaging experiences.
Another thing I think will happen is conversations in more places. So a lot of people have smart speakers sitting in their kitchen or in their home, but I think there’ll be more places where maybe you’re at the grocery store or at the mall and you’re talking to a kiosk or the shelf to ask questions and things like that. So there’ll be more places available to you to have these conversations. But as part of that, one of the technologies that I’m really interested in right now is this idea of silent speech, which is where you can speak, but no sound is coming out of your mouth. You’re kind of making the movements without actually vocalizing.
Maggie: Oh, interesting.
Cathy: Some places like MIT Media Lab and NASA are working on this technology. Basically it uses sensors along your jawbone to get the bone conduction to understand what you’re saying. And it’s still in early stages, but I think this will be critical for voice to become ubiquitous because a lot of us don’t want to be talking out loud to our computers when we’re out and about. I mean, sometimes it’s because we feel silly, sometimes it’s because it’s private information like my health information, but it’s also just practical. If I’m in the office in a shared space and we’re all talking to our computers, that’s really annoying.
Maggie: So I’m just imagining the example you brought up of the grocery aisle just sort of late at night, walking down an aisle, sort of talking to a shelf and having some sort of like existential crisis.
Cathy: Exactly. So I think technologies like that will come about that make it more natural. People will be more likely to use speech in public when things like that happen. And one more thing I’ll say is I think there’ll be more personalization. So for example, right now, if I ask to play the song Last Christmas, I want it to know that I want the Wham! version and not the Carly Rae Jepsen version. With my permission, it should know a few things about me like that, which will just make it a little more frictionless and more enjoyable.
Maggie: Right. Awesome. Well, this has been super interesting. I really appreciate you coming on and taking some time, but before I let you go, I have a couple more questions. Just curious about over your career, you’ve gone from being a software engineer to being in design and now you’re sort of focused on this Conversational Design Outreach. What are some lessons that you learned sort of over the arc of your career and where you are right now that you think might be helpful to people who are just getting started in product?
Cathy: Oh gosh, that is a big question. I tried to steal the secrets from everyone I’ve talked to, to see what I can get for myself. I think for me … Gosh, I’m not sure if this is what you’re asking, but at a personal level, for me, I kind of hit this crisis point in my career where I had taken some time off when my son was born and I was in my 40s and I felt that I was kind of done with my interesting career and I thought, well, no one’s going to … I hadn’t worked full time in like seven years and I just thought, no one’s going to hire me again. I’m not going to have an interesting career again. And I was so wrong, but I really felt like I was in that pit, and I couldn’t see out of it.
And I think maybe being able to recognize that nowadays, people’s careers often go through many changes. I started out as a software engineer and after a few years I kind of realized that, I love programming, but I was more interested in the design side of things, and realized that that could be a career. I think what’s happened for me is that a lot of times I haven’t even realized what is going to be out there in the future in terms of interesting things to do, and just sort of keeping an open mind and trying some things that really scared me or I thought … like writing my book, when that was happening, I just thought, I don’t know, I can’t do that. I’m working full time at a startup and I’m a parent and there’s no way I can do that. But I said yes anyway. And it was hard for sure, but I got through it. And again, everyone’s circumstances are different and there are reasons you can’t always say yes to things, but for me, saying yes to a bunch of stuff that kind of scared me ended up really being beneficial and opening doors that I didn’t even know were there, and resulted for me in some really nice opportunities in my career.
Maggie: Awesome. Yeah, I love that advice. I think I had a conversation with someone who’s probably six months out of Undergrad today and he was asking, how do I set better goals for my career? How do I know where I should be in the next however many years? And I think what I’m hearing over and over from people is that it’s really hard to plan that out. And what you have to do is be open and like you’re saying, say yes and sort of just be open to new opportunities and take risks, especially when they scare you, because that’s usually a good sign that it’s going to challenge you and help you grow and learn.
Cathy: Yeah. And on top of that I would say, and again, totally, I know different people have different circumstances and not everyone can change jobs and things like that. But I think sometimes you’re in a particular role or something and you think, this is where I have to be, it’s safe or whatever, and maybe you’re happy, and you don’t see another opportunity. And so sometimes being willing to maybe take the leap and try a different job, which may or may not be better than the job you have, but being willing to maybe try something, because nothing is permanent these days. It’s nice to experience different things that kind of help you come to realize what are the things you enjoy most about a job, and if you have the opportunity to be choosy, it’s nice.
Maggie: Awesome. Well, thank you Cathy. I really appreciate you coming on this episode. Everyone who’s listening, please give Cathy a shout-out in the reviews. Obviously five stars only, and thanks for coming on the show.
Cathy: Thanks so much for having me.