"Alexa, play Digger Digger"

David Kleeman

Bobby: “Alexa, play ‘Digger Digger’.”

Alexa: “You want to hear a station for porn detected. Porno hot chick amateur….”

Adults in the room: “No…no…NO NO NO ALEXA STOP!”

Ok, so maybe smart speakers aren’t yet entirely ready for kids to use them, as this exchange captured on YouTube demonstrates. Fortunately, the adults present remembered the magic words “Alexa, stop” before we (and little Bobby) found out just what XXX audio sounds like.

Still, Alexa, Siri, Cortana and other voice-controlled assistants are making their way into families’ homes at a rapid clip. Since fall 2017, the percentage of US children with access to a smart speaker has risen from a quarter or less to roughly half. Children’s ownership of their own device is rising more slowly, but their expectation of getting one in the near future has turned upward across age groups in the last half year.

Those who have them are using them more than they did six months ago, and expectation of purchase among those who don’t now have one has turned upward recently.

The penetration of voice control is actually likely higher, as many families haven’t yet realized that their smartphones, tablets, remote controls and even some toys can be controlled orally. Remember shortly after tablets and smartphones first came out, when toddlers tried to swipe everything from TV screens to magazines (and if it didn’t respond, it was “broken”)? We’re not far from children expecting every object they encounter to listen and respond to them.

Acknowledging that smart speakers and their kin are going to be ubiquitous very soon, what do we know at present about how children and families are using them, about their frustrations (“Noooo!!!!!”), and about what they’d like to be able to do by voice? Dubit has been tracking the devices in its global Trends surveys, as well as visiting homes for in-depth conversations with kids and parents.

We went in with some assumptions:

  • that smart speakers would fast become common in kids’ lives (check);
  • that parents would view voice favorably and not consider it “screen time” (semi-check);
  • that voice would become the great “rebundler” of atomized media that left children frustrated trying to locate their favorite content across devices, platforms and a bouquet of remote controls (no check…yet); and
  • that voice control would be intuitive for children (so far, a minus instead of a check).

Parents, Smart Speakers and “Screen Time”

Parents were, by and large, satisfied with having their children use Alexa and its kin. Our “screen time” hypothesis played out both in our own research and in mass media articles. One parent spoke to an AP reporter for a story titled Alexa, Read Me a Story: “When they hear without seeing, they have to make up visuals in their own heads…they have to be engaged and get more out of it.”

Yet, parents are anxious about some elements of smart speakers.

Parents of younger children felt the device didn’t understand their children’s speech. Some said it couldn’t decipher their regional accent, others noted that childish pitch and timbre skewed the result, and still more simply found that toddlers’ phrasing of their requests wasn’t aligned with the AI’s comprehension.

For parents of primary-school-age children and tweens, privacy and dependency came to the fore. One parent in our home interviews worried that using Alexa could blunt her child’s ability to research. More than one expressed concern about whether the microphone was always on and how data was used, citing stories of beyond-coincidental cross-platform advertising or content related to something spoken when the device was ostensibly off.

Many parents told us that they’re concerned about privacy, but not enough to stop using voice devices. There’s a long history in families and technology of concerned inaction: when the “V-Chip” was introduced, over 90% of parents wanted it installed in every television; when that happened, fewer than 10% actually activated the blocker.

Some parents do see an interesting potential for happenstance discovery in a crowded and often confusing media environment. A mother of two noted that “sometimes it goes wrong, but that’s how they accidentally discover something new to try.”

The Rebundler

Someday, the vision may come true that any single voice device will help children find the content they want across platforms. Google, Apple and Amazon are working together on a “Connected Home Over IP” alliance to minimize “silos.” For now, though, there are still considerable barriers to seamlessness.

Each brand of smart speaker has its unique purposes, strengths and weaknesses, and most are designed to keep users inside their own brand ecosystem. Alexa wants only to connect with Amazon’s music, video and shopping platforms. A family in Dubit’s research found that Apple HomePod was “basically useless without having AppleMusic.” Of the speakers we tested, Google Home was the most flexible.

Different brands, too, are programmed to respond to differently-worded commands, and there’s no user guide.

Is It Intuitive?

Related to the point just above, the biggest challenge for kids and smart speakers is that these are the first devices in two decades that children can’t “hack.” The smartphone and tablet weren’t designed for kids, but young people quickly figured out not only how to use them, but how to master and adapt them. Content creators saw children’s engagement and began designing products for kids that now flood the app stores. They also began an iterative dialogue with the young audience, watching how kids made apps their own and tried unanticipated interactions. Smart designers made those into features, instead of treating them as bugs.

Smart speakers, by contrast, give very little useful feedback when a child’s request or interaction goes awry. A reply of “I don’t understand” doesn’t help the user formulate a more on-target query, nor does it suggest other options, as search engines and children’s apps often do. The devices, too, use very literal semantics, so that a child asking “Play Peppa Pig again” will get “I can’t find ‘Peppa Pig again’.”

The growing number of speakers that also have a screen may be part of the solution, if there’s visual feedback to help users (at least those who can read) home in on a successful request.


Without really thinking about it, we’ll soon talk to all kinds of objects to get basic information – news, weather, reminders. At the same time, as smart speakers get wiser, we’ll use them more explicitly. They’ll become the go-to search engine, especially for young children who can’t yet spell, and an ever-ready “playmate” in both voice-only and digital/physical games, puzzles and adventures.

Voice will be a storyteller, a scorekeeper, a virtual opponent, the quizmaster, and an activator of toys and products. We’ll go beyond the current, rather literal uses (“Play ‘Digger, Digger’” or “tell me a joke”) and find the ways that speech and sound naturally integrate into a range of traditional play patterns. First, though, we need to work on those silo and intuitiveness challenges.

In a future post, we’ll look at what children and teens are doing with smart speakers, and where Dubit sees the most potential. There’s no question that these devices are here to stay, entering family life in both overt and subtle ways.

The slides in this article are drawn from a presentation given by Dubit at Kidscreen 2020. If you would like the full deck, please send a request to david.kleeman@dubitlimited.com.
