Since the time I was a child the idea of robotics and automation has appealed to me. When digital assistants started to hit the scene I was stoked. I imagined myself controlling my home like Tony Stark with Jarvis. I tried to hold out for a good Siri solution, knowing that Apple makes quality products and that they are generally more privacy focused. But $400 is way too high for a smart speaker and Siri, bless her heart, is way behind the rest of the class in terms of intelligence. I ended up going with Amazon’s Alexa and have really enjoyed the experience.
At this point my investment in the Amazon Echo ecosystem is not insignificant. We own the full-size Echo 2, an Echo Spot (the alarm clock one), and the 2nd generation Echo Dot (the hockey puck). I even had a second Dot, but have since gifted that to my in-laws. Added up, that’s a decent dollar investment (though it still hasn’t reached the price of one Apple HomePod). With these devices I’ve been able to have music that plays throughout our house, control a number of lights by voice, have a simple intercom system, and more. An additional perk is that it lets my kids easily do things like turn on the lights in the scary basement from upstairs.
All of that to say, I have good reason to consider the privacy implications of digital assistants. And it could also be painful for me should I need to make a change. So you know I’m taking this seriously.
What Is The Worry?
After the initial rush of excitement surrounding digital assistants and their potential, a number of years have passed and important questions have come up in the meantime. Many people ask: if something is always listening for me to speak, is it recording everything I say? Secondly, what is done with the recordings once the tech company of your choice has them? How long does it keep them? Perhaps most important recently is the question of human interaction. We always assumed that only machines interacted with our recordings, but it turns out that people do as well.
Some of these concerns are valid and some simply aren’t. Let’s start by taking a look at how these devices work, and that will help us separate out truth from myth.
How It Works
Honestly, it’s really very simple. In the case of smart speakers, or anything that uses a wake word (Alexa, Hey Siri, Ok Google), the device listens to everything said around it. This happens locally on the device, and it completely ignores anything other than the wake word. When the wake word is recognized, that is when it reaches out to the company’s servers. Any sound during the listening period that follows is recorded and sent up to these cloud servers. That is where all of the smarts are (which is why the devices can have such low-power hardware). Using sophisticated language processing, machine learning, etc., their systems determine what you’re saying and how best to respond. Any number of things are triggered from there, and whatever you requested happens. That’s it. To summarize: wake word, snippet sent to cloud, action taken. The companies then take a small percentage of the millions of recordings they receive, somewhat disassociate them from user accounts, and have people review them to make their language processing smarter.
There are a few key takeaways from this knowledge:
- The device dumps EVERYTHING it hears unless the wake word is registered. It is NOT recording everything you say.
- The request is processed by machines. No human interaction is involved.
- Though people are involved in training the system to be smarter, and that allows them to hear recordings, they hear only an astoundingly small portion, and your user account is partially decoupled from the recordings at that point.
- It’s worth noting that Apple is more privacy conscious with its operations, such as tying recordings to a random identifier rather than your user account and doing as many operations locally as possible rather than in the cloud.
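The wake-word flow described above can be sketched in a few lines of Python. This is only a toy model to illustrate the idea, not how Amazon, Apple, or Google actually implement it: each "frame" is a word standing in for a chunk of audio, an empty string stands for silence, and the names `WAKE_WORD`, `MAX_REQUEST_FRAMES`, and `snippets_sent_to_cloud` are made up for this example.

```python
from typing import Iterable, List

WAKE_WORD = "alexa"       # hypothetical wake word for this sketch
MAX_REQUEST_FRAMES = 5    # assumed cap on how long the device listens after waking


def snippets_sent_to_cloud(frames: Iterable[str]) -> List[List[str]]:
    """Return the snippets that would leave the device in this toy model.

    Each frame is a stand-in for a short chunk of audio, written as a
    lowercase word. An empty string stands for silence.
    """
    uploads: List[List[str]] = []
    current: List[str] = []
    listening = False

    for frame in frames:
        if not listening:
            # Local-only check: anything that is not the wake word is dropped.
            if frame == WAKE_WORD:
                listening = True
                current = []
            continue

        if frame == "":
            # Silence ends the request; upload whatever was captured.
            if current:
                uploads.append(current)
            listening = False
            continue

        # Awake: this frame becomes part of the snippet.
        current.append(frame)
        if len(current) >= MAX_REQUEST_FRAMES:
            uploads.append(current)
            listening = False

    # The stream ended while the device was still listening.
    if listening and current:
        uploads.append(current)
    return uploads


heard = ["my", "pin", "is", "1234", "", "alexa", "turn", "on", "lights", "", "secret"]
print(snippets_sent_to_cloud(heard))  # [['turn', 'on', 'lights']]
```

In this model, everything said before the wake word and after the request ends never leaves the device. It also shows the catch: if the local check mistakes some other sound for the wake word, whatever follows gets captured and uploaded just like a real request.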
What Concerns Remain?
Having said that, it does not mean there aren’t real concerns. In our experience the devices can wake at very random times when we definitely did not say “Alexa”. At first this is just annoying (and a little creepy). The more you dwell on it, though, the more you start to wonder exactly what it’s picking up. Some have raised the point that personally identifiable information could be overheard, or bank numbers, etc. Sometimes I wonder if it picks up my children’s conversations. Not that they’re saying anything nefarious, but I don’t really want them recorded. Also, you just never know how anything you say will be taken if heard out of context. In today’s society people get really worked up about opinions they don’t agree with. Who’s to say it won’t overhear a conversation that isn’t politically correct, and then the workers might tie it back to you? What if something is acceptable now but in a decade is practically a thought crime (token 1984 reference)? One should always be careful of what they say, but this adds newer and potentially more dangerous considerations. Words you barely thought about almost instantly become data that is globally distributed and perhaps perpetually retained.
Most companies have built mechanisms to let you delete recordings from your account, but that doesn’t mean they’re completely purged. In the end it’s really up to them how long they’re going to keep the data and what uses they’ll have for it. The recording is on their servers and out of your hands.
After the outcry over having people listen to recordings most of those programs have been suspended. I can’t imagine that will last, though. I’m honestly not certain you can properly train the system without sampling recordings. That’s just the nature of how this technology works. They should have disclosed this better and potentially had their employees act more professionally (like not passing around amusing recordings), but it’s simply a reality.
Considerations For Balance
How much all of this matters to you is going to be a personal choice. Everyone has different thresholds for what they consider to be private conversation and how much they care about being overheard. In many areas of life my wife is my touch point to reality. I can get very lost in the internal academic debate and become completely disconnected from the real world. While wrestling with these subjects in regard to our own home I asked if it bothered her, and she responded something to the effect of “not even once”. Just now I asked her what she thought we should do and she said “I don’t even care”.
Another thing to consider is how deep down the rabbit hole you want to go. If you’re worried about devices listening to you without your permission, stop to consider the ones that are already part of your daily life. Every cell phone (smart or not), laptop, smart watch, and many desktops have mics in them. A bad actor could activate any one of those without your knowledge. It’s been done before with webcams. Facebook, Google, and likely the government already have an astounding amount of information on you from multiple sources, verbal or not. And you could just mute or turn off the devices should you need to have a private conversation.
When you put everything together it’s a balance of obtaining the functionality you want vs the information you volunteer. For me that means, as painful as it is, Alexa and I will have to part ways. I really didn’t want that to be the case, and I even wrote up this article originally stating that I was going to keep her. But that didn’t sit well, and here are the main reasons why:
- Conversations unintentionally overheard: There is no way around it, these devices often think you’re speaking to them when you’re not. That leads to us being recorded when our guard is down, and as stated before the information is then out of our hands and within Amazon’s control. For whatever reason this seems to happen much less with Siri, so I’m not as concerned about our phones, etc. having it enabled.
- People reviewing recordings: Only Apple’s system truly decouples your recordings from your personal information, so theirs is the only one I’m comfortable with in this regard.
- There is no feature we can’t live without: We did a week trial without the devices and there was surprisingly little impact. A few smart home tasks are more annoying to do with the phone or watch, but overall life went on pretty much the same. Having the Jarvis effect is fun, but not something I’m willing to trade our privacy for.
- No peace of mind: I mentioned this in the web browser discussion too, but peace of mind is very valuable. No matter how much I run over the facts and come to terms with them, something in the back of my mind was never comfortable with Alexa. Being without that internal battle during our weeklong test was refreshing, and a part of me has known since starting out with Alexa that something felt off about it. Maybe it’s paranoia, or maybe it’s instinct. Only time will tell, I suppose.
So going forward the only digital assistant in our lives will be Siri. She’s not the smartest, and the Apple HomePod is absurdly expensive. If they go on sale or Apple produces a lower-cost Echo Dot competitor then I’ll jump on board. In the meantime we just use our phones or watches for the same tasks. And if worse comes to worst, we walk across the room and hit a physical button. Turns out that’s still a completely viable option.
A few weeks have gone by since I posted this article and I’ve now changed my position on our Alexa devices. This is because of a couple of key reasons. First, Amazon will let you opt out of having humans review your recordings. Second, I thoroughly reviewed my own article above and concluded that my decision leaned more on the paranoid side than the cautious.
TLDR, all of the information above is still true and valid. But the value vs risk index tips in favor of us keeping Alexa rather than dismissing her, especially as Amazon comes under increasing pressure to make sure she guarantees our privacy. As noted above, my family has received a ton of benefit from using these devices (my children have literally been begging me to put them back). We feel comfortable with the way the technology works and the benefits we receive from using it.