10 days ago
Intellectual property as a concept ultimately stifles progress every time it’s been tried. Information wants to be free, and we prosper far more when we accept that reality.
Everyone should read Against Intellectual Monopoly by Michele Boldrin and David K. Levine. It’s on David’s website, Internet Archive, Anna’s Archive, and various bookstores. Feel free to buy or print some copies and distribute them to your favorite people, libraries, bookstores, and congress critters~
It’s a technique called Keyword Spotting (KWS). https://en.wikipedia.org/wiki/Keyword_spotting
This uses a tiny speech recognition model that’s trained on very specific words or phrases which are (usually) distinct from general conversation. Because the model is so small, it’s extremely efficient even before any optimization steps like quantization, and it needs very little computation to process the audio stream and detect whether the keyword has been spoken. Here’s a 2021 paper where a team of researchers optimized a KWS system to use just 251 µJ (about 0.00007 milliwatt-hours) per inference: https://arxiv.org/pdf/2111.04988
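If you want to sanity-check that unit conversion, here’s a quick sketch in Python. The 251 µJ figure is the one quoted above; the one-inference-per-second rate and the 15 Wh battery are assumptions I picked purely for illustration, not numbers from the paper.

```python
# Convert the per-inference energy to milliwatt-hours and estimate daily use.
JOULES_PER_MWH = 3.6  # 1 mWh = 0.001 W * 3600 s = 3.6 J


def microjoules_to_mwh(uj: float) -> float:
    """Convert microjoules to milliwatt-hours."""
    return uj * 1e-6 / JOULES_PER_MWH


ENERGY_PER_INFERENCE_UJ = 251.0  # figure quoted from the linked paper
per_inference_mwh = microjoules_to_mwh(ENERGY_PER_INFERENCE_UJ)

# Assumptions for illustration only (not from the paper):
INFERENCES_PER_SECOND = 1.0  # hypothetical always-on inference rate
BATTERY_MWH = 15_000.0       # hypothetical 15 Wh phone battery

SECONDS_PER_DAY = 86_400
daily_mwh = per_inference_mwh * INFERENCES_PER_SECOND * SECONDS_PER_DAY
battery_fraction_per_day = daily_mwh / BATTERY_MWH

print(f"{per_inference_mwh:.5f} mWh per inference")
print(f"{daily_mwh:.2f} mWh per day")
print(f"{battery_fraction_per_day:.4%} of the battery per day")
```

Under those assumptions the inference step alone comes out to roughly 6 mWh a day, a tiny fraction of a phone battery. This only counts the model itself; the microphone and the rest of the audio path cost extra.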
The small size of the KWS model, which is what makes the low power consumption possible, means it can’t be used on its own to listen in on conversations: it outright doesn’t understand anything other than what it’s been trained to identify. This is also why you usually can’t set the keyword to just anything, and instead have to pick from a limited set of words or phrases.
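To make that concrete, here’s a toy sketch of just the output side of a keyword spotter. It’s not a real model: the label names, the threshold, and the hand-written scores are all hypothetical stand-ins for what a trained network would produce from audio. The point is only that the output is a choice among a handful of fixed labels, with catch-all classes for everything else, so there’s nowhere for arbitrary words to come out.

```python
import math

# Hypothetical label set: the only things this toy spotter can ever report.
# "unknown" and "silence" are catch-all classes for everything else.
LABELS = ["hey_device", "stop", "unknown", "silence"]
CATCH_ALL = {"unknown", "silence"}


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spot_keyword(scores, threshold=0.8):
    """Return a keyword label if it wins with enough confidence, else None."""
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    label = LABELS[best]
    if label in CATCH_ALL or probs[best] < threshold:
        return None
    return label


# Made-up scores standing in for model output on two audio windows.
print(spot_keyword([6.0, 0.5, 1.0, 0.2]))  # wake phrase clearly wins
print(spot_keyword([0.3, 0.4, 5.0, 0.1]))  # ordinary speech lands in "unknown"
```

Whatever is said in the second window, all the spotter can report is "not a keyword"; turning that audio into text would take a much larger model.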
This all means that if you’re ever given an option for completely custom wake phrases, you can be reasonably sure that device is running full speech recognition on everything it hears. Plugged-in devices like a smart TV or an Amazon Alexa have a lot more freedom here: they can listen as much as they want, with as complex a model as they want. High-quality speech-to-text apps like FUTO Voice Input run locally on just about any modern smartphone, so something like a Roku TV can definitely do it.