|
|
@@ -6,7 +6,7 @@ has_experiments: ["living-with-a-black-box.md", "imagine-a-machine-listening-uto |
|
|
|
|
|
|
|
## Alexa, what is machine listening? |
|
|
|
|
|
|
|
"Machine listening" is one common term for a fast-growing interdisciplinary field of science and engineering which uses audio signal processing and machine learning to "make sense" of sound and speech.[^Cella, Serizel, Ellis] Machine listening is what enables you to be "understood" by Siri and Alexa, to Shazam a song, and to interact with many audio-assistive technologies if you are blind or vision impaired.[^Alper] As early as the 90s, the term was already being used in computer music to describe the analytic dimension of ['interactive music systems'](https://wp.nyu.edu/robert_rowe/text/interactive-music-systems-1993/chapter5/),[^Rowe] whose behavior changes in response to live musical input, though there are ![precedents](audio:static/audio/maier.mp3),[^maier_audio_1] even before that. Machine Listening was also, of course, a cornerstone of the mass surveillance programs revealed by Edward Snowden in 2013: SPIRITFIRE's "speech-to-text keyword search and paired dialogue transcription"; EViTAP's "automated news monitoring"; VoiceRT's "ingestion", according to one NSA slide, of Iraqi voice data into voiceprints. Domestically, machine listening technologies underpin the vast databases of vocal biometrics now held by many [prison providers](https://theintercept.com/2019/01/30/prison-voice-prints-databases-securus/ "Prisons Across the U.S. Are Quietly Building Databases of Incarcerated People’s Voice Prints") and, for instance, the [Australian Tax Office](https://www.computerworld.com/article/3474235/the-ato-now-holds-the-voiceprints-of-one-in-seven-australians.html "The ATO now holds the voiceprints of one in seven Australians"). And they are quickly being integrated into infrastructures of development, security and policing. |
|
|
|
"Machine listening" is one common term for a fast-growing interdisciplinary field of science and engineering which uses audio signal processing and machine learning to "make sense" of sound and speech.[^Cella, Serizel, Ellis] Machine listening is what enables you to be "understood" by Siri and Alexa, to Shazam a song, and to interact with many audio-assistive technologies if you are blind or vision impaired.[^Alper] As early as the 1990s, the term was already being used in computer music to describe the analytic dimension of ['interactive music systems'](https://wp.nyu.edu/robert_rowe/text/interactive-music-systems-1993/chapter5/),[^Rowe] whose behavior changes in response to live musical input, though there are ![precedents](audio:static/audio/maier.mp3) even before that.[^maier_audio_1] Machine listening was also, of course, a cornerstone of the mass surveillance programs revealed by Edward Snowden in 2013: SPIRITFIRE's "speech-to-text keyword search and paired dialogue transcription"; EViTAP's "automated news monitoring"; VoiceRT's "ingestion", according to one NSA slide, of Iraqi voice data into voiceprints. Domestically, machine listening technologies underpin the vast databases of vocal biometrics now held by many [prison providers](https://theintercept.com/2019/01/30/prison-voice-prints-databases-securus/ "Prisons Across the U.S. Are Quietly Building Databases of Incarcerated People’s Voice Prints") and, for instance, the [Australian Tax Office](https://www.computerworld.com/article/3474235/the-ato-now-holds-the-voiceprints-of-one-in-seven-australians.html "The ATO now holds the voiceprints of one in seven Australians"). And they are quickly being integrated into infrastructures of development, security and policing. |
|
|
|
|
|
|
|
![Automatic speech recognition](audio:static/audio/kathy-reid-intro-to-ASR.mp3),[^kathy_audio_1] transcription and translation {{< nosup >}}[[i](https://www.statnews.com/2020/05/22/ai-startup-transcribes-annotates-doctor-visits-for-patients/ "AI startup transcribes and annotates doctor visits for patients"), [ii](https://www.iflytek.com/en/products/#/Home "iFlyTek: Create a better world with A.I."), [iii](https://www.wired.com/story/iflytek-china-ai-giant-voice-chatting-surveillance/ "How a Chinese AI Giant Made Chatting—and Surveillance—Easy")]{{< /nosup >}} - |
|
|
|
targeted keyword detection {{< nosup >}}[[i](https://theintercept.com/2015/05/05/nsa-speech-recognition-snowden-searchable-text/ "How the NSA Converts Spoken Words Into Searchable Text")]{{< /nosup >}} - |
|
|
|