"Machine listening" is one common term for a fast-growing interdisciplinary field of science and engineering which uses audio signal processing and machine learning to "make sense" of sound and speech.[^Cella, Serizel, Ellis] Machine listening is what enables you to be "understood" by Siri and Alexa, to Shazam a song, and to interact with many audio-assistive technologies if you are blind or vision impaired.[^Alper] As early as the 90s, the term was already being used in computer music to describe the analytic dimension of ['interactive music systems'](https://wp.nyu.edu/robert_rowe/text/interactive-music-systems-1993/chapter5/), whose behavior changes in response to live musical input.[^Rowe, Maier: 10.32-16.57] It was also, of course, a cornerstone of the mass surveillance programs revealed by Edward Snowden in 2013: SPIRITFIRE's "speech-to-text keyword search and paired dialogue transcription"; EViTAP's "automated news monitoring"; VoiceRT's "ingestion", according to one NSA slide, of Iraqi voice data into voiceprints. Domestically, machine listening technologies underpin the vast databases of vocal biometrics now held by many [prison providers](https://theintercept.com/2019/01/30/prison-voice-prints-databases-securus/ "Prisons Across the U.S. Are Quietly Building Databases of Incarcerated People’s Voice Prints") and, for instance, the [Australian Tax Office](https://www.computerworld.com/article/3474235/the-ato-now-holds-the-voiceprints-of-one-in-seven-australians.html "The ATO now holds the voiceprints of one in seven Australians"). And they are quickly being integrated into infrastructures of development, security and policing.
"Machine listening" is one common term for a fast-growing interdisciplinary field of science and engineering which uses audio signal processing and machine learning to "make sense" of sound and speech.[^Cella, Serizel, Ellis] Machine listening is what enables you to be "understood" by Siri and Alexa, to Shazam a song, and to interact with many audio-assistive technologies if you are blind or vision impaired.[^Alper] As early as the 90s, the term was already being used in computer music to describe the analytic dimension of ['interactive music systems'](https://wp.nyu.edu/robert_rowe/text/interactive-music-systems-1993/chapter5/), whose behavior changes in response to live musical input.[^Rowe, Maier: 10.32-16.57] It was also, of course, a cornerstone of the mass surveillance programs revealed by Edward Snowden in 2013: SPIRITFIRE's "speech-to-text keyword search and paired dialogue transcription"; EViTAP's "automated news monitoring"; VoiceRT's "ingestion", according to one NSA slide, of Iraqi voice data into voiceprints. Domestically, machine listening technologies underpin the vast databases of vocal biometrics now held by many [prison providers](https://theintercept.com/2019/01/30/prison-voice-prints-databases-securus/ "Prisons Across the U.S. Are Quietly Building Databases of Incarcerated People’s Voice Prints") and, for instance, the [Australian Tax Office](https://www.computerworld.com/article/3474235/the-ato-now-holds-the-voiceprints-of-one-in-seven-australians.html "The ATO now holds the voiceprints of one in seven Australians"). And they are quickly being integrated into infrastructures of development, security and policing.
![Automatic speech recognition](audio:static/audio/kathy-reid-intro-to-ASR.mp3),[^kathy_audio_1] transcription and translation -
![Automatic speech recognition](audio:static/audio/kathy-reid-intro-to-ASR.mp3),[^kathy_audio_1] transcription and translation {{< nosup >}}[[i](https://www.statnews.com/2020/05/22/ai-startup-transcribes-annotates-doctor-visits-for-patients/ "AI startup transcribes and annotates doctor visits for patients"), [ii](https://www.iflytek.com/en/products/#/Home "iFlyTek: Create a better world with A.I."), [iii](https://www.wired.com/story/iflytek-china-ai-giant-voice-chatting-surveillance/ "How a Chinese AI Giant Made Chatting—and Surveillance—Easy")]{{< /nosup >}} -
targeted keyword detection {{< nosup >}}[[i](https://theintercept.com/2015/05/05/nsa-speech-recognition-snowden-searchable-text/ "How the NSA Converts Spoken Words Into Searchable Text")]{{< /nosup >}} -
vocal biometrics and audio fingerprinting {{< nosup >}}[[i](https://www.nice.com/engage/real-time-technology/voice-biometrics/ "NICE leverages voice biometrics for safer and more secure customer authentication"), [ii](https://www.acrcloud.com/audio-fingerprinting/ "What Is Audio Fingerprinting?")]{{< /nosup >}} -
speaker identification, differentiation, enumeration and location {{< nosup >}}[[i](https://theintercept.com/2018/01/19/voice-recognition-technology-nsa/ "Finding Your Voice"), [ii](https://patents.google.com/patent/US20100235169A1/en "Google Speech differentiation Patent")]{{< /nosup >}} -
@@ -30,7 +30,7 @@ adversarial music {{< nosup >}}[[i](https://arxiv.org/abs/1911.00126 "Real World
brand sonification {{< nosup >}}[[i](https://www.audioanalytic.com/brand-sonification-power-recognising-sounds-brands/ "Brand sonification: The power of recognising the sounds of brands")]{{< /nosup >}} -
aggression detection {{< nosup >}}[[i](https://www.soundintel.com/products/overview/aggression/ "Deterring and Preventing Assault"), [ii](https://www.audeering.com/what-we-do/automotive/ "Cars take care of their passengers")]{{< /nosup >}} -
emotion detection {{< nosup >}}[[i](https://www.theverge.com/platform/amp/2020/8/27/21402493/amazon-halo-band-health-fitness-body-scan-tone-emotion-activity-sleep?__twitter_impression=true&s=09 "Amazon announces Halo, a fitness band and app that scans your body and voice")]{{< /nosup >}} -
distress detection -
intoxication detection {{< nosup >}}[[i](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3872081/ "Intoxicated Speech Detection: A Fusion Framework with Speaker-Normalized Hierarchical Functionals and GMM Supervectors")]{{< /nosup >}} -
@@ -38,7 +38,6 @@ scream detection -
lie detection -
hoax detection {{< nosup >}}[[i](https://amp.abc.net.au/article/12568084 "University of Southern Queensland gets $300k for hoax emergency call detection technology")]{{< /nosup >}} -
gunshot detection {{< nosup >}}[[i](https://www.shotspotter.com/ "Enhance Public Safety and Site Security with ShotSpotter")]{{< /nosup >}} -
autism diagnosis -
Parkinson's diagnosis {{< nosup >}}[[i](http://www.canaryspeech.com/ "Using voice to identify human conditions sooner.")]{{< /nosup >}} -
COVID diagnosis {{< nosup >}}[[i](https://app.surveylex.com/surveys/5384d6d0-6499-11ea-bc3a-b32c3ca92036 "We are launching an initiative to collect your voices with a goal to be able to triage, screen and monitor COVID-19 virus.")]{{< /nosup >}} -
machine fault diagnosis - psychosis diagnosis {{< nosup >}}[[i](https://www.sciencedaily.com/releases/2019/06/190613104552.htm "The whisper of schizophrenia: Machine learning finds 'sound' words predict psychosis")]{{< /nosup >}} -
gaming {{< nosup >}}[[i](https://voicebot.ai/2020/06/05/new-sony-patent-elaborates-how-the-playstation-5-voice-assistant-will-help-you-kill-zombies/ "New Sony Patent Elaborates How the PlayStation 5 Voice Assistant Will Help You Kill Zombies")]{{< /nosup >}} -
brand development {{< nosup >}}[[i](https://www.adnews.com.au/news/the-most-effective-brand-audio-logos-in-australia "The most effective brand audio logos in Australia")]{{< /nosup >}} -
marketing {{< nosup >}}[[i](https://www.veritonic.com/ "Veritonic The Sonic Truth")]{{< /nosup >}} -
acoustic ecology -
employee performance metrics -
@@ -105,7 +104,7 @@ Scientifically, machine listening demands enormous volumes of data: exhorted, ex
Because machine listening is trained on (more-than) human auditory worlds, it inevitably encodes, invisibilises and reinscribes normative listenings, along with a range of more arbitrary artifacts of the datasets, statistical models and computational systems which are at once its lifeblood and fundamentally opaque.[^McQuillan] This combination means that machine listening is simultaneously an alibi or front for the proliferation and normalisation of specific auditory practices *as* machinic, and, conversely, often irreducible to human apprehension; which is to say the worst of both worlds.
Moreover, because machine listening is so deeply bound up with logics of automation and pre-emption, it is also recursive. It feeds its listenings back into the world - ![gendered and gendering](audio:https://machinelistening.exposed/library/Yolande%20Strengers,%20Jenny%20Kennedy,%20Ja/Yolande%20Strengers%20and%20Jenny%20Kennedy%20(10)/Yolande%20Strengers%20and%20Jenny%20Ken%20-%20Yolande%20Strengers,%20Jenny%20Kenned.mp3|951000|1390000),[^YS] colonial and colonizing, ![raced and racializing](audio:static/audio/halcyon-siri-imperialism.mp3),[^halcyon_audio_1] classed and productive of class relations - as Siri's answer or failure to answer; by alerting the police, denying your [claim for asylum](https://www.theverge.com/2017/3/17/14956532/germany-refugee-voice-analysis-dialect-speech-software), or continuing to play Autechre - and this incites an auditory response to which it listens in turn. The soundscape is increasingly cybernetic. Confronting machine listening means recognising that common-sense distinctions between human and machine simply fail to hold. We are all machine listeners now. We have been becoming machine listeners for a long time. Indeed, the becoming machinic of listening is a foundational concern for any contemporary politics of listening; not because mechanisation *itself* is a problem, but because it is the condition in which we increasingly find ourselves.[^Abu Hamdan]
Moreover, because machine listening is so deeply bound up with logics of automation and pre-emption, it is also recursive. It feeds its listenings back into the world - ![gendered and gendering](audio:https://machinelistening.exposed/library/Yolande%20Strengers,%20Jenny%20Kennedy,%20Ja/Yolande%20Strengers%20and%20Jenny%20Kennedy%20(10)/Yolande%20Strengers%20and%20Jenny%20Ken%20-%20Yolande%20Strengers,%20Jenny%20Kenned.mp3|952000|1390000),[^YS] colonial and colonizing, ![raced and racializing](audio:static/audio/halcyon-siri-imperialism.mp3),[^halcyon_audio_1] classed and productive of class relations - as Siri's answer or failure to answer; by alerting the police, denying your [claim for asylum](https://www.theverge.com/2017/3/17/14956532/germany-refugee-voice-analysis-dialect-speech-software), or continuing to play Autechre - and this incites an auditory response to which it listens in turn. The soundscape is increasingly cybernetic. Confronting machine listening means recognising that common-sense distinctions between human and machine simply fail to hold. We are all machine listeners now. We have been becoming machine listeners for a long time. Indeed, the becoming machinic of listening is a foundational concern for any contemporary politics of listening; not because mechanisation *itself* is a problem, but because it is the condition in which we increasingly find ourselves.[^Abu Hamdan]
But machine listening isn't exactly listening either.