---
title: "Mara Mills, Xiaochang Li, Jessica Feldman, Michelle Pfeifer"
Description: "Mara, Xiaochang, Jessica and Michelle talk us through the history and politics of machine listening, from ‘affect recognition’ and the ‘statistical turn’ in ASR to automated accent detection at the German border, voiceprints and the ‘assistive pretext’. This is an expansive conversation with an amazing group of scholars, who share a common connection to the Media, Culture, and Communications department at NYU, founded by Neil Postman in 1971 at the urging of Marshall McLuhan."
aliases: []
author: "Machine Listening"
date: "2020-08-18T00:00:00-05:00"
---
Xiaochang Li (00:10:00) - Right. In terms of like making it thinkable that we can not only sort of tally, but also sort of determine meaning within language, by sorting through it. Right. And so the sound piece comes in because part of that history is really about the development of speech recognition, and not only how sort of language processing came to be, but specifically the challenge of doing acoustic signal processing alongside language. And I accidentally, like I came upon it almost entirely by accident, because I made a mistake in my early research. So I was working on text prediction technologies, and I was reading up about T9, and I sort of read in a passing interview with one of the creators of T9 that it originated as assistive technology. And so I immediately assumed that the assistive technology in question was speech recognition, because that sort of made sense to me in terms of like a legacy of assistive technology coming into something like autocorrect.

Xiaochang Li (00:10:59) - That was completely incorrect. It turned out that what he had originally designed T9 for, and what makes complete sense now as we think about it, was eye tracking: you have nine positions, et cetera. But at that point I had already started down the rabbit hole of the history of speech recognition and started to see the ways in which it was such a crucial piece in the puzzle of making language thinkable as a computational project. And sort of addressing the question of why it is that we keep trying to make computers do language things, even though they're extremely bad at precisely doing language things, but also at doing perceptual things. Things like listening, for instance, because they don't have the same kind of perceptual coordinates as people do. They don't recognize sounds as sounds per se. And so I had already gone so deep down that rabbit hole, and it turned out that it did in fact have so much to do with this history that I was interested in, that it just sort of worked out.

Mara Mills (00:12:00) - I want to just jump in and say, this was one I didn't realize. You mentioned that Xiaochang and I had written together, but we didn't realize till very late, I think even after you finished your PhD, where our own research intersected, because I knew you as doing work on text prediction, with a tiny fragment of it being around speech track, and then all of a sudden our work converged around the sound spectrograph and understanding speech in terms of filtered speech in a spectral view. I had done work on the pre-history of that machine, and you had done work after World War II, and it was just really fortuitous that we were able to like unite, especially around a machine called Audrey, the automatic digit recognizer created at Bell Labs as the first, presumably, arguably, speech recognizer. But it was like, I couldn't have done that work on my own.