|
|
@@ -7,25 +7,27 @@ has_experiments: [] |
|
|
|
|
|
|
|
## Interactive (music) systems |
|
|
|
|
|
|
|
{{< youtube EZvK8atIlnA >}} |
|
|
|
|
|
|
|
A young, white man is sitting at a white desk. In front of him is a light grey DECstation 5000/200 computer and a black microphone. A simple melody plays. He begins tapping out a rhythm on the desk with his hands. As he modifies his tempo, the system responds. He is improvising. The machine is listening. |
|
|
|
|
|
|
|
Is the man playing the computer, like an instrument? Or is he playing with the computer, as in a duet? Is he even playing at all? |
|
|
|
|
|
|
|
The man appears slightly bored, pretending not to be aware of his own performance, exploring the limited freedom offered to him by the machine, which tirelessly repeats the melody again and again, infinitely. We are watching a breakthrough moment in human-computer interaction: the computer is doing what the man wants. But still, the man can only want what the machine can do. |
|
|
|
|
|
|
|
The fantasy of an easy, natural interface between a man and a computer is captured in a diagram by Andrey Yershov in 1964 titled the 'director agent model of interaction'. The man is meant to be in charge. But look at the diagram. We can start anywhere we like. Information cycles around and around, in a constant state of transformation from sound to voltage to colored light to wet synapses. All of these possibilities are contained in the schematic figure of the arrow. Which is the director here? And which the agent? The diagram itself cycled between the pages of different publications, including The Architecture Machine, a 1970 book by Nicholas Negroponte. Negroponte had set up the Architecture Machine Group at MIT in 1967, which eventually led to his creation of the MIT Media Lab. |
|
|
|
The fantasy of an easy, natural interface between a man and a computer is captured in a diagram by Andrey Yershov in 1964 titled the 'director agent model of interaction'. The man is meant to be in charge. But [look at the diagram](https://machinelistening.tumblr.com/post/635556111569862656/the-yershov-diagram-1963-as-early-component-of). We can start anywhere we like. Information cycles around and around, in a constant state of transformation from sound to voltage to colored light to wet synapses. All of these possibilities are contained in the schematic figure of the arrow. Which is the director here? And which the agent? The diagram itself cycled between the pages of different publications, including The Architecture Machine, a 1970 book by Nicholas Negroponte [^Negroponte]. Negroponte had set up the Architecture Machine Group at MIT in 1967, which eventually led to his creation of the [MIT Media Lab](https://www.media.mit.edu/). |
|
|
|
|
|
|
|
A state-of-the-art, light grey machine is sitting on a white desk. A camera is pointed at it, focused on it. The camera zooms out to reveal a young, white man. Why are we seeing this moment? Why is the camera there to witness it? Judging by the DECstation, the year is probably 1992 or 1993, the location is definitely the MIT Media Lab, and we are looking through a small window into the "demo or die" culture that Negroponte famously instigated there. Demos could excite the general public and impress important visitors. They could attract corporate and government money. The colossal "machine listening" apparatus that we know today has its roots in thousands of demos like this one. |
|
|
|
|
|
|
|
In the 80s and 90s, the demo was a prefigurative device also familiar to the music world. This man, like many of those he worked with, moved between these worlds. He was an engineer, but also a musician. He had come to MIT, in fact, for a PhD in the "Music and Cognition" group at the Experimental Music Studio, which had been founded by the composer Barry Vercoe in 1973 and was absorbed into the Media Lab from the very start in 1985. |
|
|
|
In the 80s and 90s, the demo was a prefigurative device also familiar to the music world. This man, like many of those he worked with, moved between these worlds. He was an engineer, but also a musician. He had come to MIT, in fact, for a PhD in the "Music and Cognition" group at the [Experimental Music Studio](https://www.media.mit.edu/events/EMS/bv-interview.html), which had been founded by the composer [Barry Vercoe](https://web.media.mit.edu/~bv/) in 1973 and was absorbed into the Media Lab from the very start in 1985. |
|
|
|
|
|
|
|
One of the first students to join this group was another musician-engineer named Robert Rowe. Rowe's doctoral thesis 'Machine Listening and Composing: Making Sense of Music with Cooperating Real-Time Agents' seems to be one of the earliest uses of the phrase 'machine listening' in print. 'A primary goal of this thesis,' Rowe writes, 'has been to fashion a computer program able to listen to music'. The term 'machine listening' would go on to be taken up widely in computer music circles following the publication of a book based on Rowe's thesis, in 1994. |
|
|
|
One of the first students to join this group was another musician-engineer named [Robert Rowe](https://wp.nyu.edu/robert_rowe/home/). Rowe's doctoral thesis 'Machine Listening and Composing: Making Sense of Music with Cooperating Real-Time Agents' seems to be one of the earliest uses of the phrase 'machine listening' in print. 'A primary goal of this thesis,' Rowe writes, 'has been to fashion a computer program able to listen to music'. [^Rowephd] The term 'machine listening' would go on to be taken up widely in computer music circles following the publication of a book based on Rowe's thesis, in 1993.[^Rowe] |
|
|
|
|
|
|
|
1994 was also the year the "Music and Cognition" group rebranded. |
|
|
|
The following year the "Music and Cognition" group rebranded. |
|
|
|
|
|
|
|
>"I have been unhappy with 'music (and) cognition' for some time. It's not even supposed to describe our group; it was the name of a larger entity including Barry, Tod, Marvin, Ken and Pattie that was dissolved almost two years ago. But I've shied away from the issue for fear of something worse. I like Machine Listening a lot. I've also thought about Auditory Processing, and I try to get the second floor to describe my demos as Machine Audition. I'm not sure of the precise shades of connotation of the different words, except I'm pretty confident that having 'music' in the title has a big impact on people's preconceptions, one I'd rather overcome." |
|
|
|
|
|
|
|
So what began, for Rowe, as a term to describe the so-called 'analytic layer' of an 'interactive music system' became the name of a new research group at MIT [show website?] and something of a catchall to describe diverse forms of emerging computational auditory analysis, increasingly involving big data and machine learning techniques. As the term wound its way through the computer music literature, it also followed researchers at MIT as they left, finding its way into funding applications and the vocabularies of new centers at new institutions. |
|
|
|
So what began, for Rowe, as a term to describe the so-called 'analytic layer' of an 'interactive music system'[^Rowe] became the name of a [new research group at MIT](https://web.archive.org/web/19961130111950/http://sound.media.mit.edu/) and something of a catchall to describe diverse forms of emerging computational auditory analysis, increasingly involving big data and machine learning techniques. As the term wound its way through the computer music literature, it also followed researchers at MIT as they left, finding its way into funding applications and the vocabularies of new centers at new institutions. |
|
|
|
|
|
|
|
Here is one such application, by a professor at Columbia named Dan Ellis. This is the man sitting at the desk and the author of the email we just read. Today he works at Google, for their 'Sound Understanding Team'. As Stewart Brand once put it, 'The world of the Media Lab and the media lab of the world are busily shaping each other.' |
|
|
|
|
|
|
@@ -86,3 +88,10 @@ In addition to jazz robots, DARPA has two other programs, Improv and Improv2, in |
|
|
|
Improvisation is both a technique and a generative source of knowledge to extract. |
|
|
|
|
|
|
|
But one thing DARPA's Improv program manager says reminds us that their improvisational imaginary has real constraints: "DARPA’s in the surprise business and part of our goal is to prevent surprise." In time, there is no more need for human input in the ensemble. The machine improvises with itself. |
|
|
|
|
|
|
|
|
|
|
|
# Resources |
|
|
|
|
|
|
|
[^Negroponte]: ![](bib:7cd09072-5282-441f-b30a-6d869488ecd8) |
|
|
|
[^Rowephd]: Robert Rowe, [_Machine Listening and Composing: Making Sense of Music with Cooperating Real-Time Agents_](https://dspace.mit.edu/handle/1721.1/13835), doctoral thesis (MIT, 1991) |
|
|
|
[^Rowe]: Robert Rowe, [_Interactive Music Systems: Machine Listening and Composing_](https://wp.nyu.edu/robert_rowe/text/interactive-music-systems-1993/) (MIT, 1993) |