We will develop a software program that interacts with human musicians to automatically co-author music in real time using machine learning. Our real-time interactive system contributes to and draws from existing branches of study in music composition and computer science. From computer science, the system applies techniques from Music Information Retrieval (MIR) and Machine Learning to analyze and generate musical content. Within the domain of music composition, the project aims to develop an interactive digital framework for gesture-based music improvisation. This latter point distinguishes the project from previous work in machine improvisation and composition, which has largely attempted to mimic musical styles by generating melodies. In contemporary practices of music improvisation and performance, melody is not the sole descriptor of a musical style. Performers communicate more through musical actions and sonic events within a conditioned space than through pre-established harmonic or melodic movement.

Our interactive system will be used in the creation of a new musical work composed by Mr. Rubin. This work will be scored for three improvising musicians (proposed instrumentation: piano, clarinet, and cello) and interactive digital system. The composer will use a flexible score that will serve to guide the musicians in the creation of improvised textures, interactive situations, and a malleable large-scale form that may be stretched and compressed at the musicians’ will. The score will give us a high level of control over the performance and will also allow us to create labels for training data. This paradigm provides an opportunity to create idiomatic, novel, and complex textures that do not rely on explicit notation or fixed instrumental technique. Rather, each musician will find their own techniques for interpreting the notation depending on what works most idiomatically on their instrument.
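
As a minimal sketch of how score sections might be turned into training labels, the example below maps hypothetical scored situations to time spans in a recorded take. The section names, time values, and label format are illustrative assumptions, not decisions already made for the piece.

```python
# Illustrative sketch only: one possible way the flexible score's sections
# could yield time-stamped labels for training data. Section names, time
# spans, and the label format are hypothetical.
from dataclasses import dataclass

@dataclass
class ScoreLabel:
    section: str      # name of the scored situation or texture
    start_s: float    # start time within the recorded take, in seconds
    end_s: float      # end time, in seconds

# Labels transcribed from one hypothetical rehearsal recording.
labels = [
    ScoreLabel("sparse pointillism", 0.0, 42.5),
    ScoreLabel("dense tremolo texture", 42.5, 97.0),
    ScoreLabel("solo clarinet interjection", 97.0, 118.0),
]

for lab in labels:
    print(f"{lab.section}: {lab.start_s:.1f}-{lab.end_s:.1f} s")
```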

From a technological perspective, the project employs tools from Music Information Retrieval and Machine Learning. Schedl, Gómez, and Urbano (2014) provide a summary of the feature extraction techniques that will be used to process live data into a digital representation of the acoustic signal. From that representation we can apply a variety of machine learning techniques for different purposes. Hierarchical and non-hierarchical methods for song segmentation and sound clustering, such as in Barrington, Chan, and Lanckriet (2009), allow us to represent higher-order features that describe the structure of the piece. Gesture recognition techniques, as outlined by Mitra and Acharya (2007), and other pattern recognition methods can be applied to music to identify specific musical gestures or phrases performed by the musicians. In effect, machine learning will create a semantic representation of the music that is greater than the mere sum of the acoustic signals.
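
To make the analysis pipeline concrete, the sketch below shows one possible offline pass over a rehearsal recording: frame-level feature extraction followed by hierarchical clustering into coarse sound classes. The library choices (librosa, scikit-learn), the file name, and all parameter values are assumptions for illustration, not commitments of the project.

```python
# Illustrative sketch only: offline feature extraction and clustering on a
# hypothetical rehearsal recording. Libraries and parameters are assumptions.
import librosa
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Load a rehearsal recording and extract frame-level features describing
# timbre (MFCCs) and harmony (chroma).
y, sr = librosa.load("rehearsal_take.wav", sr=22050, mono=True)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)
features = np.vstack([mfcc, chroma]).T  # one row per analysis frame

# Hierarchical clustering groups frames into a small number of sound classes,
# a rough stand-in for the segmentation/clustering step described above.
labels = AgglomerativeClustering(n_clusters=5).fit_predict(features)

# Collapse consecutive frames with the same label into coarse segments.
boundaries = np.flatnonzero(np.diff(labels)) + 1
print("estimated segment boundaries (frames):", boundaries[:10])
```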

Within the framework of this piece, our interactive system will listen to the improvisers’ sound signal and respond using generative methods from algorithmic composition (Nierhaus, 2009), conditioned on an analysis of recorded data gathered from previous performances. The system will function as another improviser, adding to the musical discourse and altering the musical and aesthetic landscape of the piece.
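
One simple instance of such a generative method, shown below, is a first-order Markov chain over gesture labels learned from earlier performances: the system proposes a new sequence of gestures by following transitions it has already observed. The label vocabulary and training sequence are hypothetical, and this is only one of many generative strategies the system could use.

```python
# Illustrative sketch only: a first-order Markov chain over gesture labels,
# conditioned on a hypothetical sequence observed in prior performances.
import random
from collections import defaultdict

def train_markov(label_sequence):
    """Count label-to-label transitions observed in recorded performances."""
    transitions = defaultdict(list)
    for current, following in zip(label_sequence, label_sequence[1:]):
        transitions[current].append(following)
    return transitions

def generate(transitions, start, length=8):
    """Walk the transition table to propose a new sequence of gestures."""
    sequence = [start]
    for _ in range(length - 1):
        options = transitions.get(sequence[-1])
        if not options:
            break
        sequence.append(random.choice(options))
    return sequence

# Hypothetical gesture labels extracted from a prior rehearsal.
observed = ["tremolo", "tremolo", "cluster", "silence", "tremolo", "cluster"]
model = train_markov(observed)
print(generate(model, start="tremolo"))
```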

Project type