
Hitting the Right Note:
Interacting with Music through Computers

David Jennings

The discipline of Human-Computer Interaction first developed to deal with applications of computing that are qualitatively different from making and hearing music. The grammar of the interactions in improvising jazz, or, say, composing a "bedroom techno" dance track, is not something you can imagine capturing easily through task analysis. The goal- and task-driven model of interaction - or even later variants such as "Activity Theory" - cannot comfortably be applied as an account of artistic activity. So what can HCI tell us about music technology, and what can music technology tell us about HCI?

This article is loosely based on work commissioned by the developers of the National Centre for Popular Music (NCPM), a visitor attraction which plans to open in Sheffield in early 1999 and is constituted as an educational charity. The NCPM aims to celebrate one of the most influential and popular artforms of the twentieth century, through a partnership of entertainment and education, creativity and technological innovation.

This work has surveyed the range of new music technologies available, and assessed their potential applications within the Centre's range of exhibits. This article is a more theoretical review of existing and potential applications.

Developments in Music Technology


The development of computer power to the point where musical signals can be processed in real time has generated a rapid growth of software tools and games that offer ways of making music unlike any available before. My research has covered a range of overlapping applications, as shown in Figure 1.

[Figure 1: the range of overlapping music technology applications covered by the research]


Novel instruments, based on musical instrument digital interface (MIDI) technologies, are being developed in a number of research centres, such as the Studio for Electro Instrumental Music (STEIM) in the Netherlands and the MIT Media Lab. MIDI enables any kind of movement or signal to be converted into sound production or manipulation. One example is STEIM's "Sweatstick" - a one-metre aluminium stick with a stiff spring in the centre, and two gliding keyboard pads attached around the stick. Performances with it are a mixture of martial arts, dancing with a broomstick, praying and playing the guitar. Other examples allow video signals - say, from a dance performance - to trigger musical sounds. There are also many applications which allow people with disabilities to make music, by requiring forms of input other than the dexterity or mobility that traditional instruments involve. The possibilities for redefining the composer-music-performer-listener relationship can be quite profound.
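To make the idea of converting "any kind of movement or signal" into sound more concrete, here is a minimal sketch in Python. It assumes a hypothetical sensor that reports a reading between 0.0 and 1.0 (for instance, how far a pad has slid along a stick) and turns it into a standard three-byte MIDI Note On message. The function name, the note range and the default velocity are illustrative assumptions, not a description of how the Sweatstick or any other instrument named here actually works.

```python
def sensor_to_note_on(reading, channel=0, low_note=36, high_note=84, velocity=100):
    """Map a normalised sensor reading (0.0 to 1.0) to a MIDI Note On message.

    The sensor, the note range and the velocity are hypothetical choices
    made for illustration only.
    """
    if not 0 <= channel <= 15:
        raise ValueError("MIDI channel must be in the range 0-15")
    if not 0 <= velocity <= 127:
        raise ValueError("MIDI velocity must be in the range 0-127")
    if not 0 <= low_note <= high_note <= 127:
        raise ValueError("note range must lie within 0-127")

    # Clamp out-of-range readings rather than failing mid-performance.
    reading = min(max(reading, 0.0), 1.0)

    # Spread the reading linearly across the chosen range of notes.
    note = low_note + round(reading * (high_note - low_note))

    # A Note On status byte is 0x90 combined with the channel number.
    status = 0x90 | channel
    return bytes([status, note, velocity])


if __name__ == "__main__":
    for position in (0.0, 0.5, 1.0):
        print(position, sensor_to_note_on(position).hex())
```

The point of the sketch is that the mapping is arbitrary: the same reading could just as easily control volume, filter settings or the choice of sample, which is what frees these instruments from the physical constraints of traditional ones.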

New technologies have also supported the growth in sound environments and "soundscapes" - the aural equivalent of immersive virtual reality or an IMAX cinema. MIT Media Lab's Tod Machover is one of the best known exponents. His recent "Brain Opera" multimedia event included a large interactive display, responsive to crowd presence and movement, placed outside the main performance theatre to reflect on-site and Internet activities. There was also an "Experience Space" made up of a maze of interactive music/image experiences with titles like "rhythm tree", "harmonic driving", "gesture wall" and "melody easel" which could be explored at will, including the connections between them.

For thousands of years the relationship of input to output for musical instruments has been physically constrained. The affordances of instruments were all some variation of strumming, banging, blowing or bowing. The input/output relationship has become gradually more malleable with the coming first of electric amplification and effects, then analogue and digital synthesisers, and finally MIDI and Digital Signal Processing. It is now possible to aim for more "direct" means of controlling sound and music, as well as methods which are extremely complex or abstract.

An example where the former approach can be useful is in educational applications, where people can model effects of "input" on musical "output" in order to understand better the relationship between the two. SYnthia is a computer-assisted learning package, developed at the University of Huddersfield, which enables students to learn the principles of sound synthesis through text and diagrams on screen. They can then hear and manipulate sound examples in real time. Graphic controllers allow students to shape sounds that are produced by a synthesiser. The developers claim that, since learning is directly related to the immediate experience of putting theory into practice, the process is more meaningful and more memorable.

The separation which new music technologies achieve between input and output opens up a new space for music production, manipulation and re-production, which can be filled in many different ways. The resulting possibilities for changing how we think about composing and listening to music are the focus of much of the rest of this article.

Reengineering Musical Activity


In his recent article, Music and technology: the composer in the age of the Internet, Stephen Deutsch writes:

The process by which most of the music I write is composed, and the process towards which computer assisted composition is biased, centres upon the ear. Typically, a composer plays - improvises the music - for as long as s/he wishes, listens and then edits the music. Such postponement of the judgmental process is central to the process.

By composing the sounds before the notation, or by eliminating notation entirely [as new software tools allow], composers find their music changes radically. With absence of notation, elements of musical style which are system driven begin to lose their appeal. More importantly, the use of this technology shifts the locus of significant activity from the composer's intentions to the listener's perceptions.


As well as affecting composers, the mediation of technology can also have a direct impact on listeners. For example, many of the tracks on the multimedia part of the Header #1 CD Plus (see review in this issue) do not exist in any definitive "composer's cut" version (although composers, performers and producers have been involved in the production). Exactly how the track sounds depends on the state of the system at the time I "launch" it, and then on various simple actions I make with the mouse. The "locus of significant activity", as Deutsch calls it, is again shifting between listener and composer/producer.

These are two examples of a wider range of tools for making music, which can be divided into the following overlapping categories:

"Generative" composition tools - Recently publicised by Brian Eno, who has brought out a floppy disk of generative compositions which never sound exactly the same twice, but are grown out of Eno's compositional "seeds" at playtime. Other examples include the PushBtnBach and CyberMozart MIDI software packages, which use algorithms to generate compositions in the styles of these composers.
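As a rough illustration of how a piece can be "grown" from a seed so that it never sounds exactly the same twice, the following Python sketch extends a short seed motif by a random walk over a pentatonic scale. The scale, the step rule and the function name are all assumptions chosen for simplicity. This is not the method used by Eno, by SSEYO's Koan software, or by the Bach and Mozart packages mentioned above.

```python
import random

# MIDI note numbers for a C major pentatonic scale (an illustrative choice).
C_MAJOR_PENTATONIC = [60, 62, 64, 67, 69]


def grow_melody(seed_motif, length, rng=None, scale=C_MAJOR_PENTATONIC):
    """Extend a seed motif to `length` notes by a random walk over `scale`.

    Every note of the seed motif must belong to the scale. Pass a seeded
    random.Random instance as `rng` to make a performance repeatable.
    """
    if not seed_motif:
        raise ValueError("the seed motif needs at least one note")
    if any(note not in scale for note in seed_motif):
        raise ValueError("every seed note must belong to the scale")
    if rng is None:
        rng = random.Random()

    melody = list(seed_motif)
    index = scale.index(melody[-1])
    while len(melody) < length:
        # Move one scale step down, stay put, or move one step up.
        step = rng.choice([-1, 0, 1])
        index = min(max(index + step, 0), len(scale) - 1)
        melody.append(scale[index])
    return melody[:length]


if __name__ == "__main__":
    seed = [60, 64, 67]
    # Two unseeded "performances" will usually differ after the seed motif.
    print(grow_melody(seed, 12))
    print(grow_melody(seed, 12))
```

The composer's contribution here is the seed and the rules; what the listener hears on any given occasion is decided at play time.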

Mixing simulations - There is a wide range of examples, from industrial-strength tools like Logic Audio and Music X to games and toys like Mixman and frEQout, which allow the user to mix elements from the latest dance music genres. In between, there are more educationally-inspired tools like Roland's DoReMix, which enables non-musicians to "compose" and arrange music in a variety of styles without any use of notation.

Automated improvisation - A much less well-developed field, which combines elements of both generative composition tools and the "listening" abilities of intelligent accompaniment systems.

"Intelligent" accompaniment - Intelligent Accompaniment is actually a trademark of Coda Music Technology, whose Vivace product listens to and follows a soloist's tempo changes. It can be set to allow the soloist varying degrees of freedom for musical interpretation.
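The idea of following a soloist while allowing "varying degrees of freedom" can be sketched as a simple smoothing rule. In the Python below, the accompaniment's tempo is nudged towards the tempo implied by the soloist's most recent beats, and a single parameter sets how closely it follows. The function, the parameter name `follow` and the smoothing rule are assumptions for illustration. Nothing here describes how Vivace itself is implemented.

```python
def follow_tempo(beat_times, initial_bpm, follow=0.5):
    """Return the accompaniment tempo (in BPM) after each soloist beat.

    beat_times  -- times in seconds at which the soloist's beats were heard
    initial_bpm -- the tempo the accompaniment starts at
    follow      -- 0.0 ignores the soloist entirely, 1.0 copies the soloist's
                   latest tempo exactly; values in between smooth the changes
    """
    if not 0.0 <= follow <= 1.0:
        raise ValueError("follow must be between 0.0 and 1.0")

    bpm = float(initial_bpm)
    estimates = []
    for previous, current in zip(beat_times, beat_times[1:]):
        interval = current - previous
        if interval <= 0:
            raise ValueError("beat times must be strictly increasing")
        soloist_bpm = 60.0 / interval
        # Blend the old tempo with the tempo implied by the latest beat.
        bpm = (1.0 - follow) * bpm + follow * soloist_bpm
        estimates.append(bpm)
    return estimates


if __name__ == "__main__":
    # A soloist who starts at 120 BPM and then slows down.
    beats = [0.0, 0.5, 1.0, 1.6, 2.3]
    print(follow_tempo(beats, initial_bpm=120, follow=0.5))
```

A real system also has to work out which note of the score the soloist has reached, which is a much harder listening problem than the tempo arithmetic shown here.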

Music notation tools - Music notation software assumes that composers want to use and generate traditional-format scores for their music (unlike the tools discussed by Stephen Deutsch at the beginning of this section). Systems like Sibelius 7 allow composers either to write or to amend a score, and then hear what it sounds like when played through MIDI. Or they can play music from a MIDI keyboard into the system, which will then generate the score for that music.

Online jamming - Online jamming is computer-supported cooperative work (CSCW) for musicians, taking advantage of the fact that, while CD-quality sound requires an enormous number of bytes per second, MIDI control instructions do not.
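The size of the gap between the two is easy to work out. The Python sketch below uses the standard figures for CD audio (44,100 samples per second, 16 bits per sample, two channels) and for a MIDI cable (31,250 bits per second, with each byte framed by a start and a stop bit). The function names and the example playing rate of ten notes per second are illustrative assumptions.

```python
# Standard CD audio parameters.
CD_SAMPLE_RATE_HZ = 44_100
CD_BYTES_PER_SAMPLE = 2      # 16-bit samples
CD_CHANNELS = 2              # stereo

# Standard MIDI serial link parameters.
MIDI_BITS_PER_SECOND = 31_250
MIDI_BITS_PER_BYTE_ON_WIRE = 10   # 8 data bits plus start and stop bits
MIDI_BYTES_PER_NOTE_MESSAGE = 3   # status, note number, velocity


def cd_audio_bytes_per_second():
    """Data rate of uncompressed CD-quality stereo audio."""
    return CD_SAMPLE_RATE_HZ * CD_BYTES_PER_SAMPLE * CD_CHANNELS


def midi_max_bytes_per_second():
    """The most a single MIDI cable can carry."""
    return MIDI_BITS_PER_SECOND // MIDI_BITS_PER_BYTE_ON_WIRE


def midi_note_stream_bytes_per_second(notes_per_second):
    """Data rate for a player sending a Note On and a Note Off per note."""
    return notes_per_second * 2 * MIDI_BYTES_PER_NOTE_MESSAGE


if __name__ == "__main__":
    print("CD audio:", cd_audio_bytes_per_second(), "bytes per second")
    print("MIDI cable maximum:", midi_max_bytes_per_second(), "bytes per second")
    print("Ten notes a second:", midi_note_stream_bytes_per_second(10), "bytes per second")
```

On these figures CD audio needs 176,400 bytes per second, a saturated MIDI link carries 3,125, and a musician playing ten notes a second generates about 60. This is why sending control instructions rather than sound makes jamming feasible over slow network connections, although the delay in delivering each message still matters.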

While these categories are flexible and overlapping, it is possible to characterise some of the differences between them in terms of the control over musical content which they afford, and the degree of interactivity they offer (though this immediately reveals that interactivity is not a single dimension: it comes in different flavours).

[Figure: the categories above, compared by control over musical content and degree of interactivity]


Modelling Creative Activity


Ten years ago, Donald Norman's model of the "gulfs of execution and evaluation" between the psychological world (what people want to do) and the physical world (their progress in doing it) gave us a powerful framework for thinking about human-computer interaction. The design philosophy that came out of this was geared to bridging these gulfs as quickly, comfortably and effectively as possible: designing tools to get things done. The aim of design was to make it easier to go from formulating intentions to putting them into action through the technology, and to make it easier to interpret and evaluate the "outputs" from the technology. But in music, the composer's intentions and judgement, and the listener's interpretations, are not a means to an end - they are everything there is. It is not a case of making it easier to get things done - more a case of enriching the experience of doing things.

Arguably, the rigidity of distinctions between composer, performer and listener has only hardened this century following the development of recording technology. However, new technologies are certainly breaking down these distinctions again. The bare bones of making music can now be made very simple: the aesthetics of major elements of music seem to be more tractable to algorithmic representation than in other artforms. As it becomes easier to automate the physical process of making music, more people are shifting their focus to a broad range of re-mixing activities. They take a "raw" piece of music, interpret it aesthetically and then modify it with various treatments. This is something that composers, DJs in dance clubs, and now listeners with technologies like the Header CD-ROM all have in common.

Just as the definition of intelligence has been reframed in the light of what artificial intelligence can and cannot do, the essence of music may become defined by what the technologies cannot do. As harmony, rhythm and tempo can be turned into algorithms and computed, music composers may turn to the currency of more abstract textures and soundscapes that are harder to "automate". One of the hardest musical forms to automate should be group improvisation, because it involves a complex mix of cognitive processes:

  • listening to music and interpreting what you hear
  • using this interpretation to frame your own spontaneous composition
  • playing this composition

(The complexity is amplified if we accept improvisers' experience that these are not sequential processes, but are all bound up together.)

Technologies for intelligent accompaniment and generative composition can emulate some of these processes but not all of them together. And group improvisation is more than just stitching them together. It may involve complex social interactions between musicians which are not solely musical. Working on the premise that this process may be similar to the negotiation of turn-taking in conversation, Bill Walker at Apple has developed a research prototype, known as ImprovisationBuilder, which models how jazz players trade solos with each other. Walker seeks to apply some of the findings from conversation analysis studies to improvisation, in the same way that they have been applied to more traditional HCI domains. These models appear to work reasonably well in contexts where the musical rules are quite tight. So ImprovisationBuilder might work well for bebop jazz - where the sequence of head, soloing and trading fours is fairly fixed. It is less clear how you would model the freer mutations of jazz and other improvisational forms that have evolved since the '60s.

Future Directions


Music technology is currently confined to a small range of applications and outlets, and poses little threat to established traditions of musical activity. But new ways of thinking about music and composition are already creeping in at the margins - the DJ remix culture of dance music being the most obvious example. Rave culture has already found applications for technologies that create synaesthetic cocktails of music and graphics. The new "crossover" technologies like the Header CD-ROM are extending these trends, and also providing new means to reframe the experience of listening (actively) to music. Their use of graphic manipulation to give the listener varying degrees of control to modify the music points to future instruments which might be genuinely worthy of the term "multimedia". Instead of audio and graphics running in parallel but independently, we may be able to talk about "painting with sound" as a reality rather than a metaphor. This would be one of a wide range of tools aimed at a spectrum of musical contexts, requiring varying kinds of compositional, arrangement and production skills from users. They may also require new ways of thinking about interaction with technology.


Resources

Research and Development Centres


Birmingham Electro Acoustic Studio (BEAST) - http://sun1.bham.ac.uk/a.j.moore/docs/beast2.html
Centre for New Music and Audio Technologies, Berkeley, California - http://www.cnmat.berkeley.edu/
Studio for Electro Instrumental Music (STEIM), Netherlands - http://www.dds.nl/~steim/
MIT Media Lab, Brain Opera - http://brainop.media.mit.edu/
Music Technology Group, University of York - http://www.york.ac.uk/inst/mustech/
Electronic Studio, University of Leeds - http://www.leeds.ac.uk/music/Studio/es.html
Institut de Recherche et Coordination Acoustique/Musique (IRCAM) - http://www.ircam.fr/

Music Technology Products


Generative Music: SSEYO's Koan software - http://www.sseyo.com/
Header - http://www.tui.co.uk/header/
AudioROM - http://www.audiorom.com/
Modified - http://www.compulink.co.uk/~modified/
MixMan - http://www.mixman.com/

Music Technology Theory


Algorithmic Composition - http://www.bath.ac.uk/~mapjll/algo-comp.html
ImprovisationBuilder - http://www.atg.apple.com/personal/Bill_Walker/

MIDI


Introduction into MIDI (by Eric Lipscomb) - http://www.eeb.ele.tue.nl/midi/intro.html
Interactive MIDI environments: iCube - http://www.infusionsystems.com/ ; Resrocket - http://www.resrocket.com/ ; comMIDI - http://www.voicenet.com/~bkirsch/comMIDI.html

Other References


Axel Mulder's list of web pages (HCI, music and much more besides - recommended!) - http://fas.sfu.ca/cs/people/ResearchStaff/amulder/personal/webpages.html


Deutsch, S. (1996) The Composer in the Age of the Internet (http://mac.bournemouth.ac.uk/dms/articles/art1.html)

Norman, D. A. (1986). Cognitive engineering. In D. A. Norman & S. W. Draper (Eds.), User centered system design: New perspectives on human-computer interaction. Hillsdale, NJ: Erlbaum Associates.

Copyright © David Jennings

Page last modified on 1 December 2000. Comments to David at david@djassociates.com