Generative Music, and Why Every Developer Should Know About It

(from davidorr.net)

"Generative music" is a term that has been buzzing in the game industry the past few years. If you've played any one of the many AAA titles released recently, you've probably experienced it*. And -- if it was done well -- you likely didn't even notice it. So, what exactly is it?

Generative music is, when boiled down to its very essence, music that changes and transforms. Typically, it is mapped to the actions of the player -- be it combat, movement, or interactions. Alternatively, it could be mapped to change with the environment (for instance, day/night cycles), or any variety of parameters. The goal is to provide a deeper level of immersion for the player by having a soundtrack that adapts to what is on screen.

The most common approach is to have multiple "layers" of a track that can be toggled on or off depending on what is happening in the game. Take, for instance, a first-person shooter. Imagine: you're lurking through the woods, hunting a target. A soft, pulsing atmospheric track is playing in the background. As you approach your prey, the track increases in volume, with some light percussion fading in to give the music a stronger rhythmic pulse. Just then -- you're ambushed. Now a more intense track fades in on top of those two, complete with aggressive synths and booming drums. In this scenario, the music is mapping your every move, and transforming a traditionally static element of a game (the music) into an additional vessel of immersion.
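As a rough illustration of the layering idea, here is a minimal Python sketch. The `GameState` fields, the layer names, the distance thresholds, and the volume values are all hypothetical, chosen only to mirror the hunting scenario above; a real game would hand these volumes to whatever audio engine it uses.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class GameState:
    """Hypothetical snapshot of the gameplay parameters that drive the music."""
    distance_to_target: float  # smaller means closer to the prey
    in_combat: bool


def target_layer_volumes(state: GameState, near: float = 10.0,
                         far: float = 50.0) -> Dict[str, float]:
    """Return a target volume (0.0 to 1.0) for each music layer."""
    # Proximity rises from 0.0 at `far` (or beyond) to 1.0 at `near` (or closer).
    proximity = (far - state.distance_to_target) / (far - near)
    proximity = max(0.0, min(1.0, proximity))
    return {
        "atmosphere": 0.6 + 0.4 * proximity,  # always audible, louder when close
        "percussion": proximity,              # fades in during the approach
        "combat": 1.0 if state.in_combat else 0.0,
    }


def step_toward(current: float, target: float, max_step: float) -> float:
    """Move a volume toward its target by at most max_step, so layers fade instead of jumping."""
    if abs(target - current) <= max_step:
        return target
    return current + max_step if target > current else current - max_step
```

Calling `step_toward` once per frame (or per audio tick) for each layer is one simple way to get the gradual fades described above, rather than switching layers on and off abruptly.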

I write about this for two reasons. First and foremost, generative audio is something that I am passionate about and have been working with for several years (starting with the flash game "Colony"). Second, because I believe many game developers (especially indie) are unfamiliar with this concept and/or hesitant to implement it. This is understandable -- developers are developers, not composers. Implementation requires additional coding, or licensing of an audio engine with generative capabilities. And, not all composers are comfortable -- or even familiar -- with the concept of generative audio. It requires extra time, skill, and forethought to conceive and compose. These are real-world concerns, and ones that only the developer can contemplate and address.

Let me present you with a real-life, personal example of generative audio in action. I recently finished the soundtrack to Gemini Strike with long-time friend and collaborator Krin. This game features a very simple implementation of generative music. It has two layers -- an "atmospheric" layer that plays when you are between battles (in the menu, buying items, etc.), and an orchestral layer that plays when you are in battle. The two were composed on top of each other, and you'll often hear them seamlessly fading between each other. The results, however, are far more immersive than the standard "menu loop/battle loop" setup. The soundtrack never stops, but instead moves with the player -- replacing traditional aural seams with a far more elegant solution.

This isn't meant to be a sales pitch for Gemini Strike, nor is it a sales pitch for my services (but feel free to contact me anyway!) Instead, I write this to address a topic that is much deeper. In a game industry that is rapidly changing, it is important to consider progression on all fronts -- including music. Just as a bad score can ruin a great movie, a poorly implemented soundtrack can hurt a great game. When a player turns off the music in favor of their personal playlist, the soundtrack has failed. A soundtrack should be an essential part of the game -- something that players miss when it is turned off. Generative music is a considerable step closer to achieving that goal, and something that every developer should consider.

* Edit: Some would call this "adaptive music". I regard adaptive music as falling within the category of generative music. Perhaps another topic for another post. :)

Comments

Step 2014-09-18 15:21:38

Everything you say is just an invaluable piece of insight, David. PLS TEACH ME.

I knew about this kind of game soundtrack technique but didn't know it had a name. Great read!

DavidOrr 2014-09-18 15:21:38 (Updated 2014-09-20 22:42:54)

It does indeed have a name! Though I'm not sure it has become the "industry standard" term yet. I'm gonna stick with it.

As far as getting your feet wet, here is a speed-run through writing generative audio.

----------------------
Preface: If you've ever written and rendered musical stems (for film, animation, royalty-free sites, libraries) you've essentially created generative music. If you haven't:

1. Write a piece of music. Don't worry about any of this generative music stuff.
2. When you're done, play around with muting select tracks, so you are only hearing select parts of the music at once.
3. Find a combination of parts you like on their own (maybe percussion and bass?), and render them.
4. Now render the remainder, WITHOUT the instruments in the first render.
5. You now have 2 stems. Open up a new project file in your DAW of choice, and place them on two separate audio tracks. Don't worry about tempo syncing -- just line them up so they begin at the same place.
6. Play the piece, and experiment fading between the two tracks -- have one track silent while the other is playing, and then switch. Automate fades in random places, and just have fun experimenting.
7. Congratulations -- you've just emulated a AAA generative music system in your DAW!
----------------------
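For developers who would rather see the fade from step 6 in code than in a DAW, here is a minimal sketch. It assumes each stem is a plain list of float samples that start at the same point, and it uses an equal-power (cosine/sine) curve, which is one common choice for crossfades and not something the steps above prescribe. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def crossfade_gains(position: float) -> Tuple[float, float]:
    """Equal-power gains for two stems; position 0.0 is all stem A, 1.0 is all stem B."""
    position = max(0.0, min(1.0, position))
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)


def mix_stems(stem_a: List[float], stem_b: List[float],
              fade_start: int, fade_length: int) -> List[float]:
    """Mix two sample-aligned stems, fading from A to B starting at sample fade_start."""
    if fade_length <= 0:
        raise ValueError("fade_length must be positive")
    mixed = []
    # zip stops at the shorter stem; both are assumed to begin at the same sample.
    for i, (a, b) in enumerate(zip(stem_a, stem_b)):
        gain_a, gain_b = crossfade_gains((i - fade_start) / fade_length)
        mixed.append(a * gain_a + b * gain_b)
    return mixed
```

Before `fade_start` only stem A is heard, after `fade_start + fade_length` only stem B, and in between the two gains trade off so the combined level stays roughly constant. A linear fade would also work; it just tends to dip in loudness at the midpoint.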

Perhaps the tutorial was unnecessary (and AAA systems are more complex than that), but the concept really is that simple. I've been experimenting with 2-4 different "stems" (portions of the full track), and seeing the results I've gotten. There is a very different compositional process to writing generative music (perhaps something I'll write about in the future), but this is a good way to get started. :)

Phonometrologist 2014-09-24 13:26:38

Thank you. Very insightful. Upon reading your reply to Step, I can only assume the other approach is in modulating the main theme -- as the main theme changes, so does the composition itself.
Perhaps that seems like a no-brainer.
But I do wonder how the developer programs the music to transition well as you'll have the main theme audio looped and then to make it transition to the new audio without hearing the change.

DavidOrr 2014-09-24 13:26:38 (Updated 2014-09-24 21:42:57)

The easiest approach (and the one used by most in the industry) is to forgo loops altogether. You essentially have a 3- or 4-minute track split up into multiple parts, and each part is triggered on/off based on a set of pre-determined conditions. This is called "adaptive music", which I consider a form of generative music. This works pretty well, but (as you suggest) it can be rough on thematic material. Fading a theme in half-way through its statement is not ideal. Luckily for the western market, composers have moved away from thematic writing in favor of a more cinematic underscore style. This works perfectly for this type of generative audio.

With that said, I've been experimenting with having shorter music sections (30-40 seconds) that can be pieced together based on a set of music mapping conditions. For instance, take hypothetical sections A, B, C, and D. Once a section plays, it can move on to a new section of music:
A: can proceed to B, C
B: can proceed to A, D
C: can proceed to A, B
D: can proceed to any section
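The transition rules above amount to a small directed graph, which could be sketched in Python roughly as follows. This is only an illustration: it reads C's entry as A and B, it treats "any section" for D as including D itself, and it picks the next section uniformly at random, all of which are assumptions. The names `TRANSITIONS`, `next_section`, and `plan_sections` are hypothetical.

```python
import random
from typing import Dict, List, Optional

# Allowed transitions between the hypothetical sections A, B, C, D.
TRANSITIONS: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "B"],
    "D": ["A", "B", "C", "D"],  # "any section", assumed to include D itself
}


def next_section(current: str, rng: Optional[random.Random] = None) -> str:
    """Pick the next section at random from those the current one may proceed to."""
    chooser = rng if rng is not None else random
    return chooser.choice(TRANSITIONS[current])


def plan_sections(start: str, count: int,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Build a sequence of sections beginning at `start`, `count` long (at least one)."""
    sequence = [start]
    while len(sequence) < count:
        sequence.append(next_section(sequence[-1], rng))
    return sequence
```

In a game, the random choice would likely be replaced or weighted by the music mapping conditions (for example, preferring a calmer section when the player leaves combat), but the graph lookup would stay the same.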

Each section has multiple layers ("stems"), which can be faded based on game parameters set by the developer.

With a system like this, you can keep the integrity of themes, because you can program the system to play the thematic material of a section in its entirety. Essentially, the system then "knows" where the themes are located, as it has a skeletal framework of the music (defined by the sections). Furthermore, you can now write in modulations (that aren't the same every time), and add an additional level of variability to the soundtrack. If a section ends in a different key than it began, just have it move to a new section in an appropriate key. No need to stick with D minor! :)
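One way to picture the key-matching idea is the sketch below. It assumes each section is tagged with the key it starts and ends in, and that an "appropriate" follower is simply one that starts in the key the current section ended in; both the tagging scheme and the example keys are hypothetical, not taken from any actual soundtrack.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Section:
    """Hypothetical section metadata: a name plus the keys it starts and ends in."""
    name: str
    start_key: str
    end_key: str


# Example data only; the keys are made up for illustration.
SECTIONS = [
    Section("A", "D minor", "F major"),  # modulates during the section
    Section("B", "F major", "F major"),
    Section("C", "D minor", "D minor"),
    Section("D", "F major", "D minor"),  # modulates back
]


def compatible_followers(current: Section,
                         candidates: List[Section]) -> List[Section]:
    """Return the candidates that start in the key the current section ends in."""
    return [s for s in candidates if s.start_key == current.end_key]
```

With this example data, section A ends in F major, so only B and D qualify as followers. A fuller system would presumably combine this filter with the allowed-transition rules and with looser notions of key compatibility, such as relative or closely related keys.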

There's a lot there, and it really needs to be elaborated on to make full sense. I'll make an in-depth post in the future about my approach to generative audio. Krin and I are working very closely on bringing something new to the generative audio field (what I shamelessly refer to as "next-gen generative audio"). Most of the games we release now use some form of generative audio, and each iteration is getting more polished and diverse.

Krichotomy 2014-10-03 16:10:13

I've noticed this in several games (mostly the crossfading method, but also) and have thought about how I might implement it in a game myself some day, but I never knew it had a name. :D

Honestly, the term "generative" music makes me think of the music being generated on a more basic level, such as playing notes in response to character actions. The best example I can think of is the game Solace: https://www.youtube.com/watch?v=J8U7aOTQff8

I think an interesting example of a more abstract-style generative/adaptive music is found in Proteus: https://www.youtube.com/watch?v=xvnRX2np2HQ

Thanks for the informative writeup!

DavidOrr 2014-10-03 16:10:13

Using the term "Generative" vs "Adaptive" is more semantics than anything else -- I recently updated the post to attempt to clarify a bit. Currently, a lot of people in the industry are referring to what I describe as adaptive music, with the implication that generative music is more centered around note generation.

Personally, I see adaptive music (as seen in some modern games) as a first step towards truly generative music. Generative music can be determined by a set of rules, and there is no reason why groups of notes cannot be treated as a single event in the generation process. The generative aspect, therefore, is not within the notes themselves, but within the combination of layers in the music.

Again, this is purely semantics, and I'm not sure yet if it is worth going against the terminology of the industry on those grounds. I feel I can make a compelling case for using "generative" over "adaptive" -- but at the end of the day, does it really matter?

Thanks for your input!