Thursday, November 28, 2013

Artificial intelligence: Coming soon to a stage near you

Some things are naturally polarizing; people tend to have very strong opinions about things like jazz music, operating systems, and the cats-vs-dogs debate. I know I can't do much to convince you that cats are simply better, but I think I found something that will appeal to both sides of the jazz debate: GenJam.

GenJam is a very special jazz musician; it's a really clever computer program that improvises jazz music in real time. Its name is short for Genetic Jammer: it jams with RIT professor Al Biles using musical phrases developed with a genetic algorithm. It's probably my favorite use of AI because 1) it's impressive that it works at all, 2) it's not a pretentious project, and 3) I don't really know many uses of AI, so my selection pool is pretty small.

AI mostly just makes me think of robots, but everyone knows about robots, so that's not as interesting.

GenJam was developed by Biles in 1993 and has been consistently improved and advanced ever since. It can now improvise on over 300 songs (including a bunch of Christmas music, in case you need to book an act for an upcoming Christmas party). If you want to see the kinds of things it's capable of, check out this video of Biles and GenJam having an improv session. Those solos weren't preset in GenJam's memory; it had to learn how to create a good solo and use those techniques on the fly, the same as any human musician.

Before talking about how GenJam learned to play jazz, we should talk about genetic algorithms. That's how GenJam does its learning, and it's probably not something you've run into before. AI-Junkie has a fantastic introduction to genetic algorithms if you're interested in the details, but chances are if you're reading this it's for a grade and not because you actually care about what I have to say. So I'll paraphrase the important parts. All credit for the information goes to AI-Junkie, though.

Unless you slept through seventh grade, you probably know how genes work in nature: mommy and daddy have separate DNA with slightly different chromosomes; these chromosomes can cross over and create new chromosomes for their offspring that are combinations of genes from both parents. The offspring then mate and mix up their own DNA and novel populations are created. 

Chromosomes cross over and mix DNA. This looks like a bad MS Paint drawing but it's apparently an official McGraw-Hill image. I should apply to their graphics department.

Mutations occasionally happen, changing one or two G's to C's and so on. Sometimes this causes a noticeable change in the organism; sometimes it doesn't. Some of the changes that occur (both through mutations and mating) give the organism a slight advantage: they can run faster, jump farther, and make cuter faces at humans. All of these help them survive and give them a better chance of passing on their genes to another generation, so over time the most helpful genes tend to survive while the less helpful ones become much less common. 

This exact process is modeled in genetic algorithms. "Chromosomes" are the possible solutions to a problem. They're encoded as strings of several "genes," each representing a part of the solution (for example, trying to find all English words that can be made with five letters? Your "chromosomes" will be five-letter combinations and each "gene" will be a letter). These chromosomes can cross over and swap genes. You start with a population of n chromosomes made up of random genes. For each chromosome, you see if it's any good at solving your problem and give it a "fitness score" assessing its usefulness (sticking with the five-letter-word example, any chromosome with only consonants or only vowels isn't likely to be a word and will have a super negative fitness score. Things with impossible consonant combinations will have a slightly less negative score. Actual words will have the highest possible positive score). You breed your chromosomes so that ones with higher fitness scores are more likely to reproduce and ones with lower scores are more likely to die off (so ZARPB will breed since it's sort of word-like, but ZXSDF should get axed pretty early). You keep doing this until you decide you're done, at which point you should have found a bunch of solutions to whatever problem you're working on (in our example, we'll stop after we've checked all 26^5 possible chromosomes, which is about 11.9 million of them, and keep the ones with maxed-out fitness).
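If code is easier to follow than a wall of text, here's a rough Python sketch of the five-letter-word example. It is only an illustration under my own assumptions: the tiny `KNOWN_WORDS` set stands in for a real dictionary, the exact scores (10, -10, -3, 1) are made up, "three consonants in a row" is a crude stand-in for "impossible consonant combinations," and picking parents by a small tournament is just one common way to favor fitter chromosomes.

```python
import random
import string

VOWELS = set("aeiou")
# Tiny stand-in dictionary (an assumption); a real run would load a word list.
KNOWN_WORDS = {"jazzy", "music", "blues", "swing", "notes"}


def longest_consonant_run(chromosome):
    """Length of the longest streak of consecutive consonants."""
    longest = current = 0
    for letter in chromosome:
        current = 0 if letter in VOWELS else current + 1
        longest = max(longest, current)
    return longest


def fitness(chromosome):
    """Score a five-letter string; the numbers themselves are arbitrary."""
    if chromosome in KNOWN_WORDS:
        return 10   # an actual word: highest score
    vowel_count = sum(letter in VOWELS for letter in chromosome)
    if vowel_count == 0 or vowel_count == len(chromosome):
        return -10  # all consonants or all vowels: super negative
    if longest_consonant_run(chromosome) >= 3:
        return -3   # awkward consonant cluster: slightly less negative
    return 1        # word-like, but not in the dictionary


def crossover(mom, dad, rng):
    """Single-point crossover: the start of one parent, the end of the other."""
    point = rng.randint(1, len(mom) - 1)
    return mom[:point] + dad[point:]


def mutate(chromosome, rng, rate=0.05):
    """Replace each gene (letter) with a random one with probability `rate`."""
    return "".join(
        rng.choice(string.ascii_lowercase) if rng.random() < rate else letter
        for letter in chromosome
    )


def select(population, rng, k=3):
    """Tournament selection: the fittest of k randomly chosen chromosomes."""
    return max(rng.sample(population, k), key=fitness)


def evolve(pop_size=200, generations=50, seed=0):
    """Breed a random population for a fixed number of generations."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(5))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population = [
            mutate(crossover(select(population, rng), select(population, rng), rng), rng)
            for _ in range(pop_size)
        ]
    return population
```

Calling `evolve()` returns the final population; sorting it by `fitness` shows which chromosomes survived. With a dictionary this small there's no guarantee any real word turns up, which is a limit of the sketch and not of genetic algorithms in general.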

Really, though, just read the intro from AI-junkie. It'll make more sense than I ever could. 

For those of you who like pretty pictures, genetic algorithms basically go like this: 

They seem much easier in flowchart form...

Still with me? Maybe? Okay cool. That's the general idea behind genetic algorithms; here's how GenJam applies it. (Note: all following information is from the original 1994 paper announcing GenJam to the scientific community. More on that in a bit.) GenJam has two separate populations to build up: one for individual measures and one for four-measure phrases. It's not trying to select one perfect solo; instead, it's trying to get a feel for what sorts of measures and phrases work well for the song GenJam is learning. Fitness scores for measures and phrases are determined when GenJam plays them to a "mentor," who gives feedback by pressing "g" for good and "b" for bad. Any given phrase or measure can be "upvoted" or "downvoted" more than once, so particularly bad measures can be given a score of "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" and hopefully never be played again. There are several ways of mutating measures and phrases, including sorting the notes and reversing their order. With the mentor's feedback, GenJam learns how to create good solos for each song in its repertoire.
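Here's a small Python sketch of the mentor-feedback idea, just to make it concrete. This is not GenJam's actual code: the `Measure` class, storing notes as a plain list of integers, counting each "g" as +1 and each "b" as -1, and starting mutated measures at a fitness of zero are all assumptions of mine for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Measure:
    """A hypothetical measure: note values plus the score it has earned."""
    notes: List[int]
    fitness: int = 0


def apply_feedback(measure, keypresses):
    """Add 1 for every 'g' and subtract 1 for every 'b' the mentor typed."""
    for key in keypresses:
        if key == "g":
            measure.fitness += 1
        elif key == "b":
            measure.fitness -= 1
    return measure.fitness


def reverse_notes(measure):
    """Mutation: play the notes backwards (the new measure starts unscored)."""
    return Measure(notes=measure.notes[::-1])


def sort_notes(measure):
    """Mutation: sort the notes into ascending order."""
    return Measure(notes=sorted(measure.notes))
```

So a measure that gets "ggb" ends up at +1, and one that gets 39 b's in a row ends up at -39 and is very unlikely to be picked as a parent.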


The basic layout of how GenJam works

To actually play, GenJam is first given a file that tells it the tempo of the piece, the style (straight or swing), the chord progression, and what's happening in the song [PDF warning] (for example, "four bars intro, Biles solos for eight bars, GenJam solos for eight bars," etc. -- check out slide 14 from that link to see what I'm talking about). It also gets the MIDI tracks for any background music (drums, piano, bass, etc.). During a performance, GenJam and Biles listen to each other and improvise on the fly, like two human musicians would do at a jam session.
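To picture what that setup file holds, here's a hypothetical Python version of it. The real file format isn't described in this post, so every name below (`TuneSetup`, `tempo_bpm`, `form`, `who_plays`) and the one-chord-per-bar layout are my own invention; the sketch only shows how tempo, style, chords, and the "who plays when" plan could sit together.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TuneSetup:
    """Hypothetical stand-in for the per-song file GenJam reads."""
    tempo_bpm: int
    style: str                   # "straight" or "swing"
    chords: List[str]            # assumed: one chord symbol per bar
    form: List[Tuple[str, int]]  # (section label, length in bars)


def who_plays(setup: TuneSetup, bar: int) -> Optional[str]:
    """Return the section covering a zero-based bar number, or None if outside the song."""
    if bar < 0:
        return None
    start = 0
    for label, length in setup.form:
        if bar < start + length:
            return label
        start += length
    return None


# Made-up example tune: a 12-bar blues in F repeated to fill 20 bars.
blues = ["F7", "Bb7", "F7", "F7", "Bb7", "Bb7", "F7", "F7", "C7", "Bb7", "F7", "C7"]
example = TuneSetup(
    tempo_bpm=140,
    style="swing",
    chords=(blues * 2)[:20],
    form=[("intro", 4), ("Biles solo", 8), ("GenJam solo", 8)],
)
```

With this example, `who_plays(example, 0)` gives "intro", bar 4 starts the Biles solo, and bar 12 hands things over to GenJam.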

Biles and GenJam jamming

All the information for GenJam's specific implementation (and the source of the information for the last few paragraphs) is here in the paper Biles wrote for the 1994 International Computer Music Conference. It really is worth checking out: it's extremely accessible, especially as far as scientific papers tend to go, and is an honestly interesting and engaging read.

Biles did a TEDx talk last year about GenJam, if you'd rather watch and listen instead of read:



This, I think, is one of the coolest uses of artificial intelligence. Not because it's super technologically relevant or useful (I'm firmly in the "the world doesn't need more jazz music" camp) but because music is something we consider uniquely human. And a computer is learning it. It's a different kind of scary than we're used to, and it makes you think. Which, really, is the whole point.

__

Credit where credit is due: I would never have known about this project if one of my professors weren't really into computer music and always excited to share cool links with the class, so stop by Computer Music Blog and check out Professor Merz's stuff.
