Introducing ELVIS, the Computer-Driven Music Analysis Research Project

Summary: I discuss the basic implementation details of a music-analysis computer program I'm helping to write. Our goal is to find musical patterns that will help us describe musical style more precisely. You can visit our temporary website at http://elvis.music.mcgill.ca (not much to see) or, if you're reading this in the future and that doesn't work, you can view our new website at http://elvisproject.ca, where there will be lots to see. You can view our code (AGPLv3+) on GitHub here: https://github.com/ELVIS-project

One of the most useful things I've done over the past couple of years, at least in terms of learning about computers and programming, is to read the blog posts written by members of various free software communities. Is it about software I don't use? Is it too technical for me? Is it not even about software, but the larger ideas of the community? Is it written in barely-comprehensible English? Doesn't matter---everything is useful and interesting, and I'm thankful for the opportunity to learn from other people who are willing to share. Now it's my turn to share. This post is the first in what I hope will be a set of posts about the "ELVIS" research project I've been working on.

First, if you didn't know, I spend my academic life as a musician. Actually, I'm a music theorist, which is a music research discipline that tends to focus on the aspects of musical works that can be observed in a score. Few scholars stay strictly within the confines of a single discipline, and the same is true for me: I'm often also a philosopher, computer scientist, historian, ethnographer, and other things. Not everybody agrees, but I think this kind of mixing is crucial to the formation of new knowledge (or "knew nowledge") for reasons that I hope to discuss in a future post that's not about ELVIS. Oh right---this is a post about ELVIS.

Now that you know it's about music, nobody should be surprised that ELVIS is actually a backronym. It stands for "Electronic Locator of Vertical Interval Successions," which nobody understands without explanation, but it's pretty easy. "Electronic locator" just refers to the fact we're using a computer program, and "Vertical Interval Succession" just means we're... well that's a little harder. In music, an "interval" is the pitch distance between two notes (or the frequency ratio between two pitches---same thing). "Vertical" means both notes are happening at the same time rather than one after another. "Succession" means we're looking for the orders in which one vertical interval follows another. Totally straightforward, right? Here's an example, just in case.

Imagine it's the year 1175 or so, and you're a monk. The lead monk is going to sing a prayer with a well-known tune. You have to improvise a second part to go with it, and because you're not an idiot, you want to make sure it sounds good. How do you know which notes to sing? As far as we know, monks around that time period would have improvised the second part by knowing the order of the intervals formed between the well-known tune and the newly-improvised accompaniment. Only certain successions of intervals would sound appropriate, and it depends on what the prayer is about, where you are in the tune (beginning, middle, end), and many other things. As more and more singers were added, as instruments were added, as notation was invented, these interval successions remained important. Today we call it "counterpoint," and whether or not listeners or singers or composers or songwriters pay attention to it, it's present in virtually all Western music. Yes, even "Call Me Maybe."
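For the computer-minded, here's a minimal Python sketch of what "vertical interval succession" means. It is not code from vis: the function names, the note-name format, and the two made-up voices are all assumptions for illustration, and it ignores accidentals and interval quality (so a major third and a minor third both count as "3").

```python
LETTERS = "CDEFGAB"


def diatonic_step(note):
    """Staff-position number for a note name like 'G3' (hypothetical format).

    Accidentals are ignored; only the letter and a single-digit octave count.
    """
    letter, octave = note[0], int(note[-1])
    return octave * 7 + LETTERS.index(letter)


def vertical_interval(lower, upper):
    """Generic interval size between two simultaneous notes (unison = 1, octave = 8)."""
    return abs(diatonic_step(upper) - diatonic_step(lower)) + 1


def interval_successions(lower_voice, upper_voice, n=2):
    """List every run of n consecutive vertical intervals between two voices."""
    intervals = [vertical_interval(lo, up)
                 for lo, up in zip(lower_voice, upper_voice)]
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]


# Invented example: a "well-known tune" and an "improvised" part above it.
tune = ["D3", "E3", "F3", "E3", "D3"]
added = ["D4", "B3", "A3", "B3", "D4"]

successions = interval_successions(tune, added)
print(successions)  # [(8, 5), (5, 3), (3, 5), (5, 8)]
```

Each tuple is one succession: an octave followed by a fifth, a fifth followed by a third, and so on. Counting how often each tuple shows up in a repertoire is the kind of pattern-finding described in this post.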

Last paragraph about music! We're using a computer to analyze contrapuntal patterns (i.e., patterns of counterpoint). We hope to find patterns we can use to distinguish or unite periods of musical style. If you're like a lot of music scholars, you're probably thinking something like "but what about rhythm?" or "but what about chords?" or "but what about melody, form, timbre, metre, tuning/temperament, social situation, and other factors?" Yes, they're all important, but we're just starting out! There have been many smaller studies focussed on melody, but for various reasons, they haven't lived up to their promise of finding patterns that distinguish musical styles from each other. That's why the ELVIS project is trying counterpoint. We hope contrapuntal patterns will help us find some basic statistical methods and analytic strategies that are effective when talking about musical style. Later, we can supplement or start over with other musical elements, and we'll have a good idea of where to begin.

For those of you who are computer-minded, here's the interesting bit. In the two years of the grant, we've rewritten our analysis program (called "vis") from scratch four times. Each time, we used a different strategy when designing the back-end. The first iteration was a very limited command-line program that only analyzed two-voice contrapuntal patterns, and provided output in the form of lists. The second iteration added a GUI and graphs---both certainly required for music researchers---and we started to modularize the back-end by actions the program needed to perform. In the third iteration, we tried an MVC (model-view-controller) approach with the same GUI. This was driven by four controllers that each corresponded to a stage in the analysis process: Importer, Analyzer (find stuff in scores), Experimenter (statistics), and Visualizer. We also started to envision a Web-based interface, and use cases other than two-part contrapuntal patterns (the only other one implemented was for chords). Our fourth iteration was designed from the start to be a modular framework, separated from any interface, extensible for any use case. Although we're focussing on the modules to analyze contrapuntal patterns, we've put considerable effort into making it easy to write additional modules that could potentially analyze anything. As long as it starts with a score and ends with statistics, we hope you can use our software!

I'll briefly describe the architecture of our back-end. We have three types of components: analyzers, controllers, and models. So far we plan for only one controller, called WorkflowController, which more or less coordinates running through the four stages of our previous architecture: import some scores, analyze them for patterns, calculate some statistics, and output the data. Using the WorkflowController is optional; it's a tool we're using to make our often-run queries easier to run. If you don't use the WorkflowController, you'll primarily be interacting with our models, IndexedPiece and AggregatedPieces. They manage all the data of a single piece and a collection of pieces, respectively. We have many analyzers, since they're the core of analysis activities, and since we envision most future expansion will happen in this module. Each analyzer does only a single action, and you can run them in any order that produces valid output. But users won't interact with analyzers directly. Rather, they use the get_data() method of a model to run analyzers in a specific order, with certain settings, and the model will ensure everything gets done properly. This level of separation is important, since it will allow our models to do results caching and other interesting things---heck, there's no reason to even stay in the same programming language, as long as the right data comes out in the end!
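Here's a toy Python sketch of that separation between models and analyzers. To be clear about what's assumed: ToyPiece, NoteCounter, Total, the run() method, and the get_data() signature shown here are all invented for illustration and are not the real vis classes or API. The only point is to show how a model can run single-purpose analyzers in a requested order and cache the result.

```python
class NoteCounter:
    """Hypothetical analyzer: counts the notes in each part."""

    def run(self, data, settings):
        return [len(part) for part in data]


class Total:
    """Hypothetical analyzer: adds up the numbers from the previous step."""

    def run(self, data, settings):
        return sum(data)


class ToyPiece:
    """Hypothetical model that owns one piece's data and runs analyzers on it."""

    def __init__(self, parts):
        self._parts = parts
        self._cache = {}
        self.runs = 0  # how many analyzer runs actually happened

    def get_data(self, analyzer_classes, settings=None):
        """Run the analyzers in order, feeding each one the previous output."""
        settings = settings or {}
        key = (tuple(analyzer_classes), tuple(sorted(settings.items())))
        if key not in self._cache:
            data = self._parts
            for analyzer_cls in analyzer_classes:
                data = analyzer_cls().run(data, settings)
                self.runs += 1
            self._cache[key] = data
        return self._cache[key]


piece = ToyPiece([["C4", "D4", "E4"], ["C3", "G3"]])
print(piece.get_data([NoteCounter, Total]))  # 5
print(piece.get_data([NoteCounter, Total]))  # 5 again, served from the cache
print(piece.runs)                            # 2, not 4
```

Because callers only ever ask the model for data, the model is free to cache, reorder internal work, or hand the job to something else entirely, as long as the same result comes back.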

The last topic for this post is to describe the core software we're using to help build our framework. We're grateful that Myke Cuthbert, developer of the music21 library, is working with us on the ELVIS team (see http://mit.edu/music21/). music21 basically provides a way to import and represent a wide variety of file formats with a relatively consistent set of Python objects, plus a sizeable collection of analysis tools. Once we've used music21 to index our scores, we use the pandas data analysis library (see http://pandas.pydata.org/) to organize our data and help us perform fast analysis activities with vectorized NumPy operations. pandas will also let us store analysis results as pickled objects, export for use by other programs, and a whole host of other things we haven't thought of yet. For the desktop versions of our applications, we're using the PyQt library (see http://www.riverbankcomputing.com/software/pyqt/intro), which I've really grown to appreciate. The signals-and-slots mechanism, even without a GUI, is a really great idea, and I can see why many of the other features are immensely useful for C++ developers even though they sometimes cause headaches for Python programmers. Finally, for the new Web interface, we're using Django (see https://www.djangoproject.com/) and Knockout.js (see http://knockoutjs.com/).
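To give a flavour of why pandas is handy for this kind of work, here's a small sketch that counts two-interval successions with a grouped count instead of a hand-written loop. It assumes pandas is installed, the interval numbers are made up, and the column names are my own; it doesn't reproduce how vis stores or indexes its data.

```python
import pandas as pd

# Made-up vertical intervals between two voices, one per moment in the piece.
intervals = [8, 5, 3, 5, 3, 1]

# Line up each interval with the one that follows it.
pairs = pd.DataFrame({
    "first": intervals[:-1],
    "second": intervals[1:],
})

# Count how often each (first, second) succession occurs.
counts = pairs.groupby(["first", "second"]).size()

print(counts.sort_values(ascending=False))
```

In this invented data, the succession "fifth, then third" occurs twice and every other succession occurs once. The same grouped count works unchanged whether the table has five rows or five million.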

In the future, I'll write about some of the research we've already done with vis, about how the program's architecture works, the other software we're using (including LilyPond and ggplot2 in R), and many of the other lessons we've learned along the way. In case you're wondering, I do all my development on Fedora. Over the life of the project, I've moved from Fedora 16 to Fedora 19.