Thursday, May 22, 2008

Research Project

Many thanks to friend of OPC Joe for sending along the link to a rather sub rosa effort called the Whitburn Project, which for a few years now has been cataloging and collecting data on Top Forty hits. (Tips and links, by the way, are always welcome around here, as are outright cash gifts.) The researchers have evidently assembled a spreadsheet containing data on some 37,000 songs, which makes some of the stuff we've done around here look like kindergarten.

The project has been housed on Usenet, which is in large part, to quote Leonard Cohen, just a shining artifact of the past, but that's been to the benefit of the research, since copyright issues have dictated that it move below the radar. The analysis of the data, though, remains in the public domain. One of the first slice-and-dices of the materials that I've seen looks at something we were talking about the other day, the length of the perfect pop song. Which running time produced the most Top Forty hits in each decade? Here's the list:

1950s, 2:30 (95 songs) (e.g., "Jailhouse Rock")
1960s, 2:30 (250 songs) (e.g., "The Loco-Motion")
1970s, 3:30 (153 songs) (e.g., "Rock Me Gently")
1980s, 3:59 (142 songs) (e.g., "Tempted")
1990s, 4:00 (132 songs) (e.g., "What's the Frequency, Kenneth?")
2000s, 3:50 (58 songs) (I got nothing, sorry)

Amazing, isn't it? Remember, these are medians, not averages, so that in the Seventies, songs of exactly three and a half minutes were more likely to chart than songs of any other length. I wonder if there's something in the human brain that is conditioned to respond to things in exact minutes or half-minutes. Probably not.

7 comments:

Scraps said...

God I love the Whitburn project, though my version is a few years out of date now. It is an astonishing compendium.

MJN said...

Strictly speaking, those song lengths are modes, not medians. Of course the preponderance of half-minute lengths is coincidental.
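For anyone who wants to see the distinction in numbers, here is a minimal sketch using Python's standard `statistics` module. The running times are made up for illustration and are not taken from the Whitburn spreadsheet; `fmt` is just a hypothetical helper for printing seconds as m:ss.

```python
from statistics import mean, median, mode

# Hypothetical running times in seconds (invented for illustration,
# not taken from the Whitburn spreadsheet).
lengths = [150, 150, 150, 162, 171, 185, 204]

def fmt(seconds):
    """Format a number of seconds as m:ss."""
    seconds = round(seconds)
    return f"{seconds // 60}:{seconds % 60:02d}"

most_common = mode(lengths)   # the single most frequent value
middle = median(lengths)      # the middle value of the sorted list
average = mean(lengths)       # the arithmetic mean

print("mode:  ", fmt(most_common))
print("median:", fmt(middle))
print("mean:  ", fmt(average))
```

The mode only tells you which single running time shows up most often; the median and the mean can both sit well away from it.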

Gavin said...

I would guess that it's not coincidental. I don't know whether the data reflects actual running times or what's printed on the disc.

If the latter, I suspect a tendency to round off songs at a nice round number (half-minute marks through the 70s, then an effort to keep songs just below 4 minutes in the 80s, then an abandonment of that effort in the 90s).

If it's the former, I suspect engineers actually reaching (consciously or not) for those lengths when they decide how long a fade-out's going to be. It doesn't take a lot of people doing either of those things to skew the data: a small minority will do the job, if everything else remains random.
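A rough way to check that last claim is a small simulation. Everything in it is an assumption made for the sake of the sketch: true running times spread evenly between 2:00 and 4:00, 20,000 songs, and 10 percent of them rounded to the nearest half-minute. The function names are hypothetical, and none of this uses the actual Whitburn data.

```python
import random
from collections import Counter

def listed_times(n_songs, rounding_share, seed=0):
    """Simulate listed running times in seconds.

    Assumption: true lengths are uniform between 120 and 240 seconds,
    and a fraction `rounding_share` of songs get rounded to the
    nearest 30 seconds before being written down.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    times = []
    for _ in range(n_songs):
        t = rng.randint(120, 240)
        if rng.random() < rounding_share:
            t = ((t + 15) // 30) * 30  # round to the nearest half-minute
        times.append(t)
    return times

def most_common_time(times):
    """Return the single most frequent running time (the mode)."""
    return Counter(times).most_common(1)[0][0]

some_rounding = listed_times(20000, 0.10)
top = most_common_time(some_rounding)
count = Counter(some_rounding)[top]

print(f"most common listed time: {top // 60}:{top % 60:02d}")
print(f"songs at that time: {count} of {len(some_rounding)}")
```

Under these assumptions, the 10 percent who round pile up on a handful of half-minute marks while everyone else is spread thinly across every possible second, so a half-minute mark ends up as the most common length.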

Tom Nawrocki said...

It's the running time from the disc, so your first rationale (rounding off to a nice number) is probably in play here. It's very easy to imagine people timing a song, seeing it come in at 2:29 or 2:31, and consciously or not, writing down a nice round 2:30.

And yes, I meant mode. My bad.

MJN said...

I agree with you, Gavin. If the song lengths weren't subject to manipulation, then I still say the half-minute modes would be a coincidence. But as you say, those numbers can be fudged in various ways, and so their characteristics differ from what you would find in a collection of truly random data.

MJN said...

Thanks for bringing the Whitburn Project to my attention; I have enjoyed learning about it. It’s unquestionably an impressive piece of work, having required thousands of man-hours to compile, and inspiring (I suspect) thousands of man-hours of analysis. However . . .

Here’s the great thing about music. As massive as the Whitburn Project is, it’s not even the tip of the iceberg. You can limit the scope of your discussion to songs that have charted, or to Rock & Roll Hall-of-Famers, or in some other way. For a project as detailed as Whitburn, you MUST limit your scope like that. But so much is left out when you do that. Did “Won’t Get Fooled Again” hit the charts? Is Terry Jacks in the HoF? I doubt it, but if you’re reading this, you know instantly how relevant (or not) they are to any particular discussion at hand.

Look at it this way. The Whitburn database is roughly comparable in size to the Baseball Encyclopedia: several hundred songs per year, compared to several hundred player records each year, for 100+ years. But where the Baseball Encyclopedia is complete with respect to Major League Baseball, the Whitburn Project is woefully incomplete with respect to popular music. If MLB is the equivalent of Billboard’s Hot 100, then all of popular music might equate to all baseball played anywhere at any level. The difference is, you can’t make a case that a player whose record begins and ends with Little League is as important as Bill Melton, but you can argue that Lyle Lovett is better than Madonna.

I don’t mean to denigrate the Whitburn Project here; I wish I had a few hundred free hours myself to play around with it. Rather, I celebrate the awe-inspiring universe of recorded music.

Tom Nawrocki said...

"Won't Get Fooled Again" actually went to Number Fifteen, so it's in the Whitburn Project. "Baba O'Riley" did not chart, if you want to sub that one in instead.

Bill Melton is better than Madonna, too.