Guest Post: Recording and Producing a Pop Song

Today’s guest post comes from the epic Trek VanderWaffle, Burak Kanber (whose website is worth checking out at http://burakkanber.com/), a musical mastermind who has dived, delved, dug, distinguished, discerned, and dissected what pop music is really all about:

What makes pop music pop music? Clearly there is some set of elements that combine to make a pop song distinct from rock or blues music–but pop music is more than just a genre, it’s popular. The “Top 40” is named thus because it lists the literal forty most popular songs in the country in a given week. But why? Many people have little respect for Top 40 tunes and the artists that create them. How is it that pop is hated by so many, but still so popular? Spoiler alert: it’s not because “most people are stupid” (the most frequent response I heard in an informal survey).

The “pop music” question was a burning one for me, so I decided to analyze pop music and learn enough about it to record and produce a pop song. While I am a musician, I’m not a recording engineer–so I’m starting from the ground up on this one. My purpose here is to share what I’ve learned about pop music and the recording/producing process.

Pop Hatred

I used to hate pop music. Like many of you reading this, I assumed that pop musicians were largely talentless and that the music itself had no real value. Then Lady Gaga became popular, and I noticed something I hadn’t noticed before: her music seemed very deliberate. I don’t mean to say that it sounds “forced”, but rather that there are some interesting characteristics to her music that will typically go unnoticed, but could only have happened after careful planning and deliberation.

One example is the melody of the song “The Fame”: the second note of the melody (“I can’t …”) is off-key. This wasn’t an accident. In fact, singing that wrong note is harder to do than singing the correct note.

(Technical detail: the third chord in that progression is the minor third, but the vocal melody sings the major third. This dissonance isn’t natural, and the human voice is naturally drawn into key, which makes that particular note harder to sing than the “natural”-sounding one.)

Why did Lady Gaga do that? I didn’t know, and that’s what sparked my journey into the analysis of pop music. Along the way, I learned that pop musicians are actually very talented: Lady Gaga is an all-around amazing musician (duet with Tony Bennett as proof), Katy Perry was a successful Christian rock singer with a sweet voice, and Ke$ha was a talented country singer who aced her SAT.

I eventually realized that pop music is a necessity, not a plague. Author Raph Koster helped me realize that all forms of media can be used for both artistic and entertainment purposes. Art and entertainment are attached at the hip; entertainment is simply art that you don’t have to think about to understand. Our brains are pattern recognizing machines, and the process of appreciating art is based on learning a new pattern you’ve yet to encounter–learning that new pattern triggers our brain’s reward mechanism, which is why consuming art is pleasurable in the same way that learning a new skill is.

Entertainment, on the other hand, is based on patterns we’re already familiar with. (A quick Google search reveals that this isn’t a ground-breaking theory.) We don’t need to do any heavy processing to understand entertainment. And we need both art and entertainment. Not everything can be beautifully artistic, lest our brains grind to a halt trying to process all the new patterns.

Imagine if every movie were a Darren Aronofsky movie. If every billboard advertisement were designed by Dali. If every book were written by James Joyce. If there existed no cheap landscape paintings to adorn your dentist’s office with. (Some readers may be thinking “that’d be awesome!”, but I assure you, it would get tiresome very quickly.) We readily accept “pop” in those media. We might not love Michael Bay movies, but we admit we need them when we want to watch something mindless and take a load off. For some reason, though, we haven’t fully accepted pop music in the same vein.

I argue that we need pop music. We need something to dance to, to listen to in the car, and to have on in the background at work. Fans of pop music aren’t stupid consumers; they’re just not music snobs. Fans of pop music like the freedom of listening to music without having to think about what it means.

Recording a Pop Song

Eight years ago, I wrote a rock song and recorded it with bandmates Anthony Babino and Dan Gallagher. Anthony and I decided it would be fun to re-record that rock song as a pop tune, and that’s exactly what we did. I’ll reiterate now that I’m not a trained recording or sound engineer, and that all of the mixing and producing skills I know were learned both shallowly and very recently. The following is simply a set of preliminary observations I’ve made.

I now offer to you my Grand Theory of Pop Music. It states that a pop song must adhere to two rules: simplicity and uniformity.

Simplicity

Pop music should be simple entertainment. Every aspect of a song should be simplified:

  • Song structure: most pop songs have either three or four parts: verse, (optional) pre-chorus, chorus, (optional) bridge. Compare this to Queen’s “Bohemian Rhapsody”, which has three mini-songs inside it (the ballad, the opera, and the rock), each of which has its own subsections.
  • Chord structure: most pop songs use 2-4 chords. Compare this to classical or jazz music (~20 chords), or rock music (4-8 chords). Blues and country music may only use 3-4 chords, but also include lengthy and complex instrumental sections, such as guitar solos, harmonica, fiddle, banjo, etc.
  • Melodic complexity: pop music melodies are much simpler than other genres, averaging only 5 unique notes. Compared to classical, jazz, blues, and rock melodies, there’s an astounding discrepancy.
  • Melodic range: most pop singers stick to a small vocal range (usually less than an octave). This may be the weakest guideline in this list, because some pop singers (Adele, Christina Aguilera) use the full range of their voices (Adele’s “Someone Like You” uses an octave-and-a-half vocal range).
  • Lyrics: I doubt I need to mention this, but pop music lyrics are simple, repetitive, and feature themes of love, sex, and partying.
  • Rhythm: there are almost never complex, syncopated rhythms in pop music. The “four on the floor” beat (bass drum on downbeats, sometimes with hi-hat on upbeats: “nn – tsssk, nn – tsssk, nn – tssk, nn – tssk”) is very common because it encourages dancing. We relate the kick drum to our lower bodies, and the cymbals to our head-space. The drum beat then becomes a subtle psychological lesson in dancing: move your feet to the downbeats, move your upper body to the upbeats.
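
The four-on-the-floor beat described above can be sketched as a step-sequencer grid. Here’s a minimal Python illustration (the 16-step layout and the “x”/“.” notation are my own, not any real drum machine’s format):

```python
# One bar of 4/4 at sixteenth-note resolution: 16 steps.
STEPS = 16

# Four-on-the-floor: kick on every downbeat (steps 0, 4, 8, 12),
# hi-hat on every upbeat (steps 2, 6, 10, 14).
kick = [1 if step % 4 == 0 else 0 for step in range(STEPS)]
hihat = [1 if step % 4 == 2 else 0 for step in range(STEPS)]

def render(pattern, symbol):
    """Draw one pattern row the way a step sequencer would."""
    return "".join(symbol if hit else "." for hit in pattern)

print("kick :", render(kick, "x"))   # x...x...x...x...
print("hihat:", render(hihat, "x"))  # ..x...x...x...x.
```

Reading the two rows top to bottom gives the “nn – tsssk” pattern: the kick lands on every quarter note, and the hi-hat answers exactly halfway between each pair of kicks.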

Note that simplicity is not the same as minimalism. A minimalist song (perhaps just acoustic guitar and vocals) can be very sophisticated, and a busy or complex song can be very simple. Many rock songs have roughly 8-10 layers in the mix, but are still more sophisticated than pop songs–some of which have up to 40 layers in the mix by my count.

Pop music uses simple elements (lyrics, chords, structure) to make the song easy to digest, but pop songs often add many layers to fill out the sound and make it feel three-dimensional. These extra layers are often very hard to pick out of the mix unless you listen very carefully, but still add to the texture of the song. That’s how producers take a simple song and make it less boring: by adding texture.

In re-recording our song and turning it into pop, I had to pull a few quick tricks to simplify it a bit. The original song uses eight chords, which is a bit too complex for pop. Instead of throwing out perfectly well-behaved chords, I crafted a synth bass line that concentrates on just one chord, instead of moving with the guitar part. The pop version has five chords left, but it “feels” like only three because of the way the bass line plays off of those chords.

The original also features a complex rhythm in all instruments. That’s an obvious no-go for pop, so I replaced the drum part with a four-on-the-floor beat, and used the bass line to play off of the drums. One characteristic of a “tight” sounding song (as all pop songs are) is strong collaboration and synchronization between the kick drum and the bass line.

We also simplified our lyrics a little bit; in fact, we stripped out the second half of the chorus and just repeated the first half with some variation. We did keep the vocal range and melodic complexity, but that’s because we’re going for more of a Maroon 5 sound than a Ke$ha sound. Adam Levine is an incredibly talented and technical vocalist (like our Anthony Babino), so he (and we) can get away with more complex vocals.

Uniformity

Think about the flavor of a McDonald’s cheeseburger. Try to remember the taste of a Budweiser. Now try to remember how a deli sandwich from down the street tastes. While the deli sandwich may have been much better than the Quarter Pounder, your recollection of the burger’s flavor is probably much clearer. A McDonald’s burger tastes the same anywhere, whether you’re in New York, California, or Tokyo. Companies like McDonald’s and Budweiser go to great lengths to make sure every product tastes exactly the same.

Gourmets tend to hate that uniformity, but it’s used by big companies to make a product feel familiar and consistent. You know that Budweiser is an “old reliable” no matter where you are. Pop music should follow the same convention: a song should sound the same no matter how it’s played. Play a pop song and a rock song on a hi-fi speaker set, and then play those same songs out of your crappy laptop speakers. You probably “lost” a lot more from the rock song than from the pop song in the transition. The uniformity of pop songs can be broken down into two categories: dynamics and frequency response.

Dynamics

Newcomers to classical music appreciation have all made the following mistake: you load a Beethoven or Rachmaninov piece onto your hi-fi stereo system and press play. The song is very soft, so you crank up the volume and sit back and enjoy. This lasts for five minutes, until suddenly a quick crescendo hits and your ears are bleeding from the house-shaking volume. Classical music is dynamically expressive like that. So is jazz. Rock and blues are typically less dynamic than classical or jazz, and pop music is the least dynamic of all.

The dynamics of classical music are part of the experience. You’re supposed to strain to hear the quiet parts, and the loud parts are supposed to be really loud. You’ll notice that newcomers to classical music are constantly fiddling with the volume knob to even out the levels, until eventually they realize that the dynamics are part of the piece.

That volume-fiddling is a manual form of a process called “compression”. Compression reduces dynamic range by automatically making soft parts a little louder and loud parts a little softer. Pop music uses heavy compression to remove dynamic range as a factor, both to make the song a little simpler to listen to and to make it more uniformly deliverable through a range of speaker and listening setups.

Radio music is further compressed to help with transmission over the air. Fun fact: a lot of thought by recording engineers goes into what happens when someone listens to music while driving a car. Engineers put the more subtle parts in the left channel so that the driver can hear them better. An additional benefit of heavily compressing radio music is that even the softest parts of the song can still be heard over road noise, because the volume of those parts is artificially raised.
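
For the curious, the core of a compressor is just a gain curve applied to the signal’s level. Below is a minimal Python/NumPy sketch of a static compression curve. This is a simplification: real compressors add attack/release smoothing and make-up gain (which is what brings the soft parts back up), and the threshold and ratio values here are arbitrary examples:

```python
import numpy as np

def compress(signal, threshold_db=-20.0, ratio=4.0):
    """Static compression curve: reduce anything above the threshold.

    A level that sits x dB above the threshold comes out only
    x / ratio dB above it, shrinking the dynamic range.
    """
    eps = 1e-12  # avoid log(0) on silent samples
    level_db = 20.0 * np.log10(np.abs(signal) + eps)
    overshoot_db = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -overshoot_db * (1.0 - 1.0 / ratio)
    return signal * 10.0 ** (gain_db / 20.0)

loud = compress(np.array([1.0]))    # a 0 dB peak is squashed to -15 dB
quiet = compress(np.array([0.05]))  # -26 dB is below threshold: untouched
```

With a 4:1 ratio, the 20 dB of overshoot above the threshold is reduced to 5 dB, so the loudest and softest parts of a track end up much closer together.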

Frequency Response

When mixing tracks in a song, each is individually EQ’d (equalized) to control the dominant frequencies of that track. Low frequencies, between 20-250 Hz, comprise the “bass” range of a song. If you want your listeners to feel the kick drum’s thumping in their chest, you might want to exaggerate the 50 Hz frequency band. If you want more “air” in a song, you can exaggerate the 16,000 – 20,000 Hz range. If your vocalist has very defined “S” sounds (this is called “sibilance”) that you want to minimize, you may want to remove some of the 8,000 Hz frequencies from the vocal track only.

Equalizing tracks is the recording engineer’s #1 method of controlling the interplay and intricacies of the sounds in a song. It’s good practice to give each instrument or track its own little range of the frequency spectrum, so that no two instruments “step on each other’s toes”–which makes a song sound unclear or muddy.
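
The boosting and cutting described above can be sketched in a few lines of Python/NumPy. This is a deliberately crude FFT-bin version: real equalizers use parametric filters with smooth bell-shaped curves, and the band widths and gain values below are illustrative, not taken from any real mix:

```python
import numpy as np

def eq_band(signal, sample_rate, center_hz, width_hz, gain):
    """Crude FFT-bin EQ: scale the bins within width_hz/2 of center_hz.

    gain > 1 boosts that slice of the spectrum, gain < 1 cuts it.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = np.abs(freqs - center_hz) <= width_hz / 2.0
    spectrum[mask] *= gain
    return np.fft.irfft(spectrum, n=len(signal))

# Hypothetical usage, assuming 44.1 kHz tracks named kick_track/vocal_track:
#   kick  = eq_band(kick_track, 44100, 50, 40, 2.0)      # boost the thump
#   vocal = eq_band(vocal_track, 44100, 8000, 2000, 0.5) # tame sibilance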

The bass frequencies are typically the hardest to control, especially where uniformity is concerned. Speakers have different levels of efficiency in “driving” (or producing) sound at different frequencies; generally speaking, large speakers are more efficient at driving bass frequencies, and small speakers are more efficient at high frequencies. Hi-fi audio systems can have four different types (or sizes) of speakers: subwoofers (for 20 – 200 Hz), woofers (for 40 – 1,000 Hz), mid-range (300 – 5,000 Hz), and tweeters (2,000 – 20,000 Hz). Those systems are great for the people who have them, but most people listen to music in their car, through headphones, through cheap(ish) computer speakers, or through crappy laptop speakers.

You’ve probably noticed that your laptop speakers sound very “thin”–that’s because they’re tiny, and are terrible at moving air at low frequencies. Laptop speakers can produce almost no bass, and that’s disastrous for someone with the goal of making music as uniform as possible. So what’s the solution? Remove all the bass from your song! Rock musicians’ heads everywhere are spinning right now, but this is common practice in pop music.
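
As a sketch of that “remove the bass” step, here’s a brick-wall high-pass built the same crude FFT way (the 200 Hz cutoff is an arbitrary example; a real mix would use a filter with a gentler slope):

```python
import numpy as np

def remove_bass(signal, sample_rate, cutoff_hz=200.0):
    """Brick-wall high-pass: zero every FFT bin below the cutoff.

    Real mixes use gentle filter slopes instead of a hard cutoff, but
    the effect is the same: the track loses the low end that small
    speakers couldn't reproduce anyway.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))
```

Running a track through this leaves the mids and highs intact while the sub-200 Hz rumble disappears entirely, which is roughly what a laptop speaker would have done to it anyway.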

The extent of most pop music’s bass presence is limited to the kick drum, and even then there’s a fallback mechanism in place. Pop music exaggerates the kick drum’s 50 Hz range (the part you feel thumping in your chest), and also exaggerates a much higher frequency (perhaps around 4,000 Hz). Why would you bring out such a high frequency in what’s supposed to be a low frequency instrument? The high frequency component of the kick drum is where the actual sound and recognizable tone of the drum lives; by exaggerating those frequencies, you can still hear the kick drum “sound” even on laptop speakers. The laptop speakers may not thump your chest, but you’ll still recognize the distinct sound of the kick drum’s beater hitting its drum skin.

Uniformity concerns are also why you don’t hear a lot of bass guitar in pop songs. The bass guitar also lives in low frequencies, and will also be lost through crappy speakers. Most pop producers opt to use a bass synthesizer instead. Perhaps the coolest thing I noticed was in the song “Moves Like Jagger” by Maroon 5: the bass synth’s track is EQ’d so that there are no low frequencies in it, but the kick drum is EQ’d to be very resonant and full sounding–and the bass synth is synchronized to the kick drum. While the bass synth has no low frequencies of its own, it rides on top of the kick drum’s low frequency resonance, tricking the listener into thinking that the bass synth occupies the low frequencies as well. Try listening very closely, and you’ll see what I mean. Recording engineers use that technique as a type of risk management; they can’t guarantee that you’ll be listening on good speakers, so they just remove as many low frequencies as they can (in most cases, all but the kick drum), and use aural tricks to give you the illusion of fuller instruments. Very clever.

Problems

So here we are, and we’ve learned a lot along the way. It’s now time to try and create our own pop song. Here are some problems I ran into while doing this quick demo:

  • I’ve been learning about recording engineering for exactly a week, so I still suck. The track below came out OK, but it’ll definitely be a lot different in a few months when I’m better at this.
  • I had a lot of trouble with the synths. I haven’t yet found a bass synth or synth pad sound that I love, so I’ve actually ignored my own suggestion above about limiting low-end frequencies in the bass. I ended up using a bass synth that sounds less electronic and more like a bass (though it’s still somewhere in between). I hate the pad sound I ended up with in the chorus of the demo.
  • We had no clue what we wanted this song to evolve into when we recorded the guitar and vocal tracks, so it’s actually a lot slower than we want. When we re-record the full version of the song in a month or so, it’ll be about 15 BPM faster.
  • The original song was recorded in high school, when we really had no clue what we were doing. The recording quality is pretty poor as a result. The new one is poor too, but a marked improvement!
The original. Note the song’s dynamic range: loud overall, since it’s a rock song, but there’s still clearly a section that’s much softer than the rest. Compare the shape of this version’s waveform to the new version, which is much flatter and louder all the way across:

Original

The remix:

Remix