Hip-hop has always been a contentious conversation, an everlasting discussion of GOATs and WOATs, Ws and Ls. Taste is just opinion in practice, to be sure; but underneath all the lip-flapping and one-upping, there’s gotta be more than hot air. In an era where fact checkers can veto the bullshit with a quick Google search, you can’t just speak on how you feel – you have to back it up.
Numbers don’t lie, and Liban Ali Yusuf’s got the data. As a graduate student studying computational linguistics, the 24-year-old Waterloo, Canada native runs the website RapMetrics, a statistical take on hip-hop’s most persistent debates. It turns out that certain lyrical factors are measurable – so does that mean certain criticisms are incontrovertible? RESPECT. sat down with hip-hop’s Malcolm Gladwell to find out what linguistics can teach us about rap, and what that says about Drake’s self-obsession.
How did [RapMetrics] start?
It was mostly because I followed sports a lot, and I follow sports statistics a lot. There’s a certain way that people argue their points, sports-wise, but that doesn’t really exist with music, and it was always weird to me. I thought a lot of people said a lot of things that were very unsubstantiated, and if you do actual research into it, you find that it’s usually not right.
What do you see as the difference between hip-hop and sports in that respect?
Even when you watch [sports] on TV, you’re being inundated with numbers and stats and everything. The fans in sports have a better general understanding of simple statistics and what each statistic means, to what a player does and what a player can’t do.
Isn’t the problem that sports are something physical – we can count baskets and we can measure your RBI – but with music, it’s art; the point of art is it’s absurd, it’s ambiguous, it’s not quantifiable?
Sure, but again, it’s not so much that with the RapMetrics project I’m trying to say someone is better than someone or someone is worse than someone. It’s more trying to get an understanding of what somebody does. I think my favorite example is the simple Canibus one. Would you call yourself a fan?
No, I would not call myself a fan.
A lot of people would say he uses too big of words. And that’s a thing that’s quantifiable, something like syllables-per-word. You can look at how big the syllables-per-word are in general for a general rapper, and look at Canibus. For general rappers, it’s between 1.25 and 1.3, and then guys like Canibus and Immortal Technique, these guys are in the 1.4, 1.5 range. So you have a hard statistic that you can point to and show that the complaints that people have with how these guys rap, it’s based in something quantifiable.
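A syllables-per-word figure like the one described above could be approximated as follows. This is a minimal sketch, not the RapMetrics implementation: the vowel-group syllable counter is a crude heuristic assumed here for illustration, and the function names are hypothetical.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent 'e' (as in "rhyme") as non-syllabic,
    # but keep the final syllable in words like "simple".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def syllables_per_word(lyrics: str) -> float:
    """Average syllables per word over a block of lyrics."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", lyrics)
    if not words:
        return 0.0
    return sum(count_syllables(w) for w in words) / len(words)

print(syllables_per_word("I got the data"))
```

A real analysis would likely use a pronunciation dictionary rather than spelling, since English spelling is an unreliable guide to syllable count.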
So that wordy, pedantic, holier-than-thou kind of rap – you’re saying that we can quantify that? We can see just how much of a smug asshole a rapper is being?
The smug asshole index. The longer the words are, the more people are like, “Clearly this guy could say what he’s saying in a much simpler way, and he’s just being an asshole about it.” The guys who rhyme too much, those are the guys who actually really bothered me and inspired me to make my rhyming dictionary. These guys, they become good rhymers or whatever, but they derive their whole value as artists from their ability to rhyme. With the RapMetrics project, the rhyme dictionary, I got all these three-, four-syllable rhymes that to me matches up pretty well with what any of these guys do – these hardcore rhymer guys who’ve been doing it their whole lives. It’s just cool to show that rhyming is technical but it’s reproducible. The artistry should not come just from being a hardcore rhymer; it should come from somewhere else. And those are the guys I really dislike, and I really wanted to show that what they do is not that impressive. It can be done on a computer, and if it can be done on a computer, that should put pressure on them to be better than they are.
But [artistry is] unmeasurable, isn’t it?
Yeah, of course. I would never say this guy is a better artist than this guy because of this number. I think that’s ridiculous. I just like the whole idea of being able to have numbers inform your opinions. There’s a Pitchfork article I read about Drake’s first album. They were talking about how he’s really into himself and whatever. It’s a fair critique, I guess. They were talking about how often he used the pronoun ‘I’ – like 400 times in the album. On top of that, there’s research out there that says the more often someone uses words like ‘I’ in their writing – or ‘me,’ ‘myself’ – it points towards them being narcissistic. It’s controversial research. I don’t know if I believe it or not; it might be bullshit. I just had to apply it to rap, and you find that Drake has the highest ratio of ‘I’ compared to ‘them,’ ‘they,’ ‘her,’ ‘us,’ whatever.
So Drake’s ‘I’ to ‘you’ ratio was off the charts.
It was the highest. And another funny thing is Nas was actually super high on Illmatic, and then he started dipping down – which goes with the theory that says Illmatic is his best album because he’s telling personal stories, and as he started going towards concept albums, he lost some of the reason why people liked him so much. I’m not a huge Nas fan, but it seems reasonable.
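The pronoun comparison described above can be sketched roughly like this. The two word lists and the function name are assumptions made for illustration; the interview does not say exactly which pronouns were counted or how the ratio was normalized.

```python
import re

# Hypothetical word lists; the interview names only a few of these.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
OTHER_PRONOUNS = {"you", "he", "she", "him", "her", "we", "us", "they", "them"}

def first_person_share(lyrics: str) -> float:
    """Share of counted pronouns that are first-person singular."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", lyrics.lower())
    # Reduce contractions to their stem, so "i'm" counts as "i".
    stems = [t.split("'")[0] for t in tokens]
    first = sum(1 for s in stems if s in FIRST_PERSON)
    other = sum(1 for s in stems if s in OTHER_PRONOUNS)
    total = first + other
    return first / total if total else 0.0

print(first_person_share("I'm on my way, they know me"))
```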
Let’s talk about some of your other findings. Going back to the whole Canibus thing, let’s take it to the opposite end of the spectrum and talk about the swag, the weirdo rap: somebody like Lil B, who people say doesn’t even rhyme, or Riff Raff, who’s in it for the attention – it’s not about the rap, it’s about the image. What kind of numbers can we come up with to really show what they’re doing linguistically?
The swag rap is interesting because it follows the people’s perceptions for the most part; it has really low syllable-per-word. And there’s another metric: you check how many unique words there are generally – how many words there are total, and how many of them are unique. If you’re using the same words more often, people are gonna associate that with simplicity, even though I don’t agree with that necessarily, but it’s how people perceive things. The swag rap, Lil B or whatever, he’s really low syllables-per-word and really low unique word percentage. So that fits; the perception and the stats go together. I think Riff Raff is actually a good internal rhymer.
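The unique-word percentage mentioned above is the number of distinct words divided by the total number of words. A minimal sketch, with a hypothetical function name and a simple tokenizer assumed for illustration:

```python
import re

def unique_word_ratio(lyrics: str) -> float:
    """Distinct words divided by total words (case-insensitive)."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", lyrics.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)

print(unique_word_ratio("yeah yeah yeah swag"))
```

This ratio tends to fall as a text gets longer, so comparing artists fairly would probably mean using samples of similar length.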
There’s a lot of interesting stuff about your site, like how you break down race linguistically, which has always been a controversial issue. What have you been able to find in the subject of race linguistically, in hip-hop?
First of all, it works as a classification tool. I developed a little thing: if you input lyrics, you can do a pretty good job of understanding if it’s a white artist or if it’s a black artist.
G-dropping is a sociolinguistic technique. What they found is when you say a word that has the suffix ‘-ing’ – so ‘running,’ ‘speaking,’ or anything like that – people of a lower social class will drop their gs more often – so they’ll say ‘runnin’’ or ‘speakin’.’ What I found is from the rap corpus, the black artists drop their gs at a much higher rate than the white artists.
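A g-dropping rate could be estimated from transcribed lyrics along these lines. This is only a sketch under stated assumptions: it relies on the transcriber having written dropped forms with an apostrophe ("runnin'"), and it will miscount words such as "thing" that end in "ing" without carrying the suffix. The function name is hypothetical.

```python
import re

def g_drop_rate(lyrics: str) -> float:
    """Fraction of '-ing' candidates that are written as "-in'"."""
    # Normalize curly apostrophes so "runnin’" and "runnin'" match alike.
    text = lyrics.lower().replace("\u2019", "'")
    dropped = re.findall(r"\b[a-z]{2,}in'", text)   # e.g. "runnin'"
    kept = re.findall(r"\b[a-z]{2,}ing\b", text)    # e.g. "running"
    total = len(dropped) + len(kept)
    return len(dropped) / total if total else 0.0

print(g_drop_rate("I'm runnin' and speakin' while you keep talking"))
```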
If you just looked at a verse – a nameless, untitled verse – could you tell me if that rapper was likely to be white or black?
Oh yeah, pretty well. Most likely I could tell you with 70-, 80-percent accuracy. It’s how text classification works. You know how your email does a pretty good job of separating your spam and your real email? How that works is it’s simple text classification. It’s not that complicated. Let’s say you have 10,000 emails of hard data, and you know 5,000 are spam and you know 5,000 are good, real emails. You find certain words are more likely to occur in spam email than they are in real email and vice versa. Words like ‘Viagra’ – if that’s included in the email, it’s much more likely to be a spam email than a real email. Or if the word is ‘free’ in capital letters with an exclamation mark.
So you do the same thing with an artist. I think I did 2,000 songs on each side of white and black, and then you find that certain words, of course, are more likely to occur. For example, the word ‘nigger’ – I’m a black guy, it’s okay, I can say it – that’s more likely to occur on the black side. Same with the word ‘Impala,’ ‘Houston,’ ‘Henny’ – a lot of alcohols too – ‘Rose,’ ‘Moet’; and then on the white side, it’s these bigger words, no g-dropping generally. ‘Intestine,’ I think, was one.
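The spam-filter analogy above can be sketched with a small word-frequency classifier. The interview only says "simple text classification"; the multinomial Naive Bayes model with add-one smoothing used here is one common choice and is an assumption, as are the class name and the four made-up training emails.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, documents):
        for text, label in documents:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log word likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up toy data, for illustration only.
model = NaiveBayes()
model.train([
    ("FREE viagra now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes for monday", "real"),
    ("lunch on monday?", "real"),
])
print(model.predict("free viagra"))
```

The same structure would apply to lyrics: swap the emails for labeled songs and the model learns which words are more likely on each side.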
I saw you break down Mac Miller’s video, which was fuckin’ hilarious, ‘cause there’s other ways [to numerically analyze race] besides just looking at lyrics. You’ve used stuff like YouTube comments.
YouTube comments are really good.
You think they’re accurate samples of the average American music listener?
Sure. The general thing that people say about YouTube comments is, “Oh, those people are retarded. Those people are stupid.” But these are people who are not checking their opinions at the door. It’s what they feel at the exact moment – bam, it’s out there. And I like that. I just like good clean data, and that’s good clean data. I remember I did the one for when you look at misspelled words. That ends up being a really good indicator to separate the super street guys from the super backpackery guys. I know it’s kind of reductive, I get all that, but it’s a good dirty tool to separate artists on extreme ends, which I think is valuable.
I think you found out that Immortal Technique fans like to get in arguments over political –
Yeah, that was the political one. That’s simple too. I read a paper – this was during the 2004 elections – they got a bunch of blogs, all these conservative and Democratic blogs, so what they found was…Democrats are a lot more prone to reply to a Republican than a fellow Democrat, and vice versa.
So how did you apply that to the YouTube comments?
If you look at these YouTube videos, and you find out how many of the posts are replies and how many are just posts by themselves, you find that the top guys in that metric – the guys whose fans are replying and talking to each other – they’re the guys who inspire all this crazy debate, like Jedi Mind Tricks and Immortal Technique. MF Doom, though, MF Doom didn’t make sense to me, but I guess it’s people saying, “This is real hip-hop,” and other people saying whatever.
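The reply metric described above is the share of comments that respond to another comment. A minimal sketch follows; the Comment record and its reply_to field are hypothetical stand-ins for whatever the scraped comment data actually looked like.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Comment:
    author: str
    text: str
    reply_to: Optional[str] = None  # hypothetical: author being replied to

def reply_ratio(comments: List[Comment]) -> float:
    """Fraction of comments that are replies rather than standalone posts."""
    if not comments:
        return 0.0
    replies = sum(1 for c in comments if c.reply_to is not None)
    return replies / len(comments)

# Made-up sample thread, for illustration only.
sample = [
    Comment("a", "This is real hip-hop"),
    Comment("b", "No it isn't", reply_to="a"),
    Comment("c", "First"),
    Comment("d", "Classic track"),
]
print(reply_ratio(sample))
```

Ranking artists by this ratio across their videos would then surface the ones whose comment sections are mostly arguments.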
For music criticism – we were talking about Pitchfork earlier – what does RapMetrics give critics? How can it be a tool for them to deploy?
I think it’s good because you get some sort of historical context on things. I just don’t like when people say, “This is the worst ever,” “This is the best ever,” or “This is the first time this happened,” blah, blah, blah. There’s data out there, and there are a few guys who are legit rap historians, who are really good at checking their opinions, but for the most part, I think people need to do a better job of checking their opinions at the door, taking a breather and figuring things out, and then making the articles. There aren’t that many legit rap critics that inspire me to do good work.
Some guy named Brandon Soderberg, he wrote a scathing article about Lupe’s new video, “Bitch Bad.” The guy, he’s mad about Lupe because Lupe was writing a song about women’s issues, but then if you go back and you look at Lupe’s history and how he talks about women, compared to most rappers, the guy is ridiculous. He barely uses words like ‘bitch’ or ‘ho’ or says “Suck my dick” or any of that kind of shit. He used the word ‘bitch’ like two times, and only one time where he’s actually talking about a woman directly. I think if you look at him and how he talks about women in general, I think he has a right to make any kind of song like that, because historically, he’s been really good at not talking about women in a negative way. I think critics should do a better job of having good hard data to make their points.
Do you consider yourself a critic?
Yes, I would consider myself a critic, I guess. I do think criticism is a legit field, and I think it matters, but I don’t know if I have the quality training. I’m not a historian by any measure; I probably couldn’t tell you shit about most ‘80s rap. I think that’s necessary to be a real critic. So I don’t know.
Does RapMetrics take away a little bit of the magic?
Well, fuck you.
Sorry. But I think if you can love something, you can analyze it, and if you still come away and you love it, then that means you loved it from the very beginning. And if you can analyze something and you don’t love it afterwards, then maybe the emotion you had initially wasn’t all there.
What about RapGenius?
Those guys, they seem like cool dudes, but I think their currency is not in legit criticism. Their currency is in getting rappers to love them. There’s that Will Staley piece that was really good about RapGenius. I thought he did a good job. He made the argument that not every lyric needs an explanation, and at a certain point it becomes pedantic to need to explain it like that. And also, there’s no other medium that would need that; a novel’s not gonna have a line-by-line explanation.
You don’t know about No Fear Shakespeare? Line for line, they translate Shakespeare from Old English to modern English.
Hell yeah, how you think I graduated high school?
That’s an academic endeavor. I understand that tool. But I think a lot of people, if you’re trying to explain line for line, if that’s coming from the emotion that rap needs that, you can’t just… It’s tough, man, I don’t know. I’m guilty of the same thing at the same time, so I can’t really talk shit. I do that kind of analysis too. So I don’t know.
And Will Staley did give you a nice compliment in that piece, too.
Yeah, ‘cause my currency is hard work and hard criticism. And [RapGenius’s] currency is getting retweets from T-Pain.
You do need to get your followers up though. I saw @RapMetrics has like 40 followers.
Social media, I have no idea how to work it.
It makes no sense to you, but you can mine it for analysis and data.
I can mine it for analysis, I can be detached from the whole thing, but I can’t play the game.
What [variables] are you a sucker for? What RapMetrics parameters do you find yourself leaning one way or another?
I like when the rhyme density is good, and I like when the syllables-per-word are low.
That’s the Kanye, the Mase theory.
Cam’ron and Fabolous too. Fabolous has the biggest five-syllable – he’s the biggest rhymer – five-syllable rhymes where almost every syllable is stressed. He had that one line like, “My guns go click and spark, / Something, something, 106 and Park.”
You mentioned that you’re compiling an e-book to present RapMetrics in a different format. Is that still on your agenda?
Yeah, that would be a good thing for me to do, just to get all the ideas down and make it fun and accessible as a read.
Like a hip-hop Malcolm Gladwell.
Yes. Even though Malcolm Gladwell’s kind of an asshole himself. I love Malcolm Gladwell. Right beside my bed right now is Blink; I was reading it last night. I like how he writes and how he tells his stories. There’s so many academic people who are doing awesome research, and just because of who they are as people, their research will never get out to people. And so you need people in the middle like this to explain what awesome projects are out there.
That’s why Bill Nye and Neil deGrasse Tyson exist.