Largest Vocabulary in Rock

Yesterday I saw “The Largest Vocabulary in Hip-Hop” and thought it’d be cool to do the same thing for classic rock artists. I tried to collect 35,000 words (the same as the Hip-Hop article), but had to limit it to 17,000 (taken randomly) because I couldn’t get more lyrics for Led Zeppelin; they only have ~70 short songs on A-Z Lyrics. Without further ado:

[Figure: Rock_Vocab, each artist’s average words per song vs. number of unique words used]

The y-axis shows each artist’s average words per song, and the x-axis shows the number of unique words used. My first thought: how did Kiss manage to use so few words? When I was learning Spanish last year, I read that you need to know ~2,000 words just to converse on a basic level (about where the average rock band lands), and Kiss used fewer than 1,400! This makes sense to some degree: Kiss has a heavy metal style that focuses on instrumentals and raw energy. Whether you think Kiss’ lyrics are inarticulate or pithy, it’s impressive either way.

On the other end of the spectrum is Pink Floyd, who use the most unique words, which reflects their descriptive, dark lyrics. Unfortunately I could only get ~19,000 words for Pink Floyd, so it’s difficult to see where they would fit in with the rappers. We can estimate it by taking the percentage increase in unique words between 17,000 and 35,000 total words, using Metallica as a model. The following graph shows the relationship between unique words and total words for Metallica:

[Figure: unique words vs. total words for Metallica]
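For anyone who wants to draw a curve like this themselves, here is a minimal sketch of how the unique-word count could be tracked as the total word count grows. It is not the code behind the graph; the function name `vocabulary_growth` and the `step` parameter are made up for illustration, and it assumes the lyrics have already been split into a list of normalized words.

```python
def vocabulary_growth(words, step=1000):
    """Return (total_words, unique_words) pairs, one every `step` words.

    `words` is assumed to be a list of already-normalized tokens.
    The final point is always included, even if it is not a multiple of `step`.
    """
    seen = set()
    curve = []
    for i, word in enumerate(words, start=1):
        seen.add(word)
        if i % step == 0 or i == len(words):
            curve.append((i, len(seen)))
    return curve


if __name__ == "__main__":
    toy_lyrics = "nothing else matters and nothing else will do".split()
    for total, unique in vocabulary_growth(toy_lyrics, step=2):
        print(total, unique)
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the kind of curve described here.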

Metallica’s unique words increased from ~2,250 at 17,000 total words to ~3,500 at 34,990 (a ~55% increase). Compared to the rappers, that places them somewhere around the 10% mark (just behind Drake). Metallica is a pretty typical band in this sample, so the other artists would probably fall somewhere similar. Applying the same ~55% increase to Pink Floyd gives an extrapolated ~4,120 unique words, which is around the ~25% mark among rappers (just ahead of T.I.).
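The arithmetic behind that extrapolation can be sketched in a few lines. The Metallica figures are the approximate ones quoted above; the Pink Floyd starting count is not stated in the post, so the 2,650 used here is an assumption picked only because it lands close to the ~4,120 result.

```python
# Approximate figures quoted in the post for Metallica.
METALLICA_UNIQUE_AT_17K = 2250
METALLICA_UNIQUE_AT_35K = 3500

# ASSUMPTION: Pink Floyd's unique-word count at 17,000 words is not given
# in the post; 2,650 is a hypothetical value for illustration only.
PINK_FLOYD_UNIQUE_AT_17K = 2650


def growth_ratio(unique_small, unique_large):
    """Ratio of unique words in the larger sample to the smaller one."""
    return unique_large / unique_small


def extrapolate_unique(unique_at_17k, ratio):
    """Scale a 17,000-word unique count by the model artist's growth ratio."""
    return unique_at_17k * ratio


ratio = growth_ratio(METALLICA_UNIQUE_AT_17K, METALLICA_UNIQUE_AT_35K)
percent_increase = (ratio - 1) * 100
pink_floyd_estimate = extrapolate_unique(PINK_FLOYD_UNIQUE_AT_17K, ratio)

if __name__ == "__main__":
    print(f"Metallica growth: {percent_increase:.1f}%")
    print(f"Pink Floyd estimate at ~35,000 words: {pink_floyd_estimate:.0f}")
```

This treats every artist’s vocabulary as growing at Metallica’s rate, which is a simplification: artists with more repetitive lyrics would likely grow more slowly.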

Key takeaway: classic rock artists generally do not use a robust vocabulary in their lyrics; they instead rely on awesome instrumentals, vocals, and energy. In my next blog post I’ll analyze each artist’s lyrics in depth!

28 thoughts on “Largest Vocabulary in Rock”

  1. With a poet, Robert Hunter, writing many of their lyrics, I would bet that the Grateful Dead also are pretty high in unique word count.


  2. […] Rock had the widest vocabulary of any rapper. Now, a new researcher named Brian Chesley has performed a similar experiment, only with rock […]



  4. Seriously, would really like to see where RUSH would be on this chart. Can’t imagine they wouldn’t be both high in avg. words per song and unique words per song. With a literary, poetic genius like Neil Peart, it would be extremely surprising if they weren’t.


  5. […] Brian Chesley, inspired by an article on the largest vocabulary in hip-hop, decided to examine how things look in the world of rock and roll. Research shows that on average we need about 2,000 words to communicate effectively with others. 17,000 words were collected (chosen at random), and it turned out that within that number Metallica uses about 2,250 unique words in its lyrics. If we take a larger number of words, 34,990, the number of unique words rises to 3,500. This is an above-average result compared with other rock bands, such as Bon Jovi (about 1,800 words) or Kiss (about 1,400 unique words). Compared against the earlier study of hip-hop artists, Metallica’s result would drop to 10% (just behind the rapper Drake). Pink Floyd did exceptionally well in the study, with about 2,600. Below is a chart for the individual bands. […]


  6. Pasting in all 45,000 or so of the words from the KISS lyrics section on AZlyrics.com gives a unique word count of over 6,000 now.

    Brian, I can send you the file of all the KISS lyrics if you like. I’m VERY curious about how you are getting the unique word count total. Just from my own lifetime of listening a lot to KISS and AC/DC, I’m very surprised to see AC/DC have a higher count.


    • Peter:

      Thanks for checking the data! A few things to consider:

      1. I limited the sample to 17,000 so I could compare all artists
      2. I made all words lowercase
      3. I removed special characters
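The three steps above can be sketched roughly as follows. This is not the source code mentioned in this thread, just an illustration: the function names are hypothetical, the exact definition of a "special character" is an assumption (here, anything other than a-z, 0-9, and whitespace), and sampling individual words at random is only one possible reading of how the 17,000 words were taken.

```python
import random
import re


def tokenize(lyrics):
    """Lowercase the text, drop special characters, and split into words."""
    lowered = lyrics.lower()
    # ASSUMPTION: apostrophes (straight or curly) are deleted so that
    # "don't" becomes "dont" rather than splitting into "don" and "t".
    no_apostrophes = re.sub(r"['’]", "", lowered)
    # Everything else that is not a-z, 0-9, or whitespace becomes a space,
    # so words on either side of punctuation or a stanza break stay separate.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", no_apostrophes)
    return cleaned.split()


def unique_words_in_sample(words, sample_size=17000, seed=0):
    """Count unique words in a random sample of at most `sample_size` words."""
    if len(words) > sample_size:
        # ASSUMPTION: a simple random sample of individual words.
        words = random.Random(seed).sample(words, sample_size)
    return len(set(words))


if __name__ == "__main__":
    tokens = tokenize("I love it,\n\ni LOVE rock-n-roll!")
    print(tokens)
    print(unique_words_in_sample(tokens))
```

Because everything is lowercased and whitespace of any kind separates words, “i” and “I” count as one word and words on either side of a stanza break are not glued together.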

      I looked at the website you use, and it doesn’t do a good job of word tokenization, especially when you paste in text (e.g. it thinks “i” is different from “I”, it combines words separated by stanza breaks into a single unique word, etc.).

      I’m planning on releasing my source code next week!


      • Oh, that website flat-out sucks, but it’s the only thing I could find that would do this task. I’m not a programmer, so it’s not like I could make something. I think it would be very interesting to put the full load of lyrics by KISS through your program though.


  7. Just because he says normal words and then adds ‘yah!’ or ‘ah!’ or ‘ee-yah!’ or ‘haaay!’ or ‘oooh!’ at the end of them really fast – doesn’t make them unique words. Jus’ sayin’.

