Data science & Uncarnate lyrics

Any geeks out there?

Adam (composer and producer for Uncarnate) is a data scientist and Natural Language Processing enthusiast in his professional life, which inspired him to run a simple analysis of Wojtek’s lyrics. The question was:

Which words are most characteristic of Uncarnate lyrics? Which words stand out most visibly when compared to “standard English”?

Perhaps this question would be best answered by a linguist. In the absence of one, we resorted to simple Natural Language Processing and basic text mining and data analysis techniques. Armed with Python, the spaCy NLP library and a Jupyter notebook, we left no mystery unsolved. Below you can see the most characteristic words from Wojtek’s writings. The bigger a word, the more it departs from common English vocabulary.

[Image: word cloud of the most characteristic words in Uncarnate lyrics (uncarnate-protein-lyrics-wordcloud)]
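For the curious, here is a rough idea of how words can be scored against “standard English”. This is only a minimal sketch and not necessarily what the notebook does: it assumes a plain regex tokenizer instead of spaCy, and it scores each word by the smoothed log-ratio of its relative frequency in the lyrics versus a reference text. The function names (`tokenize`, `characteristic_words`) and both toy texts are made up for illustration; they are not actual lyrics or a real reference corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a simple regex tokenizer stands in for spaCy here.
    # Lowercase the text and keep alphabetic tokens (inner apostrophes allowed).
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def characteristic_words(lyrics, reference, top_n=10):
    """Rank lyric words by smoothed log-ratio of relative frequencies."""
    lyr = Counter(tokenize(lyrics))
    ref = Counter(tokenize(reference))
    vocab = set(lyr) | set(ref)
    # Add-one smoothing so words missing from the reference do not divide by zero.
    lyr_total = sum(lyr.values()) + len(vocab)
    ref_total = sum(ref.values()) + len(vocab)
    scores = {}
    for word in lyr:
        p_lyr = (lyr[word] + 1) / lyr_total
        p_ref = (ref[word] + 1) / ref_total
        scores[word] = math.log(p_lyr / p_ref)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Invented toy texts, for illustration only.
toy_lyrics = "protein decay protein flesh the void the protein"
toy_reference = "the cat sat on the mat and the dog ate the food"

ranked = characteristic_words(toy_lyrics, toy_reference, top_n=3)
for word, score in ranked:
    print(f"{word}: {score:.2f}")
```

A positive score means the word is relatively more frequent in the lyrics than in the reference text, and a negative score means the opposite. In a word cloud, such scores could be mapped to font size.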

Here is the full record of the exercise: https://github.com/adam-ra/text-mining-exercises/blob/master/Uncarnate_lyrics_analysis.ipynb
