Any geeks out there?
Adam (composer & producer for Uncarnate) is a data scientist and Natural Language Processing enthusiast in his professional life, which inspired him to run a simple analysis of Wojtek’s lyrics. The question was:
Which words are most characteristic of Uncarnate lyrics? Which words stand out most visibly when compared to “standard English”?
Perhaps this question would be best answered by a linguist. In the absence of one, we resorted to simple Natural Language Processing and basic text mining / data analysis techniques. Armed with Python, the spaCy NLP library and a Jupyter notebook, there was no mystery left. Below you can see the most characteristic words from Wojtek’s writings. The bigger a word, the more it departs from common English vocabulary.
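For the curious, the general idea can be sketched in a few lines of plain Python. This is only an illustration and not necessarily the method used in the notebook: it assumes a smoothed log-ratio of word frequencies as the "characteristic" score, uses a naive regex tokenizer instead of spaCy, and runs on two made-up toy texts standing in for the lyrics and for "standard English". The names `tokenize` and `characteristic_words` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def characteristic_words(target_text, reference_text, top_n=5, min_count=2):
    """Rank words by how over-represented they are in target_text.

    Score = log(smoothed relative frequency in target /
                smoothed relative frequency in reference).
    Add-one smoothing keeps words unseen in the reference from
    causing a division by zero.
    """
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    vocab_size = len(set(target) | set(reference))
    target_total = sum(target.values())
    reference_total = sum(reference.values())

    scores = {}
    for word, count in target.items():
        if count < min_count:
            continue  # skip very rare words, whose ratios are unreliable
        p_target = (count + 1) / (target_total + vocab_size)
        p_reference = (reference[word] + 1) / (reference_total + vocab_size)
        scores[word] = math.log(p_target / p_reference)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Made-up toy texts, purely for illustration (not real lyrics or a real corpus).
lyrics = "the void devours the void and the ashes fall into the void of ashes"
reference = "the cat sat on the mat and the dog sat on the rug and the sun set on the hill"

ranking = characteristic_words(lyrics, reference)
for word, score in ranking:
    print(f"{word}: {score:.2f}")
```

A positive score means the word is more frequent in the lyrics than in the reference text, a negative one means it is less frequent. In a word cloud, a score like this could drive the font size of each word.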
Here is the full record of the exercise: https://github.com/adam-ra/text-mining-exercises/blob/master/Uncarnate_lyrics_analysis.ipynb