Versification and authorship attribution. A pilot study on Czech, German, Spanish, and English poetry
Keywords: authorship attribution, stylometry, versification, Czech verse, German verse, Spanish verse, English verse
This article describes pilot experiments performed as part of a long-term project examining whether versification analysis can be used to determine the authorship of poetic texts. Because the article is addressed to both stylometry experts and experts in the study of verse, we first introduce in detail the classifiers commonly used in contemporary stylometry (Burrows’ Delta, Argamon’s Quadratic Delta, Smith-Aldridge’s Cosine Delta, and the Support Vector Machine) and explain how they work by means of graphic examples. We then evaluate the performance of these classifiers when applied to versification features found in Czech, German, Spanish, and English poetry. We conclude that versification is a reasonable stylometric marker whose strength is comparable to that of the markers traditionally used in stylometry (such as the frequencies of the most frequent words and the frequencies of the most frequent character n-grams).