Automatic Poetic Metre Detection for Czech Verse

Authors

DOI:

https://doi.org/10.12697/smp.2024.11.1.02

Keywords:

metrical analysis, metre detection, poetry, Czech accentual-syllabic verse, Corpus of Czech Verse, KVĚTA, BiLSTM-CRF, Word2Vec

Abstract

Metrical analysis of verse, that is, analysing a poem and deciding which metre it is written in, is an essential and challenging task in versification research. Thanks to existing corpora, we can take advantage of data-driven approaches, which can be better suited to specific versification problems than rule-based systems.

This work analyses Czech accentual-syllabic verse and automatic metre assignment using the large, annotated Corpus of Czech Verse. We define the problem as a sequence tagging task and approach it using a machine learning model with many different input data configurations. For comparison, we reimplement the existing data-driven system KVĚTA.

Our results demonstrate that the bidirectional LSTM-CRF sequence tagging model, enhanced with syllable embeddings, significantly outperforms the existing KVĚTA system, with predictions achieving 99.61% syllable accuracy, 98.86% line accuracy, and 90.40% poem accuracy. The model also achieved competitive results with token embeddings. One of the most interesting findings is that the best results are obtained by inputting sequences representing whole poems instead of individual poem lines.
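
The three accuracy levels reported above can be illustrated with a minimal sketch. It assumes, since the abstract does not spell this out, that a syllable counts as correct when its predicted tag matches the gold tag, a line counts as correct only when all of its syllables are correct, and a poem counts as correct only when all of its lines are correct. The function name `accuracies` and the tag labels `"S"` (strong) and `"W"` (weak) are hypothetical and not taken from the paper.

```python
from typing import Dict, List

# A poem is a list of lines; a line is a list of per-syllable tags.
# The tag labels ("S" strong, "W" weak) are illustrative assumptions.
Poem = List[List[str]]


def accuracies(gold: List[Poem], pred: List[Poem]) -> Dict[str, float]:
    """Return syllable-, line-, and poem-level accuracy.

    Assumes gold and pred have identical shapes (same number of poems,
    lines per poem, and syllables per line).
    """
    syl_ok = syl_total = line_ok = line_total = poem_ok = 0
    for g_poem, p_poem in zip(gold, pred):
        poem_correct = True
        for g_line, p_line in zip(g_poem, p_poem):
            matches = sum(g == p for g, p in zip(g_line, p_line))
            syl_ok += matches
            syl_total += len(g_line)
            # A line is correct only if every syllable tag matches.
            line_correct = matches == len(g_line)
            line_ok += int(line_correct)
            line_total += 1
            poem_correct = poem_correct and line_correct
        # A poem is correct only if every one of its lines is correct.
        poem_ok += int(poem_correct)
    return {
        "syllable": syl_ok / syl_total,
        "line": line_ok / line_total,
        "poem": poem_ok / len(gold),
    }


# Toy data: two poems, one wrong syllable in the second line of the first.
gold = [[["S", "W", "S", "W"], ["S", "W"]], [["W", "S"]]]
pred = [[["S", "W", "S", "W"], ["S", "S"]], [["W", "S"]]]
scores = accuracies(gold, pred)
```

Under these definitions a single mistagged syllable invalidates its whole line and its whole poem, so poem accuracy can never exceed line accuracy, which in turn can never exceed syllable accuracy. This is consistent with the ordering of the three figures reported above.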

Published

2024-08-26

Issue

Section

Articles