Improving Part-of-Speech Tagging of Historical Text by First Translating to Modern Text - Computational History and Data-Driven Humanities
Conference paper, 2016

Erik Sang

Abstract

We explore the task of automatically assigning syntactic tags (known as part-of-speech tags), such as Noun and Verb, to words in seventeenth-century Dutch text. Tools exist for performing this task on modern texts, but they perform poorly on historical texts because the language has changed. We test several methods for translating the words in the historical text to modern equivalents before applying the tagging tools. We show that this additional translation step improves the quality of the automatic syntactic analysis. Further improvements are possible when the lexicons and text collections used for developing the translation process are extended in size.
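The translate-then-tag idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the lexicon entries, the toy tagger, the tag names other than Noun and Verb, and the function names `translate`, `toy_tagger`, and `tag_historical` are hypothetical and are not taken from the paper, which tests several translation methods and uses existing tagging tools.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical historical-to-modern word lexicon (illustrative entries only).
LEXICON: Dict[str, str] = {
    "ick": "ik",
    "sie": "zie",
    "groote": "grote",
    "stadt": "stad",
}

# Toy stand-in for a tagger trained on modern text: it only knows modern forms.
MODERN_TAGS: Dict[str, str] = {
    "ik": "Pron",
    "zie": "Verb",
    "de": "Det",
    "grote": "Adj",
    "stad": "Noun",
}


def translate(tokens: List[str], lexicon: Dict[str, str]) -> List[str]:
    """Replace each historical token by its modern equivalent when the
    lexicon has one; tokens without an entry are passed through unchanged."""
    return [lexicon.get(token.lower(), token) for token in tokens]


def toy_tagger(tokens: List[str]) -> List[str]:
    """Assign a tag to each token, or 'Unknown' for unseen word forms."""
    return [MODERN_TAGS.get(token.lower(), "Unknown") for token in tokens]


def tag_historical(
    tokens: List[str],
    lexicon: Dict[str, str],
    tagger: Callable[[List[str]], List[str]],
) -> List[Tuple[str, str]]:
    """Translate first, tag the modern forms, then pair each tag
    with the original historical token."""
    modern = translate(tokens, lexicon)
    tags = tagger(modern)
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sentence = ["Ick", "sie", "de", "groote", "stadt"]
    print("direct:    ", list(zip(sentence, toy_tagger(sentence))))
    print("translated:", tag_historical(sentence, LEXICON, toy_tagger))
```

In this sketch the tags are attached to the original historical tokens, so the translation acts only as an intermediate step and the text itself is left unchanged. A word-for-word lexicon lookup is just one simple way to translate; it is used here because it keeps the tokens aligned one to one.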
Origin: Files produced by the author(s)

Dates and versions

hal-01616302 , version 1 (13-10-2017)

Cite

Erik Sang. Improving Part-of-Speech Tagging of Historical Text by First Translating to Modern Text. 2nd International Workshop on Computational History and Data-Driven Humanities (CHDDH), May 2016, Dublin, Ireland. pp.54-64, ⟨10.1007/978-3-319-46224-0_6⟩. ⟨hal-01616302⟩