A Comparative Assessment of State-Of-The-Art Methods for Multilingual Unsupervised Keyphrase Extraction

Nikolaos Giarelis; Nikos Kanakaris; Nikos Karacapilidis

doi:10.1007/978-3-030-79150-6_50

Conference Papers Year : 2021


Nikolaos Giarelis (Author, PersonId: 1105446), University of Patras
Nikos Kanakaris (Author, PersonId: 1105447), University of Patras
Nikos Karacapilidis (Author, PersonId: 1033582), University of Patras

Abstract

Keyphrase extraction is a fundamental task in information management and is often used as a preliminary step in various information retrieval and natural language processing tasks. The main contribution of this paper is a comparative assessment of prominent multilingual unsupervised keyphrase extraction methods, covering statistical (RAKE, YAKE), graph-based (TextRank, SingleRank) and deep learning (KeyBERT) approaches. For the experiments reported in this paper, we employ well-known keyphrase extraction datasets in five different natural languages (English, French, Spanish, Portuguese and Polish). We use the F1 score and a partial match evaluation framework to investigate whether the number of terms in each document and the language of each dataset affect the accuracy of the selected methods. Our experimental results yield a set of insights into the suitability of the selected methods for texts of different sizes, as well as their performance on datasets in different languages.
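To make the evaluation idea in the abstract concrete, the following is a minimal sketch of a partial-match F1 score. It assumes one common reading of "partial match": predicted and gold keyphrases are split into lowercase words and compared as sets. The abstract does not specify the exact scheme (for example, whether stemming is applied), so the function name `partial_match_f1` and the tokenization are illustrative assumptions, not the authors' implementation.

```python
def partial_match_f1(predicted, gold):
    """Token-level F1 between predicted and gold keyphrases.

    Assumption (not from the paper): keyphrases are split on whitespace,
    lowercased, and pooled into word sets; no stemming is applied.
    """
    pred_tokens = {w for phrase in predicted for w in phrase.lower().split()}
    gold_tokens = {w for phrase in gold for w in phrase.lower().split()}
    if not pred_tokens or not gold_tokens:
        return 0.0
    overlap = len(pred_tokens & gold_tokens)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)  # share of predicted words that are correct
    recall = overlap / len(gold_tokens)     # share of gold words that were recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data, not taken from any dataset in the paper.
predicted = ["keyphrase extraction", "graph ranking"]
gold = ["unsupervised keyphrase extraction", "text ranking"]
score = partial_match_f1(predicted, gold)
```

Under exact matching, none of the predicted phrases above equals a gold phrase, so the score would be zero; the word-level comparison still credits the shared words, which is the motivation usually given for partial-match evaluation.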

Domains

Computer Science [cs]

Main file

509922_1_En_50_Chapter.pdf (268.81 KB)

Origin: Files produced by the author(s)
Licence: CC BY 4.0 - Attribution

https://inria.hal.science/hal-03287681

Submitted on: Thursday, July 15, 2021, 6:10:48 PM

Last modification on: Wednesday, February 15, 2023, 4:16:04 AM

Long-term archiving on: Saturday, October 16, 2021, 7:06:20 PM

Dates and versions

hal-03287681 , version 1 (15-07-2021)

Licence

CC BY 4.0 - Attribution

Identifiers

HAL Id : hal-03287681 , version 1
DOI : 10.1007/978-3-030-79150-6_50

Cite

Nikolaos Giarelis, Nikos Kanakaris, Nikos Karacapilidis. A Comparative Assessment of State-Of-The-Art Methods for Multilingual Unsupervised Keyphrase Extraction. 17th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), Jun 2021, Hersonissos, Crete, Greece. pp.635-645, ⟨10.1007/978-3-030-79150-6_50⟩. ⟨hal-03287681⟩
