Interpretable Topic Extraction and Word Embedding Learning Using Row-Stochastic DEDICOM - Machine Learning and Knowledge Extraction
Conference Papers, Year: 2020

Abstract

The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices. We apply a new row-stochastic variation of DEDICOM to the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings. We introduce a method to efficiently train a constrained DEDICOM algorithm and provide a qualitative evaluation of its topic modeling and word embedding performance.
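
As a rough illustration of the idea in the abstract, the sketch below factorizes a toy pointwise mutual information matrix S as S ≈ A R Aᵀ, the standard DEDICOM form, while keeping every row of A on the probability simplex. Several choices here are assumptions made for the example and are not taken from the paper: the row-wise softmax parameterization of A, clipping PMI at zero, the backtracking gradient descent loop, the toy co-occurrence counts, and all function names (`pmi_matrix`, `row_softmax`, `loss_and_grads`, `fit`).

```python
import numpy as np


def pmi_matrix(counts):
    """Pointwise mutual information from a square co-occurrence count matrix.

    Assumption for this sketch: negative and undefined entries are set to 0
    (often called positive PMI).
    """
    p_ij = counts / counts.sum()
    p_i = p_ij.sum(axis=1, keepdims=True)
    p_j = p_ij.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ij / (p_i * p_j))
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)


def row_softmax(x):
    """Map each row of x to a probability vector (non-negative, sums to 1)."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def loss_and_grads(S, A_raw, R):
    """Squared Frobenius loss of S - A R A^T and its gradients."""
    A = row_softmax(A_raw)
    E = A @ R @ A.T - S
    loss = float((E ** 2).sum())
    G = 2.0 * E
    grad_A = G @ A @ R.T + G.T @ A @ R
    grad_R = A.T @ G @ A
    # Backpropagate through the row-wise softmax.
    grad_A_raw = A * (grad_A - (grad_A * A).sum(axis=1, keepdims=True))
    return loss, grad_A_raw, grad_R


def fit(S, k, steps=500, lr=1e-2, seed=0):
    """Gradient descent with step halving; the loss never increases."""
    rng = np.random.default_rng(seed)
    A_raw = rng.normal(scale=0.1, size=(S.shape[0], k))
    R = rng.normal(scale=0.1, size=(k, k))
    loss, g_A, g_R = loss_and_grads(S, A_raw, R)
    history = [loss]
    for _ in range(steps):
        cand_A, cand_R = A_raw - lr * g_A, R - lr * g_R
        cand_loss, cand_g_A, cand_g_R = loss_and_grads(S, cand_A, cand_R)
        if cand_loss < loss:
            A_raw, R, loss = cand_A, cand_R, cand_loss
            g_A, g_R = cand_g_A, cand_g_R
        else:
            lr *= 0.5  # reject the step and retry with a smaller one
        history.append(loss)
    return row_softmax(A_raw), R, history


# Toy symmetric co-occurrence counts: two groups of three "words" each.
counts = np.ones((6, 6))
counts[:3, :3] += 9.0
counts[3:, 3:] += 9.0
np.fill_diagonal(counts, 0.0)

S = pmi_matrix(counts)
A, R, history = fit(S, k=2)
```

Under these assumptions, each row of `A` can be read as a word's distribution over the `k` latent topics and also serves as its embedding, while `R` holds the topic-to-topic affinities.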
Main file: 497121_1_En_22_Chapter.pdf (3.5 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03414746, version 1 (04-11-2021)

Cite

Lars Hillebrand, David Biesner, Christian Bauckhage, Rafet Sifa. Interpretable Topic Extraction and Word Embedding Learning Using Row-Stochastic DEDICOM. 4th International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), Aug 2020, Dublin, Ireland. pp.401-422, ⟨10.1007/978-3-030-57321-8_22⟩. ⟨hal-03414746⟩