An Efficient Microblog Hot Topic Detection Algorithm Based on Two Stage Clustering
Abstract
Microblog posts are short, structurally complex, and prone to word deformation. In this paper, a two-stage clustering algorithm based on probabilistic latent semantic analysis (pLSA) and K-means clustering (K-means) is proposed. The paper also presents a definition of topic popularity and a mechanism for ranking topics. Experiments show that the method clusters topics effectively and can be applied to microblog hot topic detection.
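The abstract does not spell out how the two stages are connected, so the following is only a minimal sketch of one plausible reading: pLSA first maps each post to a topic distribution P(z|d), and K-means then clusters posts in that low-dimensional topic space. The function names (`plsa`, `kmeans`, `two_stage_cluster`), the toy count matrix, and all parameter values are assumptions made for illustration and are not taken from the paper. The popularity definition and topic ranking are not sketched because the abstract gives no details about them.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA with EM on a (docs x words) count matrix; return P(z|d)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        post = p_z_d[:, :, None] * p_w_z[None, :, :]   # shape (d, z, w)
        post /= post.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from expected counts
        weighted = counts[:, None, :] * post           # shape (d, z, w)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means; return one cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def two_stage_cluster(counts, n_topics, n_clusters):
    """Stage 1: pLSA topic distributions. Stage 2: K-means on those vectors."""
    return kmeans(plsa(counts, n_topics), n_clusters)

# Hypothetical toy data: 4 posts over a 4-word vocabulary
toy_counts = np.array([[3, 2, 0, 0],
                       [2, 3, 0, 0],
                       [0, 0, 3, 2],
                       [0, 0, 2, 3]], dtype=float)
print(two_stage_cluster(toy_counts, n_topics=2, n_clusters=2))
```

Clustering in topic space rather than on raw word counts is one way to cope with short, sparse posts, since two posts can share a topic without sharing many words.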
Origin | Files produced by the author(s) |
---|