Combining Statistical Information and Semantic Similarity for Short Text Feature Extension
Abstract
A short text feature extension method combining statistical information and semantic similarity is proposed,Firstly, After defining the contribution of word, mutual information, an associated word-pairs set is generated by comparing the value of mutual information with threshold, then it is taken as the query words set to search for HowNet. For each word-pairs, senses are found in knowledge base HowNet, and semantic similarity of query word-pairs are calculated. Common sememe satisfied condition is added into the original term vector as extended feature, otherwise, semantic relationship is computed and the corresponding sememe is expanded into feature set. The above process is repeated, an extended feature set is finally obtained. Experimental results show the effectiveness of our method.
Origin | Files produced by the author(s) |
---|
Loading...