Elimination Method Study of Ambiguous Words in Chinese Automatic Indexing
Abstract
To retrieve information accurately from the huge volumes available in
a network environment, the first requirement is that the indexing
words contain no ambiguous words. The basic unit of Chinese is the
character: characters form words, and words are either monosyllabic
or compound. There are no spaces between Chinese keywords, and many
concepts are ambiguous, so the indexing process produces a great deal
of ambiguity. As a result, retrieval returns irrelevant or
misidentified information.
This paper focuses on eliminating crossing (overlapping) ambiguities
among words in automatic indexing. It proposes a method for
eliminating ambiguous words that combines an exhaustive algorithm
with disambiguation rules. Experiments show that the method avoids a
large number of segmentation ambiguities and yields better
segmentation results.
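
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of pairing exhaustive enumeration with a disambiguation rule. The lexicon, the function names, and the rule itself (fewest words, then fewest single-character words) are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the lexicon and the tie-breaking rule are
# hypothetical and are not the rules used in the paper.

def all_segmentations(text, lexicon):
    """Exhaustively enumerate every split of text into lexicon words."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in all_segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


def pick_segmentation(candidates):
    """Hypothetical rule: fewest words, then fewest single-character words."""
    if not candidates:
        return None

    def score(seg):
        single_chars = sum(1 for w in seg if len(w) == 1)
        return (len(seg), single_chars)

    return min(candidates, key=score)


# Toy lexicon (assumed) that creates a crossing ambiguity:
# "研究生" (graduate student) overlaps with "生命" (life).
LEXICON = {"研究", "研究生", "生命", "命", "的", "起源"}

candidates = all_segmentations("研究生命的起源", LEXICON)
best = pick_segmentation(candidates)
print(candidates)
print(best)
```

With this toy lexicon the enumeration yields two candidate segmentations that share the character "生", which is the kind of crossing ambiguity the abstract refers to. Exhaustive enumeration grows quickly with sentence length, so a practical system would likely restrict it to the ambiguous span.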
Origin: Files produced by the author(s)