A General Approach to Extracting Full Names and Abbreviations for Chinese Entities from the Web - Intelligent Information Processing V
Conference Papers, Year: 2010

Abstract

Identifying full names and abbreviations for entities is a challenging problem in many applications, e.g. question answering and information retrieval. In this paper, we propose a general method for extracting full names and abbreviations from Chinese Web corpora. For a given entity, we construct forward and backward query items, submit them to a search engine (e.g. Google), and use the search results to extract full names and abbreviations for the entity. To verify the results, filtering and marking methods are used to sort all the results. Experiments show that our method achieves a precision of 84.7% for abbreviations and 77.0% for full names.
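The query-then-extract idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the cue words in `ABBR_CUES`, the function names `build_queries` and `extract_abbreviations`, the subsequence filter, and the frequency-based ranking. The sketch works on snippet strings supplied by the caller and does not contact any search engine.

```python
import re
from collections import Counter

# Hypothetical cue words ("is abbreviated as"); the abstract does not
# list the patterns actually used by the authors.
ABBR_CUES = ["简称", "缩写为"]


def build_queries(entity, cues=ABBR_CUES):
    """Return (forward, backward) lists of quoted query strings."""
    forward = [f'"{entity}{cue}"' for cue in cues]   # entity, then cue
    backward = [f'"{cue}{entity}"' for cue in cues]  # cue, then entity
    return forward, backward


def is_subsequence(short, full):
    """True if the characters of `short` appear in `full` in order."""
    remaining = iter(full)
    return all(ch in remaining for ch in short)


def extract_abbreviations(entity, snippets, cues=ABBR_CUES):
    """Count abbreviation candidates that follow 'entity + cue'.

    Assumed filter: a candidate is the longest prefix (2+ characters) of
    the matched run that is shorter than the entity and a subsequence of it.
    """
    counts = Counter()
    for cue in cues:
        # Capture the run of CJK characters right after the cue word.
        pattern = re.compile(
            re.escape(entity) + re.escape(cue) + r"([\u4e00-\u9fff]+)"
        )
        for snippet in snippets:
            for run in pattern.findall(snippet):
                longest = min(len(run), len(entity) - 1)
                for end in range(longest, 1, -1):
                    if is_subsequence(run[:end], entity):
                        counts[run[:end]] += 1
                        break
    # Most frequent candidates first, as a stand-in for the paper's sorting.
    return counts.most_common()
```

In use, the strings from `build_queries` would be sent to a search engine and the returned snippets passed to `extract_abbreviations`. The filtering and marking steps described in the abstract are likely more elaborate than the simple subsequence check and frequency count used here.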
Main file
JiangCYLW10.pdf (249.91 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01060363, version 1 (21-11-2017)

Cite

Guang Jiang, Cao Cungen, Sui Yuefei, Han Lu, Shi Wang. A General Approach to Extracting Full Names and Abbreviations for Chinese Entities from the Web. 6th IFIP TC 12 International Conference on Intelligent Information Processing (IIP), Oct 2010, Manchester, United Kingdom. pp.271-280, ⟨10.1007/978-3-642-16327-2_33⟩. ⟨hal-01060363⟩