Extraction of Web Image Information: Semantic or Visual Cues?
Abstract
Text-based approaches to web image information retrieval have been exploited for many years; however, the noisy textual content of web pages makes the task challenging. Moreover, text-based systems that retrieve information from textual sources such as image file names, anchor texts, existing keywords and, of course, surrounding text often share an inability to assign all relevant text to an image while discarding the irrelevant text. This paper presents a novel method for indexing web images. The main concern of the proposed system is to overcome the obstacle of correctly assigning textual information to web images while disregarding text that is unrelated to them. The proposed system uses visual cues to cluster a web page into several regions, and this method is compared with the use of semantic information and with an implementation of k-means clustering. The evaluation reveals the advantages and disadvantages of the different clustering techniques and confirms the validity of the proposed method for web image indexing.
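To make the k-means baseline mentioned in the abstract more concrete, the following is a minimal sketch of how clustering could associate text blocks with images on a page. It is not the authors' implementation: the abstract does not specify the features or the initialisation used, so the use of 2-D block positions as features, the deterministic initialisation, the function names `kmeans` and `assign_text_to_images`, and the sample page layout are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on 2-D points; returns one cluster label per point."""
    # Assumption: deterministic initialisation from the first k points.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


def assign_text_to_images(blocks, k):
    """Map each image to the text blocks that fall in the same cluster.

    blocks: list of (kind, content, (x, y)), where kind is 'image' or 'text'.
    """
    labels = kmeans([pos for _, _, pos in blocks], k)
    index = {}
    for (kind, content, _), label in zip(blocks, labels):
        if kind == "image":
            index[content] = [
                other_content
                for (other_kind, other_content, _), other_label in zip(blocks, labels)
                if other_kind == "text" and other_label == label
            ]
    return index


# Hypothetical page layout: positions are block centres in pixels.
blocks = [
    ("image", "cat.jpg", (100, 100)),
    ("image", "car.jpg", (800, 900)),
    ("text", "A tabby cat sleeping", (110, 160)),
    ("text", "Red sports car", (790, 960)),
    ("text", "Photo of the week", (95, 60)),
]
index = assign_text_to_images(blocks, k=2)
```

In this sketch every text block in an image's cluster is indexed with that image, which illustrates both the appeal of the approach and its weakness: unrelated text that happens to sit near an image is assigned to it as well.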
Domains
Computer Science [cs]
Origin
Files produced by the author(s)