Image Caption Combined with GAN Training Method

Zeqin Huang; Zhongzhi Shi

doi:10.1007/978-3-030-46931-3_29

Conference Papers Year : 2020

Zeqin Huang

Function : Author
PersonId : 1118842

University of Chinese Academy of Sciences [Beijing]

Chinese Academy of Sciences [Beijing]

Zhongzhi Shi

Function : Author
PersonId : 1046597

Chinese Academy of Sciences [Beijing]

Abstract

With the number of images now so large that people cannot quickly retrieve the information they need, there is an urgent need for a simpler and more human-friendly way of understanding images, and image captioning has emerged in response. Image captioning, as its name suggests, analyzes and understands image content in order to generate a natural language description of a given image. In recent years it has been widely applied in cross-modal image-text studies, early childhood education, and assistance for disadvantaged groups; it has also attracted the interest of industry and produced many excellent research results. At present, the evaluation of image captions is based largely on objective metrics such as BLEU and CIDEr, which can easily keep generated captions from approaching human language expression. The idea behind GANs offers a new approach, adversarial training, for evaluating the generated captions, so that the evaluation module is more natural and comprehensive. Considering the requirements for image fidelity, this paper proposes a GAN-based image captioning method. An attention mechanism is introduced to improve image fidelity, which makes the generated captions more accurate and closer to human language expression.

Keywords

GAN; Deep learning; Attention mechanism; Image caption; LSTM

Domains

Computer Science [cs]

Main file

498234_1_En_29_Chapter.pdf (377.32 KB)

Origin : Files produced by the author(s)

https://inria.hal.science/hal-03456989

Submitted on : Tuesday, November 30, 2021, 12:34:25 PM

Last modification on : Tuesday, November 12, 2024, 11:34:03 AM

Long-term archiving on : Tuesday, March 1, 2022, 7:00:29 PM

Dates and versions

hal-03456989 , version 1 (30-11-2021)

Licence

Attribution

Identifiers

HAL Id : hal-03456989 , version 1
DOI : 10.1007/978-3-030-46931-3_29

Cite

Zeqin Huang, Zhongzhi Shi. Image Caption Combined with GAN Training Method. 11th International Conference on Intelligent Information Processing (IIP), Jul 2020, Hangzhou, China. pp.310-316, ⟨10.1007/978-3-030-46931-3_29⟩. ⟨hal-03456989⟩

Collections

IFIP IFIP-AICT IFIP-TC IFIP-TC12 IFIP-AICT-581
