Long-Text-to-Video-GAN - IFIP Open Digital Library
Conference Papers Year : 2022

Long-Text-to-Video-GAN

Ayman Talkani
  • Function : Author
  • PersonId : 1334370

Abstract

Several works have addressed video generation from short text [1–3], focusing mainly on the continuity of the generated frames, but very little attention has been paid to story visualization [4], which attempts to generate the dynamic scenes and characters described in detail in a multi-paragraph input text. We therefore propose a novel approach to this task that compiles these dynamic scenes into a longer video, while also aiming to improve on the scores of the current state-of-the-art models in story visualization and video generation. To do this, we use semantic disentangling connections [5] between our generators to maintain global consistency between consecutive images and to ensure similarity between the video re-description and the input text, which leads to higher image quality. Once these images are generated, we use a depth-aware video interpolation framework [6] to generate the missing intermediate frames of the video between the generated images. We then evaluate our model on the CLEVR-SV and Pororo-SV datasets for the story visualization task, and on the UCF-101 dataset to measure the accuracy of the generated video. In this way, we intend to significantly outperform existing state-of-the-art models.
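The final stage described in the abstract, turning a sequence of generated key images into a longer video by synthesizing the frames in between, can be illustrated with a minimal sketch. It is not the authors' implementation. The names `Frame`, `blend`, and `assemble_video` are hypothetical, frames are assumed to be small grayscale grids, and a plain linear cross-fade stands in for the depth-aware interpolation framework [6], which this sketch does not implement.

```python
from typing import List

# Hypothetical frame type: a grayscale image as rows of pixel intensities.
Frame = List[List[float]]


def blend(a: Frame, b: Frame, t: float) -> Frame:
    """Linear cross-fade between two equally sized frames.

    Placeholder only: the paper uses depth-aware interpolation [6],
    which is NOT what this function does.
    """
    return [
        [(1.0 - t) * pa + t * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def assemble_video(key_frames: List[Frame], n_between: int) -> List[Frame]:
    """Insert n_between synthesized frames between consecutive key frames."""
    if not key_frames:
        return []
    video = [key_frames[0]]
    for prev, nxt in zip(key_frames, key_frames[1:]):
        # Evenly spaced blend weights strictly between 0 and 1.
        for i in range(1, n_between + 1):
            video.append(blend(prev, nxt, i / (n_between + 1)))
        video.append(nxt)
    return video


# Three 1x1 key frames with one synthesized frame between each pair.
video = assemble_video([[[0.0]], [[1.0]], [[0.0]]], n_between=1)
```

With k key frames and n synthesized frames per gap, the assembled video has k + (k - 1) * n frames, so the three key frames above yield five frames in total.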
Embargoed file
Wednesday, January 1, 2025

Dates and versions

hal-04388163 , version 1 (11-01-2024)

Licence

Attribution

Cite

Ayman Talkani, Anand Bhojan. Long-Text-to-Video-GAN. 6th International Conference on Computer, Communication, and Signal Processing (ICCCSP), Feb 2022, Chennai, India. pp.90-97, ⟨10.1007/978-3-031-11633-9_8⟩. ⟨hal-04388163⟩