SynthCap: Augmenting Transformers with Synthetic Data for Image Captioning

Conference Paper
Publication Date:
2023
Short description:
SynthCap: Augmenting Transformers with Synthetic Data for Image Captioning / Caffagni, Davide; Barraco, Manuele; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita. - 14233:(2023), pp. 112-123. (22nd International Conference on Image Analysis and Processing, ICIAP 2023, Udine, Italy, September 11-15, 2023) [10.1007/978-3-031-43148-7_10].
Abstract:
Image captioning is a challenging task that combines Computer Vision and Natural Language Processing to generate descriptive and accurate textual descriptions for input images. Research efforts in this field mainly focus on developing novel architectural components to extend image captioning models and using large-scale image-text datasets crawled from the web to boost final performance. In this work, we explore an alternative to web-crawled data and augment the training dataset with synthetic images generated by a latent diffusion model. In particular, we propose a simple yet effective synthetic data augmentation framework that is capable of significantly improving the quality of captions generated by a standard Transformer-based model, leading to competitive results on the COCO dataset.
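The augmentation idea in the abstract can be illustrated with a minimal sketch. It assumes that each synthetic image is generated from an existing training caption and paired with that caption; the abstract does not spell out this detail. The names `Sample`, `augment_with_synthetic`, and `fake_generator` are hypothetical, and the generator is a stand-in for a latent diffusion model, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Sample:
    image: object      # placeholder for image data (path, tensor, ...)
    caption: str
    synthetic: bool    # True if the image came from the generator


def augment_with_synthetic(
    real: List[Sample],
    generate_image: Callable[[str], object],
    per_caption: int = 1,
) -> List[Sample]:
    """Return the real samples plus synthetic (image, caption) pairs.

    `generate_image` is assumed to be a text-to-image function, for
    example a wrapper around a latent diffusion model.
    """
    augmented = list(real)
    for sample in real:
        for _ in range(per_caption):
            image = generate_image(sample.caption)
            augmented.append(Sample(image, sample.caption, True))
    return augmented


def fake_generator(prompt: str) -> str:
    # Hypothetical stand-in so the sketch runs without a diffusion model.
    return f"<synthetic image for: {prompt}>"


real_data = [
    Sample("img_001.jpg", "a dog running on a beach", False),
    Sample("img_002.jpg", "two people riding bicycles", False),
]
train_set = augment_with_synthetic(real_data, fake_generator)
```

The generator is passed in as a plain callable, so the same function could be used with a real text-to-image model in place of the stand-in.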
Iris type:
Conference proceedings paper
Keywords:
Image Captioning; Synthetic Data; Vision-and-Language
List of contributors:
Caffagni, Davide; Barraco, Manuele; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita
Authors of the University:
BARALDI LORENZO
CAFFAGNI DAVIDE
CORNIA MARCELLA
CUCCHIARA RITA
Handle:
https://iris.unimore.it/handle/11380/1309206
Full Text:
https://iris.unimore.it//retrieve/handle/11380/1309206/576848/2023-iciap-captioning.pdf
Book title:
Proceedings of the 22nd International Conference on Image Analysis and Processing
Published in:
LECTURE NOTES IN COMPUTER SCIENCE