Generative Image-to-Text Transformer

From Civitai Wiki

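Generative Image-to-Text Transformer (GIT) was introduced in the paper "GIT: A Generative Image-to-text Transformer for Vision and Language" (arXiv:2205.14100). The model described here was trained on 20 million image-text pairs and then fine-tuned on TextCaps, an image captioning dataset. It is a robust image-to-text model.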
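
As an illustration of how a model of this kind can be used for captioning, here is a minimal Python sketch. It assumes the Hugging Face transformers library, PyTorch and Pillow are installed, and it assumes the checkpoint name microsoft/git-large-textcaps; neither the library nor the checkpoint is named on this page, so treat both as placeholders for whatever implementation you actually use. The function name caption_image and its parameters are hypothetical.

```python
# Sketch only: the checkpoint name and the use of Hugging Face `transformers`
# are assumptions; this page does not say which implementation to use.
DEFAULT_CHECKPOINT = "microsoft/git-large-textcaps"  # assumed checkpoint name


def caption_image(image_path, checkpoint=DEFAULT_CHECKPOINT, max_length=50):
    """Return a generated caption for the image stored at image_path."""
    # Imported lazily so this module loads even without the heavy dependencies.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # The processor handles image preprocessing and token decoding.
    processor = AutoProcessor.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # Convert to RGB so grayscale or RGBA files are handled uniformly.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Generate caption tokens conditioned on the image, then decode to text.
    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling caption_image("photo.jpg") would download the checkpoint on first use and return a single caption string. The max_length value of 50 is an arbitrary cap on caption length, not a recommended setting.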


External Links

Please note that the content of external links is not endorsed or verified by us and can change without notice. Use at your own risk.

https://arxiv.org/abs/2205.14100