Generative Image-to-Text Transformer

From Civitai Wiki
Revision as of 13:59, 2 February 2024 by MajMorse (talk | contribs) (added external link disclaimer)

Generative Image-to-Text Transformer (GIT), first described in the paper linked under External Links, was trained on 20 million image-text pairs and further fine-tuned on TextCaps. It is a robust image-to-text model that generates text captions from images.


External Links

Please note that the content of external links is not endorsed or verified by us and can change without notice. Use at your own risk.

https://arxiv.org/abs/2205.14100