Captioning

From Civitai Wiki
Revision as of 00:32, 25 June 2024 by Aishavingfun (talk | contribs) (Expanded the description and added links)

Captioning is the process of describing training images in text, so that Stable Diffusion can associate the words in each caption with what appears in the image.

Captioning can be done by hand or with one of a number of caption creation tools.

There are two major methods of captioning: tag-based and natural language.

Tagging: A sequence of one- or two-word tags describing the various elements of the image. Especially popular with models trained on data from image boards, where the content is already tagged when uploaded.
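As a rough illustration of a tag-based caption, the sketch below joins a list of tags into one comma-separated line. The function name `tags_to_caption` is hypothetical, and the cleanup rules (underscores to spaces, lowercasing, dropping duplicates) are assumptions about a typical workflow rather than a requirement of any particular trainer.

```python
def tags_to_caption(tags):
    """Join tags into a single comma-separated caption line."""
    seen = set()
    cleaned = []
    for tag in tags:
        # Assumption: image-board tags often use underscores instead of spaces.
        tag = tag.strip().replace("_", " ").lower()
        # Skip empty entries and keep only the first occurrence of each tag.
        if tag and tag not in seen:
            seen.add(tag)
            cleaned.append(tag)
    return ", ".join(cleaned)


print(tags_to_caption(["red_dress", "long_hair", "Outdoors", "long_hair"]))
# prints: red dress, long hair, outdoors
```

Whether tag order, casing, or separators matter depends on the training tool in use, so the output format here should be checked against that tool's documentation.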

Natural Language: Describing the image in simple sentences, written either by hand or by an image captioning model such as BLIP.
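Either kind of caption usually ends up stored next to its image. The sketch below assumes the common convention of one `.txt` file per image with the same base name; that layout, the helper names `write_captions` and `make_blip_captioner`, and the choice of the `Salesforce/blip-image-captioning-base` checkpoint are all illustrative assumptions. The BLIP helper relies on the optional `transformers`, `torch`, and `Pillow` packages and is only one possible way to generate natural language captions automatically.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def write_captions(image_dir, captioner, overwrite=False):
    """Write one .txt caption per image and return the files written."""
    written = []
    for image_path in sorted(Path(image_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption_path = image_path.with_suffix(".txt")
        if caption_path.exists() and not overwrite:
            continue  # leave existing (for example hand-written) captions alone
        caption = captioner(image_path).strip()
        caption_path.write_text(caption + "\n", encoding="utf-8")
        written.append(caption_path)
    return written


def make_blip_captioner(model_name="Salesforce/blip-image-captioning-base"):
    """Build a captioner backed by BLIP (assumes transformers, torch, Pillow)."""
    # Imported lazily so write_captions works without these packages.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    def caption(image_path):
        image = Image.open(image_path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=40)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption
```

A call such as `write_captions("dataset/", make_blip_captioner())` would caption every uncaptioned image in a hypothetical `dataset/` folder. Passing any other function that maps an image path to a string, including one that returns hand-prepared tags, works the same way.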