Captioning: Difference between revisions

From Civitai Wiki
No edit summary
(Expanded the description and added links)
 

Latest revision as of 00:32, 25 June 2024

Captioning is the process of describing input training images so that Stable Diffusion can learn what each image contains.

Captioning can be done by hand or with one of several caption creation tools.

There are two major methods of captioning: tagging-based and natural language.

Tagging: A sequence of one- or two-word tags describing the various elements of the image. This method is especially popular for models trained on data from image boards, where the content is already tagged when uploaded.

Natural Language: Describing the image in simple sentences, either written manually or generated by a captioning model such as BLIP.
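
To make the two styles concrete, here is a minimal Python sketch that builds a tag-style caption and a natural language caption for the same image. Everything in it is an assumption made for illustration: the helper names `tag_caption` and `write_caption`, the example tags, the file name `example_001.png`, the choice to lowercase and de-duplicate tags, and the convention of saving the caption in a `.txt` file with the same base name as the image. Check what your training tool expects before relying on any of these.

```python
import tempfile
from pathlib import Path


def tag_caption(tags):
    """Join short tags into one comma-separated caption.

    Tags are trimmed and lowercased, and blanks and repeats are dropped
    (an illustrative cleanup choice, not a requirement of any trainer).
    """
    cleaned_tags = []
    for tag in tags:
        cleaned = tag.strip().lower()
        if cleaned and cleaned not in cleaned_tags:
            cleaned_tags.append(cleaned)
    return ", ".join(cleaned_tags)


def write_caption(image_path, caption):
    """Write the caption to a .txt file next to the image.

    The sidecar-file layout (image.png + image.txt) is an assumption here.
    """
    caption_path = Path(image_path).with_suffix(".txt")
    caption_path.write_text(caption + "\n", encoding="utf-8")
    return caption_path


# Tag-style caption: short tags, one per visual element (hypothetical tags).
tagged = tag_caption(["cat", "Sitting", "window sill", "cat", " sunlight "])

# Natural language caption for the same hypothetical image.
natural = "A cat sitting on a window sill in the sunlight."

# Save the tag caption beside a hypothetical image path in a temp folder.
with tempfile.TemporaryDirectory() as folder:
    image_path = Path(folder) / "example_001.png"
    caption_path = write_caption(image_path, tagged)
    saved = caption_path.read_text(encoding="utf-8")
```

Here `tagged` lists each visual element as its own short tag, while `natural` describes the same scene as one sentence. Either string can be passed to `write_caption`.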