Clip Skip
Latest revision as of 07:19, 11 October 2023
The Text Encoder uses a model called "CLIP". In Stable Diffusion 1.x, its text transformer is made up of 12 layers. Clip Skip specifies which layer, counted from the end, supplies the output: Clip Skip 1 uses the last layer, and Clip Skip 2 sends the penultimate layer's output vector to the Attention block. Unless the base model you're training against was trained (or mixed) with Clip Skip 2, you can use 1. SDXL does not benefit from Clip Skip 2.
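
The layer selection described above can be sketched in a few lines of Python. This is a toy illustration, not the code of any particular trainer or UI: `run_layers` and `apply_clip_skip` are hypothetical helper names, and the 12 "layers" are stand-in functions rather than real CLIP transformer layers. Real implementations may also apply further processing (such as a final normalization) to the selected output, which is omitted here.

```python
from typing import Callable, List, Sequence


def run_layers(x: int, layers: Sequence[Callable[[int], int]]) -> List[int]:
    """Apply each layer in order and record the output after every layer."""
    outputs = []
    for layer in layers:
        x = layer(x)
        outputs.append(x)
    return outputs


def apply_clip_skip(layer_outputs: Sequence[int], clip_skip: int = 1) -> int:
    """Return the output of the layer that is `clip_skip`-th from the end.

    clip_skip=1 selects the last layer, clip_skip=2 the penultimate one.
    """
    if not 1 <= clip_skip <= len(layer_outputs):
        raise ValueError("clip_skip must be between 1 and the number of layers")
    return layer_outputs[-clip_skip]


# Toy stand-in for a 12-layer text encoder (assumption: each "layer" just adds 1,
# so the output value equals the index of the layer that produced it).
toy_layers = [lambda v: v + 1 for _ in range(12)]
outputs = run_layers(0, toy_layers)

last_layer_output = apply_clip_skip(outputs, clip_skip=1)    # from layer 12
penultimate_output = apply_clip_skip(outputs, clip_skip=2)   # from layer 11
```

In this sketch, Clip Skip 1 returns what layer 12 produced and Clip Skip 2 returns what layer 11 produced, which mirrors the "Xth from the end" counting.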