Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (BLIP), created by Salesforce, is a simple, dependable image-to-text model. It is used to generate captions for images in training datasets.
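As a rough illustration of captioning images for training, the sketch below writes one caption text file next to each image. It rests on several assumptions that do not come from this page: that BLIP is loaded through the Hugging Face transformers library (BlipProcessor and BlipForConditionalGeneration), that the checkpoint is Salesforce/blip-image-captioning-base, and that the training tool reads a .txt file sharing the image's base name. The function names make_blip_captioner and write_caption_files are hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def make_blip_captioner(
    model_name: str = "Salesforce/blip-image-captioning-base",  # assumed checkpoint
) -> Callable[[Path], str]:
    """Build a caption function backed by BLIP (weights download on first use)."""
    # Imported lazily so the file-writing helper below works without these packages.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    def caption(image_path: Path) -> str:
        # BLIP expects RGB input; convert in case the file is grayscale or RGBA.
        image = Image.open(image_path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def write_caption_files(
    image_paths: Iterable[Path], caption: Callable[[Path], str]
) -> List[Path]:
    """Write a .txt caption beside each image and return the paths written."""
    written = []
    for image_path in image_paths:
        image_path = Path(image_path)
        # Assumed convention: cat.jpg gets its caption in cat.txt.
        text_path = image_path.with_suffix(".txt")
        text_path.write_text(caption(image_path).strip() + "\n", encoding="utf-8")
        written.append(text_path)
    return written
```

A call might look like write_caption_files(sorted(Path("dataset").glob("*.jpg")), make_blip_captioner()), where "dataset" is a placeholder folder name. Passing the caption function in as an argument keeps the file handling independent of the model, so a different captioner can be swapped in without changing the rest.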
