"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 28, 2023

Stable Diffusion - Basics

Dataset - LAION-5B (5 billion text-image pairs) 

Dataset sources - web images from Pinterest and DeviantArt, e-commerce services like Shopify, cloud services like Amazon Web Services, YouTube thumbnails, and news sites.

CNN vs Diffusion

  • CNN – feature extraction, error calculation, weight updates
  • Diffusion – noise is added in the forward step and removed (denoising) in the reverse step
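
To make the forward/reverse idea concrete, here is a minimal sketch in plain Python. It assumes a DDPM-style linear noise schedule and the usual closed-form noising formula, neither of which is spelled out in these notes. The function names (`linear_beta_schedule`, `add_noise`, `remove_noise`) and the schedule values are illustrative choices, not part of Stable Diffusion's code. In a real model a trained network predicts the noise; the sketch reuses the true noise to show that a perfect prediction recovers the original input.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for all steps s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward step in closed form: jump straight from x0 to the noisy x_t."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + noise_scale * n for x, n in zip(x0, noise)]
    return xt, noise

def remove_noise(xt, t, alpha_bars, predicted_noise):
    """Reverse direction: estimate x0 from x_t and a guess of the noise."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [(x - noise_scale * n) / signal for x, n in zip(xt, predicted_noise)]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.5, -0.25, 1.0, 0.0]                  # a toy 4-pixel "image"
xt, noise = add_noise(x0, 500, alpha_bars, rng)
# Stand-in for the trained denoiser: pretend it predicted the noise exactly.
x0_estimate = remove_noise(xt, 500, alpha_bars, noise)
```

As the step index grows, `alpha_bars[t]` shrinks toward zero, so `x_t` carries less of the original image and more pure noise. That is the forward process the denoiser learns to undo.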

Key Steps in Implementation 

  • Method of learning to generate new stuff - Forward/reverse diffusion
  • Way to link text and images - text-image representation model (CLIP), with words represented as vectors
  • Way to compress images while retaining features - autoencoder, which imposes a bottleneck in the network that forces a compressed knowledge representation of the original input
  • Priors built into the algorithm (diffusion for images) - U-Net architecture plus ‘attention’
  • ControlNet - controls diffusion models by adding extra conditions, using a "locked" copy of the weights and a "trainable" copy
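
The "locked" and "trainable" copies in the last bullet can be sketched with a toy example. This is only an illustration under stated assumptions: `ScaleLayer` and `ControlledBlock` are hypothetical names, a per-element multiply stands in for a real network block, and the zero-initialized connectors are borrowed from the ControlNet design rather than taken from these notes. With the connectors at zero, the extra condition has no effect at first, so the block starts out behaving exactly like the original pretrained one.

```python
import copy

class ScaleLayer:
    """Toy stand-in for a network block: multiplies each value by a weight."""
    def __init__(self, weights):
        self.weights = list(weights)

    def __call__(self, values):
        return [w * v for w, v in zip(self.weights, values)]

class ControlledBlock:
    """Pairs a frozen block with a trainable copy that sees the condition."""
    def __init__(self, pretrained_block):
        size = len(pretrained_block.weights)
        self.locked = pretrained_block                    # frozen, never updated
        self.trainable = copy.deepcopy(pretrained_block)  # tuned on the condition
        self.zero_in = ScaleLayer([0.0] * size)           # zero-initialized connectors
        self.zero_out = ScaleLayer([0.0] * size)

    def __call__(self, x, condition):
        # Inject the condition into the trainable copy's input.
        injected = [a + b for a, b in zip(x, self.zero_in(condition))]
        control = self.zero_out(self.trainable(injected))
        # Add the control signal on top of the locked block's output.
        return [a + b for a, b in zip(self.locked(x), control)]

block = ControlledBlock(ScaleLayer([2.0, -1.0, 0.5]))
x = [1.0, 2.0, 3.0]
condition = [0.3, 0.3, 0.3]           # e.g. values from an edge map or pose map
before = block(x, condition)          # connectors are zero: same as locked(x)

# Pretend training has moved the connectors away from zero.
block.zero_in.weights = [1.0, 1.0, 1.0]
block.zero_out.weights = [0.1, 0.1, 0.1]
after = block(x, condition)           # the condition now shifts the output
```

Only the trainable copy and the connectors would be updated during training. The locked weights stay untouched, which preserves what the original model already learned.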
