"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

October 11, 2023

Vision Latest Notes - Text-to-Image Generation

Alignment in Text-to-Image Generation

Key components

  • Controllable Generation
  • Editing
  • Better prompt following
  • Customization

Techniques

  • GAN
  • Auto-Regressive (AR)
  • Diffusion
  • Non-AR Transformer

One Liners

  • GAN - Learn to fake it until it becomes real
  • AR - Split the image into patches, map each patch to an index (token), then predict the tokens one by one
  • Non-AR Transformer - A schedule policy decides which tokens to generate at each step
  • Diffusion - Add random noise at each step, then subtract the noise step by step to end up with the required semantic quality
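
The AR one-liner can be made concrete with a toy sketch. This is only an illustration under stated assumptions: the 4x4 "image", the two-entry codebook, and the helper names (`image_to_patches`, `nearest_index`, `tokenize`) are all made up here, and a real model learns its codebook rather than hard-coding it. It shows only the "image to patches, patches to indexes" part; the one-by-one prediction of those tokens is left out.

```python
# Toy sketch: image -> patches -> token indexes.
# The codebook and all function names are hypothetical, for illustration only.

def image_to_patches(image, patch):
    """Split a 2D list (H x W) into flattened patch x patch blocks, row-major."""
    h, w = len(image), len(image[0])
    patches = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            block = [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            patches.append(block)
    return patches

def nearest_index(vec, codebook):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, entry)) for entry in codebook]
    return dists.index(min(dists))

def tokenize(image, codebook, patch=2):
    """Map every patch of the image to the index of its nearest codebook entry."""
    return [nearest_index(p, codebook) for p in image_to_patches(image, patch)]

image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [9, 9, 0, 0],
    [9, 9, 0, 0],
]
codebook = [[0, 0, 0, 0], [9, 9, 9, 9]]  # made-up two-entry codebook
tokens = tokenize(image, codebook)
print(tokens)  # [0, 1, 1, 0]
```

An AR model would then be trained to predict this token sequence left to right, one token at a time.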

Diffusion Overview
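
As a rough companion to the diffusion one-liner, here is a minimal sketch of the "add noise, then subtract noise" idea, assuming the closed-form noising step commonly used in DDPM-style models: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The linear beta schedule values and the helper names (`alpha_bars`, `add_noise`, `remove_noise`) are assumptions for illustration. A trained network would predict the noise; here the true noise is passed back in as a stand-in, so this shows only the arithmetic, not a working generator.

```python
import math
import random

def alpha_bars(betas):
    """Cumulative products of (1 - beta_t): how much signal survives up to step t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def add_noise(x0, abar_t, rng):
    """Forward step in closed form: mix the clean signal with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(abar_t) * x + math.sqrt(1.0 - abar_t) * e for x, e in zip(x0, eps)]
    return xt, eps

def remove_noise(xt, eps_hat, abar_t):
    """Invert the forward step given a noise estimate (here: the true noise)."""
    return [(x - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
            for x, e in zip(xt, eps_hat)]

rng = random.Random(0)
# Assumed linear schedule from 1e-4 to 0.02 over 1000 steps.
betas = [0.0001 + (0.02 - 0.0001) * t / 999 for t in range(1000)]
abars = alpha_bars(betas)

x0 = [0.5, -1.0, 0.25, 0.0]          # stand-in for image pixels
xt, eps = add_noise(x0, abars[499], rng)
x0_hat = remove_noise(xt, eps, abars[499])  # a model would supply eps_hat instead
```

With the true noise supplied, `x0_hat` matches `x0` up to floating-point error; in practice the quality of the result depends on how well the network estimates the noise at each step.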




GLIDE

DALL-E-2



Image Super-Resolution via Iterative Refinement

Keep Exploring!!!
