"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

July 05, 2020

Weekend Reads - Fashion Virtual Dress Creation - Papers

Paper #1 - M2E-Try On Net: Fashion from Model to Everyone
Key Notes
  • Pose alignment network (PAN)
  • Texture refinement network (TRN)
  • Fitting network (FTN)
Pose Alignment Network (PAN) aligns the model and clothes pose to the target pose. Each DensePose prediction partitions the body into 24 parts
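
A tiny sketch of how a 24-part DensePose-style prediction can be turned into per-part binary masks that a pose-alignment step can consume. The part-index map here is a random placeholder; the name and sizes are illustrative, not from the paper.

import numpy as np

# Illustrative DensePose-style part-index map: 0 = background, 1..24 = body parts.
part_index_map = np.random.randint(0, 25, size=(256, 192))
# One binary mask per body part, shape (24, H, W).
part_masks = np.stack([part_index_map == p for p in range(1, 25)])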

Texture Refinement Network (TRN) enriches the desired clothes with their textures and logo patterns
It works with the texture details, the texture region, and a binary mask of the same size; the images are merged while still preserving the texture details on the garments
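A minimal sketch of the mask-based merge described above, assuming a coarse try-on image, a refined texture image, and a binary texture-region mask of the same spatial size. The function name and tensor layout are illustrative, not taken from the paper.

import torch

def merge_textures(coarse, refined, region_mask):
    # region_mask: (N, 1, H, W) binary mask of the texture region, broadcast over channels.
    # Inside the mask the refined texture is kept; outside it the coarse result is kept,
    # so garment details such as logos and prints are preserved.
    return region_mask * refined + (1.0 - region_mask) * coarse
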
Fitting Network (FTN) merges the transferred garments onto the target person image; it is a generative network that generates the final fashion image

Fitting Network is an encoder-decoder network, including three convolution layers as the encoder, six residual blocks for feature learning, followed by two deconvolution layers and one convolution layer as the decoder
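
The layer counts above are enough to sketch the layout in PyTorch; the channel widths, kernel sizes, and normalization choices below are assumptions of the sketch, not values from the paper.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)

class FittingNetwork(nn.Module):
    def __init__(self, in_ch=3, out_ch=3):
        super().__init__()
        self.encoder = nn.Sequential(                       # three convolution layers
            nn.Conv2d(in_ch, 64, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.blocks = nn.Sequential(*[ResidualBlock(256) for _ in range(6)])  # six residual blocks
        self.decoder = nn.Sequential(                       # two deconv layers + one conv layer
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, out_ch, 7, padding=3), nn.Tanh(),
        )

    def forward(self, x):
        return self.decoder(self.blocks(self.encoder(x)))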


Code (VITON-GAN) - https://github.com/shionhonda/viton-gan

Paper #2 - VITON-GAN: Virtual Try-on Image Generator Trained with Adversarial Loss
Key Notes
  • U-net generator and thin plate spline (TPS) warping (see the warping sketch after this list)
  • Human parser
  • Pose estimator
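The TPS warping step can be sketched with SciPy's thin-plate-spline interpolator: fit a mapping from target control points (on the person) back to source control points (on the in-shop cloth), evaluate it on the target pixel grid, and resample the cloth. The control points below are hand-picked toys and the whole setup is an illustrative assumption; real pipelines regress the correspondences with a CNN.

import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.ndimage import map_coordinates

def tps_warp(cloth, src_pts, dst_pts):
    """Warp `cloth` (H, W, C) so that src_pts (on the cloth) land on dst_pts (on the person)."""
    h, w = cloth.shape[:2]
    # Backward warping: map each target pixel to a source location to avoid holes.
    tps = RBFInterpolator(dst_pts, src_pts, kernel='thin_plate_spline')
    yy, xx = np.mgrid[0:h, 0:w]
    grid = np.stack([yy.ravel(), xx.ravel()], axis=1).astype(float)
    src_coords = tps(grid)                    # (H*W, 2) source (row, col) per target pixel
    coords = src_coords.T.reshape(2, h, w)    # layout expected by map_coordinates
    warped = np.stack(
        [map_coordinates(cloth[..., c], coords, order=1, mode='nearest') for c in range(cloth.shape[2])],
        axis=-1,
    )
    return warped

# Toy usage with four (row, col) correspondences; real models predict many more.
cloth = np.random.rand(256, 192, 3)
src = np.array([[20, 30], [20, 160], [230, 30], [230, 160]], dtype=float)
dst = np.array([[40, 50], [35, 150], [220, 60], [225, 140]], dtype=float)
warped_cloth = tps_warp(cloth, src, dst)
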
GANs can generate fine, high-resolution, and realistic images because the adversarial loss captures perceptual qualities that are difficult to define mathematically
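
A hedged sketch of that adversarial loss idea: the discriminator is trained to separate real from generated images, and the generator is trained to fool it. `D` and the image tensors stand for any generator/discriminator pair returning logits; nothing here is specific to VITON-GAN.

import torch
import torch.nn.functional as F

def discriminator_loss(D, real, fake):
    real_score = D(real)
    fake_score = D(fake.detach())           # detach so only D is updated by this loss
    loss_real = F.binary_cross_entropy_with_logits(real_score, torch.ones_like(real_score))
    loss_fake = F.binary_cross_entropy_with_logits(fake_score, torch.zeros_like(fake_score))
    return loss_real + loss_fake

def generator_loss(D, fake):
    fake_score = D(fake)
    # The generator is rewarded when D mistakes its output for a real image.
    return F.binary_cross_entropy_with_logits(fake_score, torch.ones_like(fake_score))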

Try-on module (TOM)
  • Trained adversarially against a discriminator that takes the TOM result image and the person representation as inputs and judges whether the result is real or fake (see the conditioning sketch below)
  • With adversarial training, VITON-GAN generates more plausible hands and arms
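A minimal sketch, under assumed channel counts and layer choices, of how such a discriminator can condition on the person representation: the representation and the (real or generated) try-on image are concatenated channel-wise before being scored.

import torch
import torch.nn as nn

class TOMDiscriminator(nn.Module):
    def __init__(self, person_ch=22, img_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(person_ch + img_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 1, 4, padding=1),   # PatchGAN-style map of real/fake logits
        )

    def forward(self, person_repr, try_on_image):
        return self.net(torch.cat([person_repr, try_on_image], dim=1))
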
Paper #3 - The Conditional Analogy GAN: Swapping Fashion Articles on People Images

Key Notes
  • GAN - a model G that learns a data distribution from example data, and a discriminator D that attempts to distinguish generated from training data
  • These models learn loss functions that adapt to the data, which makes them well suited to image-to-image translation tasks
  • A conditional GAN (cGAN) learns to generate images as a function of conditioning information from a dataset, instead of random noise from a prior as in a standard GAN
Training of the CAGAN model involves learning a generator G to generate plausible images which fool a discriminator D. The discriminator D needs to answer two questions:
  • does an image x look reasonable, i.e. indistinguishable from the training distribution of human images {xi}?
  • does the article y look well-painted on the human model image x?
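A hedged sketch of a discriminator loss that encodes both questions: matched (person, article) pairs are pushed toward "real", while generated swaps and, as an extra assumption of this sketch, mismatched real pairs are pushed toward "fake". `D` is any network scoring the channel-wise concatenation of the pair; all names are illustrative.

import torch
import torch.nn.functional as F

def cagan_discriminator_loss(D, x_real, y_worn, x_generated, y_target, y_other):
    # Question 1: realism -- real person images should score as real, generated ones as fake.
    # Question 2: association -- the article should actually appear painted on the person.
    pos = D(torch.cat([x_real, y_worn], dim=1))                       # person really wearing y_worn
    neg_gen = D(torch.cat([x_generated.detach(), y_target], dim=1))   # generated swap onto y_target
    neg_mis = D(torch.cat([x_real, y_other], dim=1))                  # real person, wrong article
    ones, zeros = torch.ones_like(pos), torch.zeros_like(pos)
    return (F.binary_cross_entropy_with_logits(pos, ones)
            + F.binary_cross_entropy_with_logits(neg_gen, zeros)
            + F.binary_cross_entropy_with_logits(neg_mis, zeros))
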
More Reads
Generative Adversarial Network-Based Virtual Try-On with Clothing Region

Keep Thinking!!!
