"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

February 28, 2022

Where is all the interesting DS work in Fintech, Auto, Oil and Gas, FMCG, Retail, Healthcare, and Insurance?

Almost all of the startups and their core work are centered around delivery, eCommerce, and fashion. Predominantly, I could spot research papers from Swiggy, Myntra, and Flipkart. They are consistent in applying and sharing their lessons with the broader community.

Facebook and Google have a ton of data; they are at another level in terms of innovation and adoption. Tesla is a pioneer above all, playing its own game.

The rest of the Fortune 100 / 500 and other companies in Fintech, Auto, Oil and Gas, FMCG, Retail, Healthcare, and Insurance: I hope they also come forward, share their work, and solve problems that are equally challenging and innovative. Is it an 80/20 split, where 80% of the core work is done by only 20% of the companies?

Very generic JDs: a ton of skills listed without calling out what they intend to do or what kind of future they would like to build.

Keep Thinking!!!

February 26, 2022

Data Scientist - 1 JD and 10 Different Skills

  • Statistics 
  • Machine learning
  • MLOps
  • Cloud
  • Deep learning
  • NLP
  • Chatbots
  • Vision
  • Optimization
  • Pick and choose your expertise
My Perspectives
  • Foundation / first-principles understanding of the basics
  • I keep forgetting things if I don't revise; the actual project work might involve data/vision specifics depending on your consulting work
  • Doing your day-to-day work vs preparing a broader understanding just for interviews is not worth the time; the gap between reality and the JD is way too far
  • Pick and choose areas where you love to experiment and gain more understanding
  • Full Stack Expert vs Full Stack Exposure - know yourself better :)

Keep Thinking!!!

February 21, 2022

Code Perspectives

Source - Link

Domain - Ideas - Algos - Prototype - Scale..

Keep Exploring!!!!


February 18, 2022

Hairstyle research paper reads

Paper #1 - MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

Code - Link

Key Notes

  • MichiGAN is capable of enabling multiple input conditions for disentangled hair manipulation.
  • Editing appearance, structure, and shape while keeping the background unchanged
  • Disentangle the information of hair into a quartet of attributes – shape, structure, appearance, and background, and design deliberate representations
  • Appearance is encoded through our mask-transformed feature extracting network
  • Background encoder is placed parallel to the generation branch, which keeps background intact
  • An explicit disentanglement of hair visual attributes, and a set of condition modules that implement the effective condition mechanism for each attribute with respect to its particular visual characteristics
  • An end-to-end conditional hair generation network that provides complete and orthogonal control over all attributes individually or jointly;
  • An interactive hair editing system that enables straightforward and flexible hair manipulation through intuitive user inputs



  • We represent the hair shape as the 2D binary mask of its occupied image region
  • Backbone generation network to bootstrap the generator with specific appearance styles instead of random noise
  • Force the GAN to reconstruct the same background content;

Loss Types - 

  • Feature matching loss. To achieve more robust training of GAN, we also adopt the discriminator feature matching loss
  • Perceptual loss. We also measure high-level feature loss with the pre-trained VGG19 model
  • Structural loss. We propose an additional structure loss to enforce the structural supervision
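
A rough PyTorch sketch of how these three terms could be wired up; the VGG19 layer cut-off, the L1 distances, and the orientation-map stand-in for the structural term are assumptions for illustration, not MichiGAN's exact implementation.

# Hedged sketch of the three loss terms (feature matching, perceptual, structural).
# Layer choices and weights are illustrative assumptions, not MichiGAN's exact setup.
import torch
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights  # torchvision >= 0.13

class PerceptualLoss(nn.Module):
    """High-level feature loss using a frozen, pre-trained VGG19."""
    def __init__(self, cutoff=21):                    # intermediate layer; assumed choice
        super().__init__()
        vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:cutoff].eval()
        for p in vgg.parameters():
            p.requires_grad_(False)
        self.vgg = vgg
        self.l1 = nn.L1Loss()

    def forward(self, fake, real):
        return self.l1(self.vgg(fake), self.vgg(real))

def feature_matching_loss(disc_feats_fake, disc_feats_real):
    """L1 distance between intermediate discriminator features (two lists of tensors)."""
    return sum(nn.functional.l1_loss(f, r.detach())
               for f, r in zip(disc_feats_fake, disc_feats_real))

def structural_loss(pred_structure, target_structure):
    """Placeholder structural term: supervise a dense structure (e.g. orientation) map."""
    return nn.functional.l1_loss(pred_structure, target_structure)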

Paper #2 - LOHO: Latent Optimization of Hairstyles via Orthogonalization

Code is available at Link

Notes

  • Our approach decomposes hair into three attributes: perceptual structure, appearance, and style, and includes tailored losses to model each of these attributes independently.
  • Optimizing StyleGANv2’s extended latent space and noise space
  • Novel approach to perform hairstyle transfer on in-the-wild portrait images, evaluated with the Frechet Inception Distance (FID) score. FID is used to evaluate generative models by calculating the distance between Inception [29] features for real and synthesized images in the same domain (see the FID sketch after these notes)


  • Pretrained VGG [28] to extract high-level features 
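
A minimal sketch of the FID computation mentioned above, assuming the Inception features for the real and synthesized images have already been extracted into two NumPy arrays.

# Minimal FID sketch: distance between Gaussian fits of Inception features.
# Assumes `real_feats` and `fake_feats` are (N, 2048) arrays of pre-extracted features.
import numpy as np
from scipy import linalg

def frechet_inception_distance(real_feats, fake_feats, eps=1e-6):
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)

    diff = mu_r - mu_f
    # Matrix square root of the covariance product; small eps stabilizes sqrtm.
    covmean, _ = linalg.sqrtm((cov_r + eps * np.eye(cov_r.shape[0])) @
                              (cov_f + eps * np.eye(cov_f.shape[0])), disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))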

Paper #3 - Applications of Generative Adversarial Networks in Hairstyle Transfer

Notes

  • InterFaceGan, StyleGan

Paper #4 - Learning to Generate and Edit Hairstyles

Notes

  • GAN model termed Hairstyle GAN (H-GAN)
  • Recognition, generation and modification of hairstyles, using a single model
  • VAEGAN [14] integrates the Variational Auto-Encoders (VAE) into GAN
  • InfoGAN [5] further models the noise variable z in Eq (1) by decomposing it into a latent representation y and incompressible noise z



Keep Exploring!!!

February 16, 2022

Developer Guidelines

Taking inspiration from this post and customizing it for developer guidelines:

  1. Prioritizing cool things, and how to decide what to do from a product/time perspective
  2. Trusting an approach without knowing how it works - don't do it
  3. Analyze and build your data-driven perspective - take time to think
  4. Principles to develop products and live your creative life
  5. What you should "demand" from your peers / PM team
  6. Dealing with top-down requests and focusing on effectiveness - planned work vs random to-do requests
  7. What matters when building a product vision
  8. Dealing with Impostor Syndrome and Hero Syndrome
  9. A framework to make your product 100x more impactful
  10. How to work better with peer engineers and designers
  11. Who to follow and what to read to learn more about better career goals
Keep Exploring!!!


February 13, 2022

Perspective of Interview

  • During the interview - puzzles, physics, permutations; sometimes I felt I needed to be a maths teacher
  • On the job - dirty data, no clear problem statement; we don't know the basics of schema / data engineering but talk about feature stores; no features yet, but already in the MLOps era
  • Working on #docker #python #api #kubeflow did not make me a #fullstack #developer, it was only #fullstack #exposure; to #master it needs more #time and #perspectives

Interview vs Reality vs Job - everyone looks through their own lens. Reality is far from what we think.

From LinkedIn post

Because in real life, it doesn't matter how well you crack hard algorithmic questions.

What matters is:

  • What projects you've built
  • What you've got to say
  • How you manage stakeholders
  • How you manage stress

These are the driving factors behind your success in the field.

More read - Link



Keep Thinking!!!

Fundamentals Revisited

Question #1 CNN - Why Convolution 2D?

2D convolution refers *not* to the dimension of the convolution kernel but to the dimension of the output. The output is 2D, with a single channel. A single 2D convolution pass over a 3D input volume (H × W × C) uses a 3D convolution kernel to obtain the 2D output.

Conv2D: Input --> One filter --> 2D output

Conv3D: Input --> One filter --> 3D output

Ref - Link
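
A quick way to see this in PyTorch: a single Conv2d filter over a 3-channel input slides a 3D kernel but produces one 2D feature map.

# Conv2D output dimensionality check: one filter over a 3-channel image -> one 2D map.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)                 # (batch, channels, H, W)
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
print(conv.weight.shape)                      # torch.Size([1, 3, 3, 3]) -> 3D kernel per filter
print(conv(x).shape)                          # torch.Size([1, 1, 30, 30]) -> single 2D output map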

Question #2 - Visualize Layers


The output across multiple layers. Link
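
One common way to grab the output across multiple layers is a forward hook. Below is a minimal sketch with a torchvision ResNet-18 (randomly initialized here); the chosen layers are arbitrary examples.

# Sketch: capture intermediate feature maps with forward hooks for visualization.
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()          # random weights; enough to inspect shapes
activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

model.layer1.register_forward_hook(save_activation("layer1"))
model.layer3.register_forward_hook(save_activation("layer3"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))

for name, feat in activations.items():
    print(name, feat.shape)                    # e.g. layer1: (1, 64, 56, 56)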

Question #3 - Different types of Data Augmentation

Ref - Link
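
A few of the usual augmentation types sketched with torchvision transforms; the specific transforms and parameters are arbitrary examples, not a recommendation from the reference.

# Common data augmentation types sketched with torchvision (parameters are arbitrary).
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),          # geometric: scale + crop
    transforms.RandomHorizontalFlip(p=0.5),     # geometric: flip
    transforms.RandomRotation(degrees=15),      # geometric: rotation
    transforms.ColorJitter(brightness=0.2,      # photometric: color perturbation
                           contrast=0.2,
                           saturation=0.2),
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.25),           # occlusion-style augmentation (tensor only)
])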

Keep Exploring!!!

February 10, 2022

Paper - Class-Balanced Loss Based on Effective Number of Samples

  • In general, there are two strategies: re-sampling and cost-sensitive re-weighting
  • In re-sampling, the number of examples is directly adjusted, e.g. by over-sampling the minority classes (or under-sampling the majority classes)
  • In cost-sensitive re-weighting, we influence the loss function by assigning relatively higher costs to examples from the minority classes; this can be applied to softmax cross-entropy, sigmoid cross-entropy and focal loss
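
A small sketch of the paper's re-weighting idea: each class is weighted by the inverse of its effective number of samples E_n = (1 - beta^n) / (1 - beta); the normalization convention and the cross-entropy wiring here are illustrative choices.

# Class-balanced weights from the effective number of samples E_n = (1 - beta^n) / (1 - beta).
import numpy as np
import torch
import torch.nn as nn

def class_balanced_weights(samples_per_class, beta=0.9999):
    samples_per_class = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = (1.0 - np.power(beta, samples_per_class)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # Normalize so the weights sum to the number of classes (a common convention).
    weights = weights / weights.sum() * len(samples_per_class)
    return torch.tensor(weights, dtype=torch.float32)

# Usage with softmax cross-entropy on a long-tailed class distribution.
weights = class_balanced_weights([5000, 500, 50, 5])
criterion = nn.CrossEntropyLoss(weight=weights)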

Code - Link

Keep Exploring!!!

Fashion Segmentation Paper Read

Paper - U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection

Key Notes

  • Capture more contextual information from different scales, thanks to the mixture of receptive fields of different sizes in the proposed ReSidual U-block (RSU)
  • Code link https://github.com/xuebinqin/U-2-Net
  • Segmenting the most visually attractive objects in an image
  • Deep features extracted by existing backbones, such as Alexnet [17], VGG [35], ResNet [12], ResNeXt [44], DenseNet [15]
  • Convolution with stride of two followed by a maxpooling with stride of two are utilized to reduce the size of the feature maps to one fourth
  • Go deeper while maintaining high resolution feature maps
  • ReSidual U-block (RSU), which is able to extract intra-stage multi-scale features 
  • Multi-scale feature extraction - A 3 × 3 filter is good for extracting local features at each layer
  • Convolution + Feature Extraction + Downsample + Upsample

  • Multi-scale feature extraction targets designing new modules that extract both local and global information from the features obtained by backbone networks

The ReSidual U-block (RSU) mainly consists of three components:

  • an input convolution layer, which transforms the input feature map
  • a U-Net like symmetric encoder-decoder structure which takes the intermediate feature map as input and learns to extract and encode the multi-scale contextual information
  • a residual connection which fuses local features and the multi-scale features
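
A much-simplified sketch of those three components (input convolution, a tiny U-shaped encoder-decoder, residual fusion); the depth, channel widths, and upsampling choices are placeholders, not the paper's RSU-7 configuration.

# Simplified RSU-style block: input conv + tiny U-shaped encoder-decoder + residual fusion.
# Depths and channel widths are illustrative, not the paper's exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBNReLU(nn.Module):
    def __init__(self, cin, cout, dilation=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(cout),
            nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class MiniRSU(nn.Module):
    def __init__(self, cin, cmid, cout):
        super().__init__()
        self.conv_in = ConvBNReLU(cin, cout)          # 1) input convolution layer
        self.enc1 = ConvBNReLU(cout, cmid)            # 2) small U-shaped encoder-decoder
        self.enc2 = ConvBNReLU(cmid, cmid)
        self.bottleneck = ConvBNReLU(cmid, cmid, dilation=2)
        self.dec2 = ConvBNReLU(cmid * 2, cmid)
        self.dec1 = ConvBNReLU(cmid * 2, cout)

    def forward(self, x):
        xin = self.conv_in(x)
        e1 = self.enc1(xin)
        e2 = self.enc2(F.max_pool2d(e1, 2))
        b = self.bottleneck(e2)
        d2 = self.dec2(torch.cat([b, e2], dim=1))
        d2 = F.interpolate(d2, size=e1.shape[2:], mode="bilinear", align_corners=False)
        d1 = self.dec1(torch.cat([d2, e1], dim=1))
        return d1 + xin                               # 3) residual connection fuses local + multi-scale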

Dataset - Link

  • Labelled image samples
  • Ground truth / training images
  • After 600k iterations (with a batch size of 12), the training loss converges and the whole training process takes about 120 hours


Creating this kind of ground-truth data is also key

Use cases

  • Remove background
  • Create portrait view

Paper #2 - BASNet: Boundary-Aware Salient Object Detection

Code - Link

Background removal tool - Link

Notes

  • Architecture is composed of a densely supervised Encoder-Decoder network and a residual refinement module
  • Hybrid loss - Binary Cross Entropy (BCE), Structural SIMilarity (SSIM) and Intersection-over-Union (IoU) losses
  • Code Link
  • It assembles a UNet-like [57] deeply supervised [31, 67] Encoder-Decoder network with a novel residual refinement module
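
A rough sketch of the three-term hybrid loss on a predicted saliency map; the SSIM term here uses a simplified uniform window rather than the Gaussian-windowed SSIM used in the paper.

# Hybrid loss sketch: BCE + SSIM + IoU on a predicted saliency map in [0, 1].
# The SSIM term uses a simplified uniform window, not the paper's exact implementation.
import torch
import torch.nn.functional as F

def iou_loss(pred, target, eps=1e-6):
    inter = (pred * target).sum(dim=(2, 3))
    union = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3)) - inter
    return (1.0 - (inter + eps) / (union + eps)).mean()

def ssim_loss(pred, target, C1=0.01 ** 2, C2=0.03 ** 2):
    mu_x = F.avg_pool2d(pred, 11, 1, 5)
    mu_y = F.avg_pool2d(target, 11, 1, 5)
    sigma_x = F.avg_pool2d(pred * pred, 11, 1, 5) - mu_x ** 2
    sigma_y = F.avg_pool2d(target * target, 11, 1, 5) - mu_y ** 2
    sigma_xy = F.avg_pool2d(pred * target, 11, 1, 5) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + C1) * (2 * sigma_xy + C2)) / \
           ((mu_x ** 2 + mu_y ** 2 + C1) * (sigma_x + sigma_y + C2))
    return (1.0 - ssim).mean()

def hybrid_loss(pred, target):
    return (F.binary_cross_entropy(pred, target)
            + ssim_loss(pred, target)
            + iou_loss(pred, target))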

Paper #3 - BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

Notes

  • Spatial Path with a small stride to preserve the spatial information and generate high-resolution features
  • Context Path with a fast downsampling strategy is employed to obtain sufficient receptive field
  • Spatial Path (SP) and Context Path (CP)
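
A toy sketch of the two-path idea: a shallow, wide Spatial Path that stops at 1/8 resolution, and a Context Path that keeps downsampling for a larger receptive field; the layer counts, channels, and concat fusion are simplifications, not the paper's architecture.

# Toy BiSeNet-style layout: Spatial Path (high resolution) + Context Path (fast downsampling).
# Layer counts, channel widths, and concat fusion are simplifications for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(cin, cout, stride):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=stride, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class TwoPathSegNet(nn.Module):
    def __init__(self, num_classes=19):
        super().__init__()
        # Spatial Path: three stride-2 convs -> 1/8 resolution, wide channels.
        self.spatial = nn.Sequential(conv_bn_relu(3, 64, 2),
                                     conv_bn_relu(64, 128, 2),
                                     conv_bn_relu(128, 256, 2))
        # Context Path: keeps downsampling quickly to enlarge the receptive field.
        self.context = nn.Sequential(conv_bn_relu(3, 32, 2),
                                     conv_bn_relu(32, 64, 2),
                                     conv_bn_relu(64, 128, 2),
                                     conv_bn_relu(128, 256, 2),
                                     conv_bn_relu(256, 256, 2))
        self.head = nn.Conv2d(256 + 256, num_classes, 1)

    def forward(self, x):
        sp = self.spatial(x)                                    # 1/8 resolution
        cp = self.context(x)                                    # 1/32 resolution
        cp = F.interpolate(cp, size=sp.shape[2:], mode="bilinear", align_corners=False)
        out = self.head(torch.cat([sp, cp], dim=1))
        return F.interpolate(out, size=x.shape[2:], mode="bilinear", align_corners=False)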


Paper #4 - BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation

Notes

This architecture involves: (i) a Detail Branch, with wide channels and shallow layers to capture low-level details and generate high-resolution feature representation; (ii) a Semantic Branch, with narrow channels and deep layers to obtain high-level semantic context

Keep Exploring!!!

February 09, 2022

Human Body and Emotions

 Paper - Link

Different emotions and the activation observed in the human body

Emotions and their behavior/combinations

Keep Exploring!!!