"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 16, 2018

Day #164 - TensorFlow and Deep Learning (v1.9)

Key Lessons
Introduction To TensorFlow
  • 1,000+ contributors to TensorFlow outside Google
  • TensorFlow is powered by a C++ backend
  • TensorFlow APIs are available in R, JavaScript, and Python
API Styles
  • Keras runs on top of deep learning libraries
  • tf.keras - TensorFlow's implementation of Keras; vanilla Keras code can run under tf.keras (as of v1.9)
  • Estimators
  • Eager execution
  • Deferred execution (the eager style is sketched after this list; the deferred style comes up again in the second part)
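The two styles differ in when operations actually run. A minimal sketch of the eager style, assuming TF 1.9 where eager has to be switched on explicitly at program start:

import tensorflow as tf

tf.enable_eager_execution()  # call once, before any other TF ops

# Operations now execute immediately and return concrete values,
# like regular Python - no graph or session needed
x = tf.constant([[2.0, 3.0]])
w = tf.constant([[1.0], [4.0]])
print(tf.matmul(x, w))  # tf.Tensor([[14.]], shape=(1, 1), dtype=float32)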
All code tried on - https://colab.research.google.com
First Example - https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/keras/basic_classification.ipynb
  • Fashion MNIST - 60K images of clothing articles
  • Train a model on the training set
  • This was a seamless tutorial
Code Explanations
Data Import Code
  • Download the data using the load_data() function
  • Class names are not included in the dataset, so they are initialized by hand
  • Poke around the data by inspecting shapes and values (see the sketch after this list)
  • A tensor is an n-dimensional array; all tensors have a shape
  • 60K images of 28 x 28 pixels
  • ML is data plus labels (60K labels here)
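A rough sketch of the import step as in the notebook (the class-name list is the standard set of Fashion MNIST labels):

from tensorflow import keras

# load_data() downloads and caches the dataset on first use
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# The dataset only carries integer labels 0-9; names are supplied by hand
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(train_images.shape)  # (60000, 28, 28) - 60K images of 28 x 28 pixels
print(len(train_labels))   # 60000 - one label per image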
ML Training Code
  • Normalize the data by dividing by 255.0
  • Code to iterate through the dataset and eyeball examples (see the sketch after this list)
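Roughly the normalization and inspection code, continuing from the load sketch above (pixel values arrive as 0-255 integers, so dividing by 255.0 rescales them to the 0-1 range):

import matplotlib.pyplot as plt

# Rescale pixels from [0, 255] to [0.0, 1.0]
train_images = train_images / 255.0
test_images = test_images / 255.0

# Iterate through the first 25 examples to eyeball images and labels
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()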
Model
  • Sequential - define the neural network as a stack of layers (one input, one output)
  • 95% of the time the simplest APIs are sufficient
  • Flatten - preprocessing: converts each 2D image into a vector - the 28 x 28 grid becomes one long list of numbers
  • Dense layer - every neuron computes features of the input
  • More layers can capture a larger number of patterns
  • Memorization (100% accuracy - overfit) vs generalization
  • Smaller networks are more likely to generalize
  • ReLU - an activation function
  • Softmax - outputs a probability distribution over all classes
Keras
  • Define the architecture
  • Compile the network - provide an optimizer (backprop / gradient descent / Adam)
  • Pick a loss function (there is a bucket of loss functions; over time we learn which one fits)
ML Training Code
  • "Fit" is the other word for training (see the sketch after this list)
  • Pass in the training images and labels
  • Train for several epochs to find the optimal accuracy
  • The network has a large number of weights (parameters)
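A sketch of the compile-and-fit step for the model defined below (the loss and epoch count follow the tutorial's choices, but treat the exact values as assumptions):

model.compile(optimizer='adam',                        # a gradient descent variant
              loss='sparse_categorical_crossentropy',  # integer labels, 10 classes
              metrics=['accuracy'])

# "Fit" = train: pass the images and labels, loop over the data for 5 epochs
model.fit(train_images, train_labels, epochs=5)

model.summary()  # shows the large number of weights (parameters)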
Takeaway
  • Network with a single hidden layer
  • Edges, patterns, and textures are learnt across the layers
  • More layers - more complex patterns
Modify to a bigger network
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),       # 28 x 28 image -> 784-long vector
    keras.layers.Dense(128, activation=tf.nn.relu),   # first hidden layer
    keras.layers.Dense(128, activation=tf.nn.relu),   # second hidden layer (the modification)
    keras.layers.Dense(10, activation=tf.nn.softmax)  # probability distribution over 10 classes
])

Second Part - TensorFlow
  • Dataflow graphs
  • The first version of the API had you define a dataflow graph up front
  • Deferred execution - a Session is the execution environment for the dataflow graph (sketch below)
  • There is a TensorFlow Probability package (for the stats folks)
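A minimal sketch of the deferred style in TF 1.x: define the dataflow graph first, then run it inside a Session:

import tensorflow as tf

# Phase 1: build the dataflow graph - nothing is computed yet
a = tf.constant(2.0)
b = tf.constant(3.0)
total = a + b

# Phase 2: the Session is the execution environment for the graph
with tf.Session() as sess:
    print(sess.run(total))  # 5.0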
Second Part
  • Sequence parsing with an RNN
  • https://quickdraw.withgoogle.com/ is awesome
  • https://experiments.withgoogle.com/ai
  • Input is a sequence of vectors (pen positions / strokes)
Next Experiment - https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/keras/basic_text_classification.ipynb

IMDB - a bunch of movie reviews with labels: 1 = positive, 0 = negative
  • High-dimensional data (images / text)
  • 25K entries
  • Each review is a different length
  • Pad reviews so they have a uniform length
  • Build a reverse dictionary to map word indices back to words
  • Pad each review out to a length of 256
  • Building the model (see the sketch after this list)
  • Embedding - adds a new dimension to the data
  • Don't get too involved in the math at the beginning
  • Create the layers / compile the model
  • An embedding is a way to compress the representation
  • Better to generalize than to overfit
  • Find the right values for hyperparameters - number of layers, epochs
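A sketch of the padding and model-building steps, roughly following the notebook (the 10,000-word vocabulary and 16-dimension embedding match the tutorial defaults, but treat the exact numbers as assumptions):

import tensorflow as tf
from tensorflow import keras

vocab_size = 10000
(train_data, train_labels), _ = keras.datasets.imdb.load_data(num_words=vocab_size)

# Reviews are different lengths; pad every one of them out to 256 tokens
train_data = keras.preprocessing.sequence.pad_sequences(
    train_data, value=0, padding='post', maxlen=256)

model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 16),  # each word index -> 16-dim vector
    keras.layers.GlobalAveragePooling1D(),   # average the vectors over the review
    keras.layers.Dense(16, activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)  # probability of positive
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])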
Next Experiment - https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/tutorials/keras/basic_regression.ipynb
  • Sketch-RNN will autocomplete drawings
  • Structured data prediction
  • This one is regression (classification predicts a thing, regression predicts a number)
  • Predict the price of a house in Boston
  • 13 different features / columns
  • Visualise datasets - https://pair-code.github.io/facets/
  • Bucket the dataset by age
  • Understand the data
  • Setting up the experiment matters more
  • Stop training when the loss on validation data is no longer decreasing (see the sketch after this list)
  • Report the average error over the entire dataset
  • Decision trees are a great way to understand data
  • The learning rate controls how much the weights are adjusted
  • Normalizing the data is very important
  • Data collection, data cleaning and feature engineering are key; modelling is the tail-end job
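A sketch of the "stop when validation loss is no longer decreasing" idea via the EarlyStopping callback; model, train_data and train_labels stand in for the Boston housing setup from the notebook, and the patience value is my assumption:

from tensorflow import keras

# Stop once val_loss has failed to improve for 20 consecutive epochs
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)

history = model.fit(train_data, train_labels,
                    epochs=500,
                    validation_split=0.2,  # hold out 20% to watch val_loss
                    callbacks=[early_stop])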
Next Experiment - https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/tutorials/keras/overfit_and_underfit.ipynb

Overfitting and Underfitting
  • Earlier, each review was represented as a list of word indices
  • One-hot (multi-hot) encoding: a ten-thousand-long vector where an index is set to one whenever that word appears in the review (sketch below)
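A sketch of that encoding (multi-hot over a 10,000-word vocabulary, as in the notebook; train_sequences is a stand-in name for the loaded IMDB reviews):

import numpy as np

NUM_WORDS = 10000

def multi_hot_sequences(sequences, dimension):
    # One row of length 10,000 per review; 1.0 wherever a word appears
    results = np.zeros((len(sequences), dimension))
    for i, word_indices in enumerate(sequences):
        results[i, word_indices] = 1.0
    return results

train_data = multi_hot_sequences(train_sequences, dimension=NUM_WORDS)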
Code Walkthrough
  • Creating multiple models
  • With 4 units, 16 units, and 512 units
  • The 512-unit model overfits
bigger_model = keras.models.Sequential([
    keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(NUM_WORDS,)),  # wide hidden layer
    keras.layers.Dense(512, activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)  # single positive/negative output
])

bigger_model.compile(optimizer='adam',
                     loss='binary_crossentropy',
                     metrics=['accuracy','binary_crossentropy'])

bigger_model.summary()
  • Another strategy: weight regularization
  • Dropout is another technique
  • Dropout works by randomly setting some of a layer's activations to zero during training
  • Add dropout after each dense layer (see the sketch after this list)
  • Early stopping is another great tool
  • A lot of machine learning is experimental
  • We are still some way from having awesome resources like a universal machine learning guide
  • Instead the mapping goes: task - architecture - papers
  • Video classification - architecture - papers
  • Text classification - architecture - papers
  • ReLU and sigmoid were the activation functions used earlier
  • Keras can combine structured and high-dimensional data
  • TensorFlow.js - TensorFlow in JavaScript
  • Eager mode works like regular Python and makes debugging easier
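Sketches of the two regularization ideas mentioned above, in the spirit of the notebook (layer sizes and rates are assumptions):

import tensorflow as tf
from tensorflow import keras

# Dropout: after each dense layer, randomly zero a fraction of activations
dropout_model = keras.Sequential([
    keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(512, activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])

# Weight regularization: an L2 penalty that discourages large weights
l2_model = keras.Sequential([
    keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                       activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
    keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                       activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])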
Text generation using RNN - https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/generative_examples/text_generation.ipynb
  • Predict each character
  • Create a character-to-index mapping and its reverse (see the sketch after this list)
  • Each character has a corresponding index
  • The model expects numbers, not characters
  • Embedding - each character gets a 256-dimension representation
  • Word2vec - a technique for creating such representations for words
  • Can generate text of any sort
  • The only input is a giant text file
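A sketch of the character-mapping step (the input path is a placeholder for whatever giant text file is used):

import numpy as np

text = open('input.txt').read()  # the only input: one giant text file
vocab = sorted(set(text))        # the unique characters in the corpus

# Character -> index, and the reverse mapping back to characters
char2idx = {ch: i for i, ch in enumerate(vocab)}
idx2char = np.array(vocab)

# The model expects numbers, so encode the whole text as integer indices
text_as_int = np.array([char2idx[ch] for ch in text])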
More Readings
https://distill.pub/

Happy Mastering DL!!!
