Introduction To TensorFlow
- 1,000+ contributors to TensorFlow outside Google
- TensorFlow is powered by a C++ backend
- TensorFlow is available in R, JavaScript, and Python
- Keras runs on top of deep learning libraries
- tf.keras - TensorFlow's implementation of Keras (since TF 1.9); vanilla Keras code can run in it
- Estimators
- Eager execution
- Deferred execution
First Example - https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/keras/basic_classification.ipynb
- Fashion MNIST: 60K images of articles of clothing
- Train a model from the training set
- This was a seamless tutorial
Data Import Code
- Download the data using the load_data() function
- Class labels are not included in the dataset, so the class names are initialized by hand
- Poke around the data with shapes and samples
- A tensor is an n-dimensional array; all tensors have a shape
- 60K images of 28 x 28 pixels
- ML is data and labels (60K labels)
- Normalize the data by dividing by 255.0
- Code to iterate through the dataset (see the sketch after this list)
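A minimal sketch of the data-import steps above, following the linked tutorial; the variable names are mine:

import tensorflow as tf
from tensorflow import keras

# Download the data using load_data()
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Class labels are not included in the dataset, so initialize the names by hand
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(train_images.shape)  # (60000, 28, 28): 60K images of 28 x 28 pixels
print(train_labels.shape)  # (60000,): ML is data and labels

# Normalize pixel values to [0, 1] by dividing by 255.0
train_images = train_images / 255.0
test_images = test_images / 255.0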
- Sequential - define the neural network as a stack of layers (one input, one output)
- 95% of the time the simplest APIs are sufficient
- Flatten - preprocessing - converts each 2D image into a vector: the pixels lined up as one long list of numbers
- Dense layer - every neuron computes features of the input
- More layers can capture a larger number of patterns
- Memorization (100% accuracy - overfit) vs generalization
- Smaller networks are more likely to generalize
- ReLU - an activation function
- Softmax - outputs a probability distribution over all classes
- Define the architecture
- Compile the network (provide an optimizer - backprop, gradient descent, Adam)
- Loss function (there is a bucket of loss functions; over time we learn which to pick)
- Fit is another word for training
- Pass the training images and labels
- Train for several epochs to find the optimal accuracy (see the compile-and-fit sketch after the code below)
- The network has a large number of weights (trainable parameters)
- Network with a single hidden layer
- Edges, patterns, and textures are learnt across the layers
- More layers - more complex patterns
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),       # 28 x 28 image -> 784-vector
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)  # probabilities over 10 classes
])
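A sketch of the compile-and-fit steps described above, following the same tutorial; the choice of 'adam' and five epochs is illustrative:

# Compile: provide an optimizer and a loss function
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Fit is another word for training: pass the training images and labels
model.fit(train_images, train_labels, epochs=5)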
Second Part - TensorFlow
- Data flow graph
- The first version of the API: define a dataflow graph
- Deferred execution - a Session is the execution environment for the dataflow graph (see the sketch after this list)
- Package for TensorFlow Probability (for stats folks)
- Sequence Parsing RNN
- https://quickdraw.withgoogle.com/ is awesome
- https://experiments.withgoogle.com/ai
- Sequence of vectors (Pen positions / vectors)
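A minimal sketch of deferred execution in the TF 1.x graph API (the numbers are arbitrary): building the graph computes nothing; the Session executes it.

import tensorflow as tf

# Building the dataflow graph - no computation happens yet
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b

# The Session is the execution environment for the graph
with tf.Session() as sess:
    print(sess.run(c))  # 6.0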
IMDB - a bunch of movie reviews; labels: 1 = positive, 0 = negative review
- High-dimensional data (images / text)
- 25K entries
- Each review is a different length
- Pad for uniform length
- Reverse dictionary (map indices back to words)
- Pad reviews to a length of 256 (see the sketch after this list)
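A sketch of loading and padding the reviews, following the tf.keras IMDB tutorial; value=0 assumes index 0 is reserved for the padding token, as in that tutorial:

from tensorflow import keras

imdb = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# Each review is a different length, so pad them all to 256
train_data = keras.preprocessing.sequence.pad_sequences(
    train_data, value=0, padding='post', maxlen=256)
test_data = keras.preprocessing.sequence.pad_sequences(
    test_data, value=0, padding='post', maxlen=256)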
- Building the model
- Embedding - adds a new dimension (each word index becomes a dense vector)
- Don't get too involved in the math in the beginning
- Create the layers / compile the model (see the sketch after this list)
- Embedding is a way to compress
- Better to generalize than to overfit
- Find the right values for hyperparameters - number of layers, epochs
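A sketch of the embedding model from the same tutorial; the 16-dimension embedding size is that tutorial's choice, not a rule:

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Embedding(10000, 16),               # each word index -> 16-dim vector
    keras.layers.GlobalAveragePooling1D(),           # compress the sequence by averaging
    keras.layers.Dense(16, activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)  # 1 = positive, 0 = negative
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])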
- Sketch-RNN - will do autocomplete
- Structured data prediction
- For regression (classification predicts a thing, regression predicts a number)
- Predict the price of a house in Boston
- 13 different features / columns
- Visualise datasets - https://pair-code.github.io/facets/
- Bucket datasets by age
- Understand the data
- Setting up the experiment matters more
- Stop training when the loss on validation data is no longer decreasing (see the sketch after this list)
- Average error over the entire dataset
- Decision trees are a great way to understand data
- Learning rate controls how much the weights are adjusted
- Normalizing data is very important
- Data collection, data cleaning, and feature engineering are key; modelling is the tail end of the job
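A sketch of the regression setup described above, loosely following the tf.keras Boston housing tutorial; the layer sizes, optimizer, and patience value are that tutorial's choices:

import tensorflow as tf
from tensorflow import keras

(train_data, train_labels), (test_data, test_labels) = keras.datasets.boston_housing.load_data()

# Normalizing data is very important: zero mean, unit variance per column
mean = train_data.mean(axis=0)
std = train_data.std(axis=0)
train_data = (train_data - mean) / std

model = keras.Sequential([
    keras.layers.Dense(64, activation=tf.nn.relu, input_shape=(13,)),  # 13 features
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(1)  # regression: predict a number, so no activation
])
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])

# Stop training when the loss on validation data is no longer decreasing
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)
model.fit(train_data, train_labels, epochs=500,
          validation_split=0.2, callbacks=[early_stop], verbose=0)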
https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/tutorials/keras/overfit_and_underfit.ipynb
Overfitting and Underfitting
- Earlier, each review was represented as a list of numbers
- One-hot encoding: when a word appears, its index is set to one in a vector over those ten thousand words (see the sketch after this list)
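A sketch of that multi-hot encoding, as in the linked notebook; train_data here holds the raw (unpadded) integer sequences straight from load_data:

import numpy as np
from tensorflow import keras

NUM_WORDS = 10000

(train_data, train_labels), (test_data, test_labels) = keras.datasets.imdb.load_data(num_words=NUM_WORDS)

def multi_hot_sequences(sequences, dimension):
    # For review i, set index j to 1.0 whenever word j appears in it
    results = np.zeros((len(sequences), dimension))
    for i, word_indices in enumerate(sequences):
        results[i, word_indices] = 1.0
    return results

train_data = multi_hot_sequences(train_data, dimension=NUM_WORDS)
test_data = multi_hot_sequences(test_data, dimension=NUM_WORDS)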
Code Walkthrough
- Creating multiple models
- With 4 units, 16 units, 512 units
- Overfitting with 512 units
bigger_model = keras.models.Sequential([
    keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
    keras.layers.Dense(512, activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])
bigger_model.compile(optimizer='adam',
                     loss='binary_crossentropy',
                     metrics=['accuracy', 'binary_crossentropy'])
bigger_model.summary()
- Another strategy: weight regularization
- Dropout is another technique
- Dropping / randomly setting to zero some of a layer's outputs during training
- Add dropout after each dense layer (see the sketch after this list)
- Early stopping is another great way to fight overfitting
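A sketch combining both strategies on the small text-classification model; the 0.001 L2 penalty and 0.5 dropout rate follow the notebook's values, and NUM_WORDS is reused from the snippet above:

import tensorflow as tf
from tensorflow import keras

l2_dropout_model = keras.models.Sequential([
    keras.layers.Dense(16, activation=tf.nn.relu,
                       kernel_regularizer=keras.regularizers.l2(0.001),
                       input_shape=(NUM_WORDS,)),
    keras.layers.Dropout(0.5),  # dropout added after each dense layer
    keras.layers.Dense(16, activation=tf.nn.relu,
                       kernel_regularizer=keras.regularizers.l2(0.001)),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])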
- A lot of machine learning is experimental
- An awesome resource would be a universal machine learning guide mapping:
- Task - architecture - papers
- Video classification - architecture - papers
- Text classification - architecture - papers
- ReLU and Sigmoid were the activation functions used earlier
- Combine structured and high-dimensional data - Keras
- TensorFlow.js - TensorFlow in JavaScript
- Eager execution - works like regular Python, makes debugging easier (see the sketch after this list)
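A minimal sketch of eager mode in TF 1.x (it is the default in TF 2.x); operations return values immediately, with no Session needed:

import tensorflow as tf

tf.enable_eager_execution()  # must be called at program start in TF 1.x

# Runs immediately, like regular Python - easy to print and debug
x = tf.constant([[1.0, 2.0]])
print(tf.matmul(x, tf.transpose(x)))  # tf.Tensor([[5.]], shape=(1, 1), ...)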
Text generation using RNN - https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/generative_examples/text_generation.ipynb
- Predict each character
- Create character-to-index and reverse mappings (see the sketch after this list)
- Each character has a corresponding mapping index
- The model expects numbers
- Embedding - each character gets a 256-dimension representation
- Word2vec - For creating representations
- Generate text of any sort
- The only input is a giant text file
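A sketch of the character mappings from the linked notebook; text stands for the giant input text file already read into a Python string:

import numpy as np

vocab = sorted(set(text))                         # unique characters in the corpus
char2idx = {ch: i for i, ch in enumerate(vocab)}  # character -> index
idx2char = np.array(vocab)                        # index -> character (reverse mapping)

# The model expects numbers, so encode the whole text as indices
text_as_int = np.array([char2idx[ch] for ch in text])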
More Readings
https://distill.pub/

Happy Mastering DL!!!