Paper #1 - Sequence Learning
This post captures a summary of key points from the Sequence Learning paper.
Introduction
- Multilayered Long Short-Term Memory (LSTM)
- DNNs can only be applied to problems whose inputs and targets can be sensibly encoded with vectors of fixed dimensionality
- We are mapping a sequence of words representing the question to a sequence of words representing the answer
- LSTM learns to map an input sentence of variable length into a fixed-dimensional vector representation
- Map the input sequence to a fixed-sized vector using one RNN
- The goal of the LSTM is to estimate the conditional probability p(y1, ..., yT' | x1, ..., xT) of the output sequence given the input sequence
- First, two different LSTMs are used: one for the input sequence and another for the output sequence (see the sketch after this list)
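A minimal sketch of the two-LSTM encoder-decoder described above, written in PyTorch. The vocabulary sizes, embedding/hidden dimensions, and tensor shapes are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Two separate LSTMs: one encodes the source, one decodes the target.
    The encoder's final hidden state is the fixed-dimensional summary of
    the variable-length input sequence."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: map the variable-length input to a fixed-size (h, c) state.
        _, (h, c) = self.encoder(self.src_emb(src_ids))
        # Decode: condition the output LSTM on that fixed-size state.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), (h, c))
        # Per-step logits over the target vocabulary, i.e. the terms of
        # the conditional probability p(y1..yT' | x1..xT).
        return self.out(dec_out)

# Toy usage (hypothetical shapes): batch of 2, source length 7, target length 5.
model = Seq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (2, 7))
tgt = torch.randint(0, 1200, (2, 5))
logits = model(src, tgt)  # shape: (2, 5, 1200)
```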
https://arxiv.org/pdf/1703.03906.pdf
NMT (Neural Machine Translation) - an end-to-end approach to automated translation
- Based on an encoder-decoder architecture consisting of two recurrent neural networks (RNNs) and an attention mechanism that aligns target with source tokens
- Shortcoming - the amount of compute required to train these models
- Encoder-decoder architecture with attention mechanism
- An encoder function fenc takes as input a sequence of source tokens x and produces a sequence of states h
- Decoder is an RNN that predicts the probability of a target sequence y
- The decoder RNN also uses a context vector, called the attention vector, which is calculated as a weighted average of the source states
- Commonly used attention mechanisms are the additive (Bahdanau) and multiplicative (Luong) variants
- Given an attention key h (an encoder state) and attention query s (a decoder state), the attention score for each pair is calculated and normalized with a softmax to produce the attention weights (see the sketch after this list)
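A minimal sketch of the attention step described above, using the additive (Bahdanau-style) scoring form. The layer names, hidden size, and toy shapes below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Additive attention: scores each encoder state h_j (key) against the
    current decoder state s (query), then returns the attention vector as
    the softmax-weighted average of the encoder states."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.w_key = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w_query = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, decoder_state, encoder_states):
        # decoder_state:  (batch, hidden)          -- the attention query s
        # encoder_states: (batch, src_len, hidden) -- the attention keys h
        query = self.w_query(decoder_state).unsqueeze(1)                  # (batch, 1, hidden)
        scores = self.v(torch.tanh(self.w_key(encoder_states) + query))  # (batch, src_len, 1)
        weights = torch.softmax(scores, dim=1)            # aligns the target step with source tokens
        context = (weights * encoder_states).sum(dim=1)   # weighted average of source states
        return context, weights.squeeze(-1)

# Toy usage: batch of 2, source length 7, hidden size 512.
attn = AdditiveAttention(hidden_dim=512)
s = torch.randn(2, 512)
h = torch.randn(2, 7, 512)
context, weights = attn(s, h)  # context: (2, 512), weights: (2, 7)
```

The context vector returned here is what the decoder RNN consumes at each step alongside the previously generated token.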
Happy Learning!!!