
LSTM & Intent Classification
Shijie Sun
12-23-2016
Outline

Introduction to RNNs
Why LSTM?
Training an LSTM Model for Intent Classification
How to Improve?
Current Results
Challenges & Future Work
Introduction to RNNs

NN (Neural Network): origins in algorithms that mimic the brain

[Figure omitted; source: Wikipedia]

Perceptron, Neuron, Activation Function


Loss Function, Feedforward, Backpropagation (see the sketch below)
ANN (Artificial), DNN (Deep), CNN (Convolutional), RNN (Recurrent)
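A minimal sketch of these ideas in plain NumPy: one neuron computes activation(w · x + b) on the feedforward pass, and backpropagation applies the chain rule to update the weights. All values here are illustrative, not from the talk.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])    # input features
w = np.array([0.1, 0.4, -0.2])    # weights
b = 0.05                          # bias

# Feedforward: y_hat = activation(w . x + b)
y_hat = sigmoid(np.dot(w, x) + b)

# Loss (squared error against a target) and one backpropagation step
y = 1.0                                          # target label
loss = 0.5 * (y_hat - y) ** 2
grad_w = (y_hat - y) * y_hat * (1 - y_hat) * x   # chain rule: dL/dw
w = w - 0.1 * grad_w                             # gradient descent, lr = 0.1
```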
Introduction to RNNs

RNNs (Recurrent Neural Networks)

Make use of sequential information: the hidden state carries context across time steps (see the sketch below)


LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit)
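What "sequential information" means in code, as a minimal sketch: the hidden state h is fed back in at every step, so the output at time t depends on all earlier inputs. Sizes and weights are illustrative.

```python
import numpy as np

W_xh = np.random.randn(16, 8) * 0.1    # input-to-hidden weights
W_hh = np.random.randn(16, 16) * 0.1   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(16)

def rnn_step(x_t, h_prev):
    # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(16)                       # initial hidden state
for x_t in np.random.randn(5, 8):      # a toy sequence of 5 input vectors
    h = rnn_step(x_t, h)               # each step reuses the previous state
```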
Why LSTM?

Plain RNNs are hard to train! (Backpropagation Through Time, BPTT)


RNNs trained with BPTT have difficulties learning long-term dependencies
(vanishing/exploding gradient problem).
New structures address this: LSTM (Long Short-Term Memory), etc. (a one-step cell is sketched below)

[Figures omitted; sources: Wikipedia and arXiv:1503.04069 [cs.NE]]

Further Reference: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
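Why the gradients vanish, sketched: in BPTT the gradient reaching step k is a product of per-step Jacobians,

\[
\frac{\partial L}{\partial h_k} = \frac{\partial L}{\partial h_T} \prod_{t=k}^{T-1} \frac{\partial h_{t+1}}{\partial h_t},
\]

so when the Jacobian norms stay below 1 the product decays exponentially in T - k (vanishing), and when they stay above 1 it grows (exploding). The LSTM's additive cell-state update avoids the long chain of multiplications. A minimal one-step LSTM cell in NumPy (gate equations follow the Colah post above; names, sizes, and initial values are illustrative, not the author's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

H, X = 16, 8                          # hidden and input sizes (illustrative)
Wf, Wi, Wo, Wc = (np.random.randn(H, H + X) * 0.1 for _ in range(4))
bf, bi, bo, bc = (np.zeros(H) for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z + bf)                    # forget gate
    i = sigmoid(Wi @ z + bi)                    # input gate
    o = sigmoid(Wo @ z + bo)                    # output gate
    c = f * c_prev + i * np.tanh(Wc @ z + bc)   # additive cell-state update
    h = o * np.tanh(c)                          # hidden state / output
    return h, c

h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(np.random.randn(X), h, c)
```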


Training an LSTM Model for Intent Classification

TensorFlow
Data Preprocessing
  Tokenize words
  Word embedding
Mini-batched Gradient Descent (see the sketch after this list)
  Mini-batch
  Feedforward
  Loss function: cross entropy
  Backpropagation
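A minimal sketch of this pipeline in the TensorFlow 1.x-style API of the time. All sizes, variable names, and the optimizer choice are illustrative assumptions, not the author's actual code.

```python
import tensorflow as tf

VOCAB, EMBED, CELL, CLASSES = 10000, 128, 64, 14   # illustrative sizes

tokens = tf.placeholder(tf.int32, [None, None])    # mini-batch of token ids
lengths = tf.placeholder(tf.int32, [None])         # true length of each sequence
labels = tf.placeholder(tf.int32, [None])          # intent label per utterance

# Word embedding: map token ids to dense vectors
embedding = tf.get_variable("embedding", [VOCAB, EMBED])
inputs = tf.nn.embedding_lookup(embedding, tokens)

# LSTM over the sequence; keep the final hidden state
cell = tf.nn.rnn_cell.BasicLSTMCell(CELL)
_, state = tf.nn.dynamic_rnn(cell, inputs, sequence_length=lengths,
                             dtype=tf.float32)

# Project the final state to class logits (feedforward layer)
W = tf.get_variable("W", [CELL, CLASSES])
b = tf.get_variable("b", [CLASSES], initializer=tf.zeros_initializer())
logits = tf.matmul(state.h, W) + b

# Cross-entropy loss, minimized with mini-batched gradient descent;
# backpropagation is handled by the framework.
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
```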
How to Improve?

Split the dataset to detect and prevent overfitting (train / validation / test sets)
Generalize raw data based on patterns
Adjust the learning rate dynamically
Stopping criteria (early stopping)
Tune the hyperparameters (forget ratio, cell size, learning rate, etc.)
Fixed sequence length vs. dynamic sequence length
Weight the loss function (sketched below, together with learning-rate decay)
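Two of these ideas, sketched in the same TensorFlow 1.x style. The class weights and decay schedule are illustrative assumptions; logits and labels are the tensors from the training sketch above.

```python
import tensorflow as tf

# Per-class loss weighting: up-weight rare domains so frequent ones
# do not dominate the gradient. Weight values here are made up.
class_weights = tf.constant([1.0] * 10 + [2.0] * 4)   # 14 domains

def weighted_cross_entropy(logits, labels):
    per_example = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
    weights = tf.gather(class_weights, labels)   # weight of each example's class
    return tf.reduce_mean(per_example * weights)

# Dynamic learning rate: decay the rate as training progresses.
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(
    0.1, global_step, decay_steps=1000, decay_rate=0.9)
```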
Current Results

Overall Accuracy: 93.1%

Domain     Precision  Recall      Domain     Precision  Recall
None       90.5%      92.1%       Hotel      90.6%      56.9%
Taxi       92.8%      91.8%       Flower     93.2%      93.5%
Weather    99.1%      99.6%       Repair     91.7%      71.4%
Clean      90.4%      97.4%       Paotui     85.3%      85.3%
Massage    74.4%      93.5%       Coffee     93.1%      95.0%
Air        95.7%      95.3%       Reminder   96.3%      96.3%
Train      95.7%      95.1%       Complain   89.4%      76.3%

Challenges & Future Work

For training data
  Few samples for some domains
  Incorrect labels
  Confusing / inconsistent judgements

For the model
  Lack of previous context information
  Predictions overly oriented toward key words
  Various interpretations of the same utterance
  Similar expressions across different intents
  Intent that changes within a conversation
Challenges & Future Work

For data labeling
  Decide whether to label all sequences or only the important ones
  Unify the labeling judgement

For model improvement
  Add more data manually to improve performance on hard-to-understand patterns and rare situations
  Incorporate context, i.e. the preceding sequences or their labels
    Potential: about one third of the wrong predictions could be corrected
    Challenges: qualified training data, how to make use of the context
Thank you!
References
[1] Colah's blog. http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[2] WildML blog series. http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
