- Word representation
Featurized representation: word embedding
Visualizing word embeddings:
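High-dimensional embeddings are usually visualized by projecting them to 2-D, most often with t-SNE; a minimal sketch using plain PCA instead (toy embeddings assumed, chosen for illustration only):

```python
import numpy as np

# Toy embeddings (assumed): 5 words in a 4-D embedding space.
words = ["man", "woman", "king", "queen", "apple"]
E = np.array([
    [ 1.0,  0.9, 0.1, 0.0],
    [ 1.0, -0.9, 0.1, 0.0],
    [ 0.9,  0.9, 0.9, 0.1],
    [ 0.9, -0.9, 0.9, 0.1],
    [-0.1,  0.0, 0.0, 0.9],
])

def project_2d(E):
    """Project embeddings to 2-D with PCA (t-SNE is the usual nonlinear choice)."""
    centered = E - E.mean(axis=0)
    # SVD gives the principal directions; keep the top two.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:2].T

coords = project_2d(E)
for w, (x, y) in zip(words, coords):
    print(f"{w:>6}: ({x:+.2f}, {y:+.2f})")
```

In such a plot, related words (man/woman, king/queen) land near each other while unrelated words (apple) sit apart.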
- Using word embeddings
Learn word embeddings from a large text corpus (1-100B words).
(Or download pre-trained embedding online.)
Transfer embedding to new task with smaller training set.
(say, 100k words)
Optional: Continue to finetune the word embeddings with new data.
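Step 1-2 above often amounts to parsing a pre-trained embedding file; a minimal sketch assuming GloVe's plain-text format (a word followed by its vector on each line), with a tiny in-memory file standing in for the real download:

```python
import io
import numpy as np

# Stand-in for a downloaded pre-trained embedding file (GloVe text format assumed).
pretrained = io.StringIO(
    "orange 0.1 0.3 -0.2\n"
    "juice 0.2 0.25 -0.1\n"
)

def load_embeddings(f):
    """Parse 'word v1 v2 ...' lines into a word -> vector dict."""
    emb = {}
    for line in f:
        word, *vals = line.split()
        emb[word] = np.array(vals, dtype=float)
    return emb

emb = load_embeddings(pretrained)
print(len(emb), emb["juice"].shape)
```

These vectors would then initialize the embedding layer of the smaller task, frozen or fine-tuned per the optional step above.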
- Properties of word embeddings
Analogy: find w maximizing sim(ew, eking - eman + ewoman)
sim(u, v) = (u.T v) / (||u||2 * ||v||2)   (cosine similarity)
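The analogy search above can be sketched directly: compute cosine similarity against e_king - e_man + e_woman and take the best word (toy embeddings assumed):

```python
import numpy as np

# Toy embeddings (assumed) chosen so the gender direction is consistent.
emb = {
    "man":   np.array([ 1.0,  1.0,  0.1]),
    "woman": np.array([ 1.0, -1.0,  0.1]),
    "king":  np.array([ 0.9,  1.0,  0.9]),
    "queen": np.array([ 0.9, -1.0,  0.9]),
    "apple": np.array([-0.5,  0.2, -0.5]),
}

def sim(u, v):
    """Cosine similarity: u.T v / (||u||2 ||v||2)."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# "man is to king as woman is to ?": maximize sim(e_w, e_king - e_man + e_woman).
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in ("man", "king", "woman")),
           key=lambda w: sim(emb[w], target))
print(best)  # -> queen
```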
- Embedding matrix
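Multiplying the embedding matrix E by a one-hot vector o_j selects column j, the embedding e_j; a short sketch (toy vocabulary and random E assumed):

```python
import numpy as np

vocab = ["a", "juice", "orange", "want"]   # toy vocabulary (assumed)
n_embed, n_vocab = 3, len(vocab)
rng = np.random.default_rng(0)
E = rng.standard_normal((n_embed, n_vocab))  # embedding matrix, shape (n_embed, n_vocab)

j = vocab.index("orange")
o_j = np.zeros(n_vocab)
o_j[j] = 1.0            # one-hot vector for word j
e_j = E @ o_j           # E * o_j = e_j, the embedding of word j

# In practice a direct column lookup replaces the matrix product:
assert np.allclose(e_j, E[:, j])
print(e_j.shape)
```

The lookup form matters because the matrix product is wasteful when n_vocab is large.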
- Learning word embeddings
The softmax outputs the probability of the target word, e.g. p(juice).
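A minimal sketch of that language-model step: stack the context-word embeddings, apply a dense layer, and softmax over the vocabulary (toy vocabulary, 2-word context, and random weights all assumed):

```python
import numpy as np

vocab = ["a", "glass", "juice", "of", "orange"]   # toy vocabulary (assumed)
n_embed, n_vocab, context = 4, len(vocab), 2

rng = np.random.default_rng(1)
E = rng.standard_normal((n_embed, n_vocab))            # embedding matrix
W = rng.standard_normal((n_vocab, n_embed * context))  # output-layer weights
b = np.zeros(n_vocab)

def softmax(z):
    z = z - z.max()            # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Context "glass of" -> probability distribution over the next word.
x = np.concatenate([E[:, vocab.index("glass")], E[:, vocab.index("of")]])
p = softmax(W @ x + b)
print(f'p(juice) = {p[vocab.index("juice")]:.3f}')
```

Training backpropagates through W and E together, which is what makes the learned columns of E useful embeddings.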
- Negative sampling
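Negative sampling replaces the full softmax with k+1 binary classifiers: label 1 for the observed (context, target) pair and 0 for k sampled negatives. A loss sketch with toy random vectors (all values assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
n_embed = 4
e_c = rng.standard_normal(n_embed)              # context embedding (assumed)
theta_pos = rng.standard_normal(n_embed)        # parameters of the true target word
theta_negs = rng.standard_normal((3, n_embed))  # k = 3 sampled negative words

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# P(y=1 | c, t) = sigmoid(theta_t . e_c); standard logistic loss on k+1 pairs.
loss = -np.log(sigmoid(theta_pos @ e_c))          # push the true pair toward 1
loss -= np.log(sigmoid(-(theta_negs @ e_c))).sum()  # push negatives toward 0
print(f"negative-sampling loss: {loss:.3f}")
```

Each update touches only k+1 classifiers instead of all n_vocab softmax outputs, which is the point of the technique.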
- GloVe(global vectors for word representation)
i want a glass of orange juice to go along with my cereal.
Xij = # times word i (target t) appears in the context of word j (context c)
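Counting Xij from the example sentence can be sketched as below; the +/-2-word context window is an assumption (GloVe typically uses a larger window):

```python
from collections import Counter

sentence = "i want a glass of orange juice to go along with my cereal".split()
window = 2   # assumed context window size
X = Counter()
for c, w_c in enumerate(sentence):
    # Every word within the window of position c co-occurs with w_c.
    for t in range(max(0, c - window), min(len(sentence), c + window + 1)):
        if t != c:
            X[(sentence[t], w_c)] += 1

print(X[("orange", "juice")])   # "orange" appears once in the context of "juice"
```

With a symmetric window, Xij = Xji; GloVe then fits theta_i . e_j to log Xij.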
- Sentiment classification
sentiment classification problem:
simple sentiment classification model
RNN for sentiment classification
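The simple model above averages the review's word embeddings and feeds the average to a logistic output; a sketch with toy embeddings and weights (all values assumed):

```python
import numpy as np

# Toy embeddings and logistic-regression weights (assumed, for illustration).
emb = {
    "good":    np.array([ 1.0, 0.2]),
    "great":   np.array([ 1.1, 0.1]),
    "lacking": np.array([-0.9, 0.3]),
    "service": np.array([ 0.0, 1.0]),
}
w = np.array([2.0, 0.0])
b = 0.0

def predict(review):
    """Average the word embeddings, then P(positive) = sigmoid(w . avg + b)."""
    avg = np.mean([emb[t] for t in review.split() if t in emb], axis=0)
    return 1.0 / (1.0 + np.exp(-(w @ avg + b)))

print(f'{predict("great service"):.2f} vs {predict("lacking service"):.2f}')
```

Averaging ignores word order ("completely lacking in good service" looks positive word-by-word), which is why the RNN variant above reads the sequence instead.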
- Debiasing word embeddings
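A sketch of the "neutralize" step of debiasing: estimate a bias direction g from definitional pairs, then remove the component of a gender-neutral word's embedding along g (toy vectors assumed; the bias subspace is approximated as 1-D):

```python
import numpy as np

# Toy definitional pair (assumed) used to estimate the bias direction.
e_man   = np.array([1.0,  1.0, 0.0])
e_woman = np.array([1.0, -1.0, 0.0])
g = e_woman - e_man
g = g / np.linalg.norm(g)                # unit bias direction

e_doctor = np.array([0.5, 0.3, 0.8])     # a word that should be gender-neutral

def neutralize(e, g):
    """Project out the bias component: e - (e . g) g."""
    return e - (e @ g) * g

e_debiased = neutralize(e_doctor, g)
print(f"{e_debiased @ g:.6f}")           # component along g is now ~0
```

A companion "equalize" step would then place pairs like grandmother/grandfather symmetrically about the neutral axis.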