Introduction

This chapter introduces methods for turning text into numerical vectors. It also introduces TensorFlow's embedding feature.


Bag of Words

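Here we use TensorFlow to build a bag-of-words representation, in which each word is one-hot encoded and a text is represented by its word vectors taken together. We feed this representation into a logistic regression to predict whether a text message is spam or ham.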
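To make the representation concrete, here is a minimal sketch in plain Python rather than TensorFlow. It is not the notebook's code: the helper names `build_vocabulary` and `bag_of_words`, the whitespace tokenizer, and the sample messages are all assumptions made for illustration. Each text becomes a count vector, which is the sum of the one-hot vectors of its words.

```python
from collections import Counter

def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index, in first-seen order."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def bag_of_words(text, vocab):
    """Return a count vector: entry i is how often vocabulary word i occurs in text."""
    counts = Counter(text.lower().split())
    vector = [0] * len(vocab)
    for token, count in counts.items():
        if token in vocab:  # words outside the vocabulary are ignored
            vector[vocab[token]] = count
    return vector

# Hypothetical sample messages, not taken from the spam/ham data set.
messages = ["win a free prize now", "are we meeting now", "free free prize"]
vocab = build_vocabulary(messages)
vectors = [bag_of_words(m, vocab) for m in messages]
```

Vectors like these have one column per vocabulary word, so they can serve directly as the input features of a logistic regression.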

Download this chapter's Jupyter Notebook


TF-IDF (Term Frequency-Inverse Document Frequency)

We implement Term Frequency-Inverse Document Frequency (TF-IDF) with a combination of scikit-learn and TensorFlow.
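As a rough illustration of what TF-IDF computes, the sketch below uses only the Python standard library. It assumes the textbook weighting, term frequency times log(N / document frequency), with whitespace tokenization. scikit-learn's vectorizer applies a smoothed, normalized variant by default, so its numbers will differ. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical sample documents for illustration only.
docs = ["free prize now", "call me now", "free free entry"]
scores = tf_idf(docs)
```

Under this weighting, a term that appears in every document gets a weight of zero, while a term unique to one document is weighted most heavily.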

Download this chapter's Jupyter Notebook


Using Skip-Gram

We build our first implementation of Word2Vec, called "skip-gram", on a movie review database.
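Skip-gram trains a model to predict the surrounding words from a target word, so the training data is a set of (target, context) pairs. The following sketch shows how such pairs could be generated. The function name `skip_gram_pairs`, the window size, and the sample sentence are assumptions for illustration, not the notebook's code.

```python
def skip_gram_pairs(tokens, window_size=2):
    """Return (target, context) pairs for each word within window_size of a target."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window_size)
        end = min(len(tokens), i + window_size + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

# Hypothetical sample sentence, split on whitespace.
sentence = "the movie was great".split()
pairs = skip_gram_pairs(sentence, window_size=1)
```

In a full implementation, the words in each pair would typically be replaced by integer vocabulary indices before being fed to the embedding layer.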

Download this chapter's Jupyter Notebook


CBOW (Continuous Bag of Words)

Next, we implement a form of Word2Vec called "CBOW" (Continuous Bag of Words) on a movie review database. We also introduce a method for saving and loading word embeddings.
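CBOW reverses the direction of skip-gram: the model predicts a target word from its surrounding context words. A minimal sketch of how the training examples could be assembled is shown here. The function name `cbow_examples`, the window size, and the sample sentence are hypothetical.

```python
def cbow_examples(tokens, window_size=1):
    """Return (context_words, target) examples, one per word that has any context."""
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window_size):i]
        right = tokens[i + 1:i + 1 + window_size]
        context = left + right
        if context:  # skip words with no neighbors (single-word inputs)
            examples.append((context, target))
    return examples

# Hypothetical sample sentence, split on whitespace.
sentence = "the movie was great".split()
examples = cbow_examples(sentence, window_size=1)
```

Words at the edges of a sentence get a shorter context here. An implementation that needs fixed-size inputs might instead pad the context or drop those examples.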


Download this chapter's Jupyter Notebook


Word2Vec Application Example

In this example, we use the previously saved CBOW word embeddings to improve on our TF-IDF logistic regression of movie review sentiment.


Download this chapter's Jupyter Notebook


Doc2Vec Sentiment Analysis

Here, we introduce a Doc2Vec method (concatenation of document and word embeddings) to improve our logistic model of movie review sentiment.
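The concatenation idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the word embeddings of a review are averaged before being concatenated with the document embedding, and the vectors below are made-up numbers. The notebook may combine the embeddings differently, and the helper names `average` and `doc2vec_features` are hypothetical.

```python
def average(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def doc2vec_features(doc_embedding, word_embeddings):
    """Concatenate the averaged word embeddings with the document embedding."""
    return average(word_embeddings) + list(doc_embedding)

# Made-up embeddings: two 2-dimensional word vectors, one 3-dimensional doc vector.
word_embeddings = [[1.0, 0.0], [0.0, 1.0]]
doc_embedding = [0.2, -0.1, 0.7]
features = doc2vec_features(doc_embedding, word_embeddings)
```

The resulting feature vector has one entry per word-embedding dimension plus one per document-embedding dimension, and it is this combined vector that a logistic model would take as input.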

Download this chapter's Jupyter Notebook

Teaching a Neural Network to Play Tic-Tac-Toe

Given a set of tic-tac-toe boards and corresponding optimal moves, we train a neural network classification model to play. At the end of the script, we can attempt to play against the trained model.

Download this chapter's Jupyter Notebook

Modules Covered in This Chapter

tensorflow.zeros

Creates a tensor with all elements set to zero.

This operation returns a tensor of type dtype with shape shape and all elements set to zero.

>>> tf.zeros([3, 4], tf.int32)
<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]], dtype=int32)>
param shape: A list of integers, a tuple of integers, or a 1-D Tensor of type int32.
param dtype: The DType of an element in the resulting Tensor.
param name: Optional string. A name for the operation.
returns: A Tensor with all elements set to zero.

tensorflow.ones

Creates a tensor with all elements set to one (1).

See also tf.ones_like.

This operation returns a tensor of type dtype with shape shape and all elements set to one.

>>> tf.ones([3, 4], tf.int32)
<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]], dtype=int32)>
param shape: A list of integers, a tuple of integers, or a 1-D Tensor of type int32.
param dtype: Optional DType of an element in the resulting Tensor. Default is tf.float32.
param name: Optional string. A name for the operation.
returns: A Tensor with all elements set to one (1).