Commit 0dbc90d4 authored by Konstantinos Bousmalis's avatar Konstantinos Bousmalis

Merge from upstream repo.

parents 1a51e3c9 d6bee2c7
......@@ -2,9 +2,12 @@
This repository contains machine learning models implemented in
[TensorFlow](https://tensorflow.org). The models are maintained by their
respective authors.
respective authors. To propose a model for inclusion, please submit a pull
request.
To propose a model for inclusion please submit a pull request.
Currently, the models are compatible with TensorFlow 1.0 or later. If you are
running TensorFlow 0.12 or earlier, please
[upgrade your installation](https://www.tensorflow.org/install).
## Models
......
import tensorflow as tf
import numpy as np
class VariationalAutoencoder(object):
......@@ -57,8 +56,8 @@ class VariationalAutoencoder(object):
def generate(self, hidden = None):
if hidden is None:
hidden = np.random.normal(size=self.weights["b1"])
return self.sess.run(self.reconstruction, feed_dict={self.z_mean: hidden})
hidden = self.sess.run(tf.random_normal([1, self.n_hidden]))
return self.sess.run(self.reconstruction, feed_dict={self.z: hidden})
def reconstruct(self, X):
return self.sess.run(self.reconstruction, feed_dict={self.x: X})
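For illustration, a minimal sketch (not from the repo) of driving the corrected `generate` path; it assumes the object exposes `sess`, `z`, `reconstruction`, and `n_hidden` as in the snippet above:
```python
import numpy as np

def sample_images(vae, n_samples=1):
    # Draw latent codes from the N(0, I) prior and run them through the
    # decoder; `vae` is assumed to expose `sess`, `z`, `reconstruction`,
    # and `n_hidden` as in the class above.
    z = np.random.normal(size=(n_samples, vae.n_hidden))
    return vae.sess.run(vae.reconstruction, feed_dict={vae.z: z})
```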
......
......@@ -4,10 +4,17 @@
## Introduction
This is the code for the "Domain Separation Networks" paper
by Bousmalis K., Trigeorgis G., et al., which was presented at NIPS 2016. The
paper can be found here: https://arxiv.org/abs/1608.06019.
## Contact
This code was open-sourced by [Konstantinos Bousmalis](https://github.com/bousmalis) ([email protected]).
## Installation
You will need to have the following installed on your machine before trying out the DSN code.
......@@ -19,12 +26,20 @@ You will need to have the following installed on your machine before trying out
Although we are making the code available, you are only able to use the MNIST
provider for now. We will soon provide a script to download and convert MNIST-M
as well. Check back here in a few weeks or wait for a relevant announcement from
[@bousmalis](https://twitter.com/bousmalis).
## Running the code for adapting MNIST to MNIST-M
In order to run the MNIST to MNIST-M experiments with DANNs and/or DANNs with
domain separation (DSNs) you will need to set the directory you used to download
MNIST and MNIST-M:
```
$ export DSN_DATA_DIR=/your/dir
......
......@@ -57,7 +57,7 @@ tf.app.flags.DEFINE_string(
'eval_dir', '/tmp/da/',
'Directory where we should write the tf summaries to.')
tf.app.flags.DEFINE_string('dataset_dir', '/cns/ok-d/home/konstantinos/cad_learning/',
tf.app.flags.DEFINE_string('dataset_dir', None,
'The directory where the dataset files are stored.')
tf.app.flags.DEFINE_string('dataset', 'mnist_m',
......
......@@ -37,13 +37,12 @@ The code base provides three core binaries for:
errors to fine tune the network weights.
The training procedure employs synchronous stochastic gradient descent across
multiple GPUs. The user may specify the number of GPUs they wish harness. The
multiple GPUs. The user may specify the number of GPUs they wish to harness. The
synchronous training performs *batch-splitting* by dividing a given batch across
multiple GPUs.
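As an illustration of the batch-splitting idea, here is a hedged sketch (not the repo's code; `tower_fn` is a hypothetical function that builds one replica and returns its loss, and variable sharing between towers is elided):
```python
import tensorflow as tf

def synchronous_train_op(images, labels, num_gpus, tower_fn, lr=0.1):
    # Split one batch into num_gpus sub-batches, one model replica per GPU.
    image_splits = tf.split(images, num_gpus, axis=0)
    label_splits = tf.split(labels, num_gpus, axis=0)
    opt = tf.train.GradientDescentOptimizer(lr)
    tower_grads = []
    for i in range(num_gpus):
        with tf.device('/gpu:%d' % i):
            loss = tower_fn(image_splits[i], label_splits[i])
            tower_grads.append(opt.compute_gradients(loss))
    # Synchronous step: average each variable's gradient across the towers,
    # then apply a single update.
    averaged = []
    for grads_and_vars in zip(*tower_grads):
        grads = [g for g, _ in grads_and_vars]
        averaged.append((tf.add_n(grads) / float(num_gpus), grads_and_vars[0][1]))
    return opt.apply_gradients(averaged)
```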
The training setup is nearly identical to the section [Training a Model Using
Multiple GPU Cards]
(https://www.tensorflow.org/tutorials/deep_cnn/index.html#training-a-model-using-multiple-gpu-cards)
Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards)
where we have substituted the CIFAR-10 model architecture with Inception v3. The
primary differences with that setup are:
......@@ -52,8 +51,7 @@ primary differences with that setup are:
* Specify the model architecture using a (still experimental) higher level
language called TensorFlow-Slim.
For more details about TensorFlow-Slim, please see the [Slim README]
(inception/slim/README.md). Please note that this higher-level language is still
For more details about TensorFlow-Slim, please see the [Slim README](inception/slim/README.md). Please note that this higher-level language is still
*experimental* and the API may change over time depending on usage and
subsequent research.
......@@ -71,8 +69,7 @@ downloading and converting ImageNet data to TFRecord format. Downloading and
preprocessing the data may take several hours (up to half a day) depending on
your network and computer speed. Please be patient.
To begin, you will need to sign up for an account with [ImageNet]
(http://image-net.org) to gain access to the data. Look for the sign up page,
To begin, you will need to sign up for an account with [ImageNet](http://image-net.org) to gain access to the data. Look for the sign up page,
create an account and request an access key to download the data.
After you have `USERNAME` and `PASSWORD`, you are ready to run our script. Make
......@@ -101,9 +98,9 @@ The final line of the output script should read:
2016-02-17 14:30:17.287989: Finished writing all 1281167 images in data set.
```
When the script finishes you will find 1024 and 128 training and validation
files in the `DATA_DIR`. The files will match the patterns `train-????-of-1024`
and `validation-?????-of-00128`, respectively.
When the script finishes, you will find 1024 training files and 128 validation
files in the `DATA_DIR`. The files will match the patterns
`train-?????-of-01024` and `validation-?????-of-00128`, respectively.
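As a quick sanity check (a hypothetical snippet; it assumes `DATA_DIR` is exported as in the steps above), you can count the shards from Python:
```python
import glob
import os

data_dir = os.environ['DATA_DIR']
print(len(glob.glob(os.path.join(data_dir, 'train-?????-of-01024'))))       # expect 1024
print(len(glob.glob(os.path.join(data_dir, 'validation-?????-of-00128'))))  # expect 128
```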
[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You are now
ready to train or evaluate with the ImageNet data set.
......@@ -114,15 +111,12 @@ ready to train or evaluate with the ImageNet data set.
intensive task and depending on your compute setup may take several days or even
weeks.
*Before proceeding* please read the [Convolutional Neural Networks]
(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial in
particular focus on [Training a Model Using Multiple GPU Cards]
(https://www.tensorflow.org/tutorials/deep_cnn/index.html#training-a-model-using-multiple-gpu-cards)
. The model training method is nearly identical to that described in the
*Before proceeding* please read the [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial; in
particular, focus on [Training a Model Using Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards). The model training method is nearly identical to that described in the
CIFAR-10 multi-GPU model training. Briefly, the model training
* Places an individual model replica on each GPU. Split the batch across the
GPUs.
* Places an individual model replica on each GPU.
* Splits the batch across the GPUs.
* Updates model parameters synchronously by waiting for all GPUs to finish
processing a batch of data.
......@@ -248,11 +242,9 @@ We term each machine that maintains model parameters a `ps`, short for
`ps` as the model parameters may be sharded across multiple machines.
Variables may be updated with synchronous or asynchronous gradient updates. One
may construct a an [`Optimizer`]
(https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
that constructs the necessary graph for either case diagrammed below from
TensorFlow [Whitepaper]
(http://download.tensorflow.org/paper/whitepaper2015.pdf):
may construct an [`Optimizer`](https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
that constructs the necessary graph for either case diagrammed below from the
TensorFlow [Whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf):
<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
......@@ -383,10 +375,8 @@ training Inception in a distributed manner.
Evaluating an Inception v3 model on the ImageNet 2012 validation data set
requires running a separate binary.
The evaluation procedure is nearly identical to [Evaluating a Model]
(https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating-a-model)
described in the [Convolutional Neural Network]
(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
The evaluation procedure is nearly identical to [Evaluating a Model](https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating_a_model)
described in the [Convolutional Neural Network](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
**WARNING** Be careful not to run the evaluation and training binary on the same
GPU or else you might run out of memory. Consider running the evaluation on a
......@@ -441,17 +431,16 @@ daisy, dandelion, roses, sunflowers, tulips
There is a single automated script that downloads the data set and converts it
to the TFRecord format. Much like the ImageNet data set, each record in the
TFRecord format is a serialized `tf.Example` proto whose entries include a
JPEG-encoded string and an integer label. Please see [`parse_example_proto`]
(inception/image_processing.py) for details.
JPEG-encoded string and an integer label. Please see [`parse_example_proto`](inception/image_processing.py) for details.
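To make the record layout concrete, here is a hedged sketch of decoding one such record; the field names `image/encoded` and `image/class/label` are assumptions based on the proto description in this diff, and the repo's actual logic lives in `parse_example_proto`:
```python
import tensorflow as tf

def parse_record(serialized_example):
    # Field names are assumed from the Example layout described above.
    features = tf.parse_single_example(
        serialized_example,
        features={
            'image/encoded': tf.FixedLenFeature([], tf.string),
            'image/class/label': tf.FixedLenFeature([], tf.int64),
        })
    image = tf.image.decode_jpeg(features['image/encoded'], channels=3)
    return image, features['image/class/label']
```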
The script just takes a few minutes to run, depending on your network connection
speed for downloading and processing the images. Your hard disk requires 200MB
of free storage. Here we select `DATA_DIR=$HOME/flowers-data` as such a location
of free storage. Here we select `DATA_DIR=/tmp/flowers-data/` as such a location
but feel free to edit accordingly.
```shell
# location of where to place the flowers data
FLOWERS_DATA_DIR=$HOME/flowers-data
FLOWERS_DATA_DIR=/tmp/flowers-data/
# build the preprocessing script.
bazel build inception/download_and_preprocess_flowers
......@@ -474,20 +463,19 @@ and `validation-?????-of-00002`, respectively.
**NOTE** If you wish to prepare a custom image data set for transfer learning,
you will need to invoke [`build_image_data.py`](inception/data/build_image_data.py) on
your custom data set. Please see the associated options and assumptions behind
this script by reading the comments section of [`build_image_data.py`]
(inception/data/build_image_data.py). Also, if your custom data has a different
this script by reading the comments section of [`build_image_data.py`](inception/data/build_image_data.py). Also, if your custom data has a different
number of examples or classes, you need to change the appropriate values in
[`imagenet_data.py`](inception/imagenet_data.py).
The second piece you will need is a trained Inception v3 image model. You have
the option of either training one yourself (See [How to Train from Scratch]
(#how-to-train-from-scratch) for details) or you can download a pre-trained
the option of either training one yourself (see [How to Train from Scratch](#how-to-train-from-scratch) for details) or downloading a pre-trained
model like so:
```shell
# location of where to place the Inception v3 model
DATA_DIR=$HOME/inception-v3-model
cd ${DATA_DIR}
INCEPTION_MODEL_DIR=$HOME/inception-v3-model
mkdir -p ${INCEPTION_MODEL_DIR}
cd ${INCEPTION_MODEL_DIR}
# download the Inception v3 model
curl -O http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz
......@@ -538,7 +526,7 @@ the flowers data set with the following command.
bazel build inception/flowers_train
# Path to the downloaded Inception-v3 model.
MODEL_PATH="${INCEPTION_MODEL_DIR}/model.ckpt-157585"
MODEL_PATH="${INCEPTION_MODEL_DIR}/inception-v3/model.ckpt-157585"
# Directory where the flowers data resides.
FLOWERS_DATA_DIR=/tmp/flowers-data/
......@@ -808,8 +796,7 @@ comments in [`image_processing.py`](inception/image_processing.py) for more deta
#### The model runs out of CPU memory.
In lieu of buying more CPU memory, an easy fix is to decrease
`--input_queue_memory_factor`. See [Adjusting Memory Demands]
(#adjusting-memory-demands).
`--input_queue_memory_factor`. See [Adjusting Memory Demands](#adjusting-memory-demands).
#### The model runs out of GPU memory.
......
......@@ -32,7 +32,7 @@ a sharded data set consisting of TFRecord files
train_directory/train-00000-of-01024
train_directory/train-00001-of-01024
...
train_directory/train-00127-of-01024
train_directory/train-01023-of-01024
and
......@@ -50,7 +50,7 @@ contains the following fields:
image/width: integer, image width in pixels
image/colorspace: string, specifying the colorspace, always 'RGB'
image/channels: integer, specifying the number of channels, always 3
image/format: string, specifying the format, always'JPEG'
image/format: string, specifying the format, always 'JPEG'
image/filename: string containing the basename of the image file
e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
......@@ -60,7 +60,7 @@ contains the following fields:
image/class/text: string specifying the human-readable version of the label
e.g. 'dog'
If you data set involves bounding boxes, please look at build_imagenet_data.py.
If your data set involves bounding boxes, please look at build_imagenet_data.py.
"""
from __future__ import absolute_import
from __future__ import division
......@@ -72,7 +72,6 @@ import random
import sys
import threading
import numpy as np
import tensorflow as tf
......@@ -199,7 +198,7 @@ def _process_image(filename, coder):
width: integer, image width in pixels.
"""
# Read the image file.
with tf.gfile.FastGFile(filename, 'r') as f:
with tf.gfile.FastGFile(filename, 'rb') as f:
image_data = f.read()
# Convert any PNGs to JPEGs for consistency.
......@@ -306,7 +305,7 @@ def _process_image_files(name, filenames, texts, labels, num_shards):
spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(np.int)
ranges = []
for i in range(len(spacing) - 1):
ranges.append([spacing[i], spacing[i+1]])
ranges.append([spacing[i], spacing[i + 1]])
# Launch a thread for each batch.
print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
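For example, a worked illustration of the sharding arithmetic above, with 8 filenames split across 4 threads:
```python
import numpy as np

spacing = np.linspace(0, 8, 4 + 1).astype(np.int)  # array([0, 2, 4, 6, 8])
ranges = [[spacing[i], spacing[i + 1]] for i in range(len(spacing) - 1)]
# ranges == [[0, 2], [2, 4], [4, 6], [6, 8]]; thread i processes
# filenames[ranges[i][0]:ranges[i][1]].
```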
......
......@@ -36,7 +36,7 @@ a sharded data set consisting of 1024 and 128 TFRecord files, respectively.
train_directory/train-00000-of-01024
train_directory/train-00001-of-01024
...
train_directory/train-00127-of-01024
train_directory/train-01023-of-01024
and
......@@ -54,7 +54,7 @@ serialized Example proto. The Example proto contains the following fields:
image/width: integer, image width in pixels
image/colorspace: string, specifying the colorspace, always 'RGB'
image/channels: integer, specifying the number of channels, always 3
image/format: string, specifying the format, always'JPEG'
image/format: string, specifying the format, always 'JPEG'
image/filename: string containing the basename of the image file
e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
......@@ -80,7 +80,7 @@ serialized Example proto. The Example proto contains the following fields:
Note that the length of xmin is identical to the length of xmax, ymin and ymax
for each example.
Running this script using 16 threads may take around ~2.5 hours on a HP Z420.
Running this script using 16 threads may take around ~2.5 hours on an HP Z420.
"""
from __future__ import absolute_import
from __future__ import division
......@@ -92,7 +92,6 @@ import random
import sys
import threading
import numpy as np
import tensorflow as tf
......@@ -435,7 +434,7 @@ def _process_image_files(name, filenames, synsets, labels, humans,
ranges = []
threads = []
for i in range(len(spacing) - 1):
ranges.append([spacing[i], spacing[i+1]])
ranges.append([spacing[i], spacing[i + 1]])
# Launch a thread for each batch.
print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
......
......@@ -35,37 +35,38 @@
set -e
if [ -z "$1" ]; then
echo "usage download_and_preprocess_flowers.sh [data dir]"
echo "Usage: download_and_preprocess_flowers.sh [data dir]"
exit
fi
# Create the output and temporary directories.
DATA_DIR="${1%/}"
SCRATCH_DIR="${DATA_DIR}/raw-data/"
SCRATCH_DIR="${DATA_DIR}/raw-data"
mkdir -p "${DATA_DIR}"
mkdir -p "${SCRATCH_DIR}"
WORK_DIR="$0.runfiles/inception/inception"
# http://stackoverflow.com/questions/59895/getting-the-source-directory-of-a-bash-script-from-within
WORK_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# Download the flowers data.
DATA_URL="http://download.tensorflow.org/example_images/flower_photos.tgz"
CURRENT_DIR=$(pwd)
cd "${DATA_DIR}"
TARBALL="flower_photos.tgz"
if [ ! -f ${TARBALL} ]; then
echo "Downloading flower data set."
wget -O ${TARBALL} "${DATA_URL}"
curl -o ${DATA_DIR}/${TARBALL} "${DATA_URL}"
else
echo "Skipping download of flower data."
fi
# Note the locations of the train and validation data.
TRAIN_DIRECTORY="${SCRATCH_DIR}train/"
VALIDATION_DIRECTORY="${SCRATCH_DIR}validation/"
TRAIN_DIRECTORY="${SCRATCH_DIR}/train"
VALIDATION_DIRECTORY="${SCRATCH_DIR}/validation"
# Expand the data into the flower_photos/ directory and rename it to the
# train directory.
tar xf flower_photos.tgz
tar xf ${DATA_DIR}/flower_photos.tgz
rm -rf "${TRAIN_DIRECTORY}" "${VALIDATION_DIRECTORY}"
mkdir -p "${TRAIN_DIRECTORY}"
mv flower_photos "${TRAIN_DIRECTORY}"
# Generate a list of 5 labels: daisy, dandelion, roses, sunflowers, tulips
......@@ -74,22 +75,22 @@ ls -1 "${TRAIN_DIRECTORY}" | grep -v 'LICENSE' | sed 's/\///' | sort > "${LABELS
# Generate the validation data set.
while read LABEL; do
VALIDATION_DIR_FOR_LABEL="${VALIDATION_DIRECTORY}${LABEL}"
TRAIN_DIR_FOR_LABEL="${TRAIN_DIRECTORY}${LABEL}"
VALIDATION_DIR_FOR_LABEL="${VALIDATION_DIRECTORY}/${LABEL}"
TRAIN_DIR_FOR_LABEL="${TRAIN_DIRECTORY}/${LABEL}"
# Move the first randomly selected 100 images to the validation set.
mkdir -p "${VALIDATION_DIR_FOR_LABEL}"
VALIDATION_IMAGES=$(ls -1 "${TRAIN_DIR_FOR_LABEL}" | shuf | head -100)
for IMAGE in ${VALIDATION_IMAGES}; do
mv -f "${TRAIN_DIRECTORY}${LABEL}/${IMAGE}" "${VALIDATION_DIR_FOR_LABEL}"
mv -f "${TRAIN_DIRECTORY}/${LABEL}/${IMAGE}" "${VALIDATION_DIR_FOR_LABEL}"
done
done < "${LABELS_FILE}"
# Build the TFRecords version of the image data.
cd "${CURRENT_DIR}"
BUILD_SCRIPT="${WORK_DIR}/build_image_data"
BUILD_SCRIPT="${WORK_DIR}/build_image_data.py"
OUTPUT_DIRECTORY="${DATA_DIR}"
"${BUILD_SCRIPT}" \
python "${BUILD_SCRIPT}" \
--train_directory="${TRAIN_DIRECTORY}" \
--validation_directory="${VALIDATION_DIRECTORY}" \
--output_directory="${OUTPUT_DIRECTORY}" \
......
......@@ -35,7 +35,7 @@
set -e
if [ -z "$1" ]; then
echo "usage download_and_preprocess_flowers.sh [data dir]"
echo "Usage: download_and_preprocess_flowers.sh [data dir]"
exit
fi
......@@ -53,7 +53,7 @@ cd "${DATA_DIR}"
TARBALL="flower_photos.tgz"
if [ ! -f ${TARBALL} ]; then
echo "Downloading flower data set."
wget -O ${TARBALL} "${DATA_URL}"
curl -o ${TARBALL} "${DATA_URL}"
else
echo "Skipping download of flower data."
fi
......
......@@ -26,7 +26,7 @@
# data_dir/train-00000-of-01024
# data_dir/train-00001-of-01024
# ...
# data_dir/train-00127-of-01024
# data_dir/train-01023-of-01024
#
# and
#
......@@ -49,7 +49,7 @@
set -e
if [ -z "$1" ]; then
echo "usage download_and_preprocess_imagenet.sh [data dir]"
echo "Usage: download_and_preprocess_imagenet.sh [data dir]"
exit
fi
......@@ -84,7 +84,7 @@ BOUNDING_BOX_FILE="${SCRATCH_DIR}/imagenet_2012_bounding_boxes.csv"
BOUNDING_BOX_DIR="${SCRATCH_DIR}bounding_boxes/"
"${BOUNDING_BOX_SCRIPT}" "${BOUNDING_BOX_DIR}" "${LABELS_FILE}" \
| sort >"${BOUNDING_BOX_FILE}"
| sort > "${BOUNDING_BOX_FILE}"
echo "Finished downloading and preprocessing the ImageNet data."
# Build the TFRecords version of the ImageNet data.
......
......@@ -24,7 +24,7 @@
# downloading the raw images.
#
# usage:
# ./download_imagenet.sh [dirname]
# ./download_imagenet.sh [dir name] [synsets file]
set -e
if [ "x$IMAGENET_ACCESS_KEY" == x -o "x$IMAGENET_USERNAME" == x ]; then
......
......@@ -52,11 +52,11 @@ tf.app.flags.DEFINE_boolean('log_device_placement', False,
'Whether to log device placement.')
# Task ID is used to select the chief and also to access the local_step for
# each replica to check staleness of the gradients in sync_replicas_optimizer.
# each replica to check staleness of the gradients in SyncReplicasOptimizer.
tf.app.flags.DEFINE_integer(
'task_id', 0, 'Task ID of the worker/replica running the training.')
# More details can be found in the sync_replicas_optimizer class:
# More details can be found in the SyncReplicasOptimizer class:
# tensorflow/python/training/sync_replicas_optimizer.py
tf.app.flags.DEFINE_integer('num_replicas_to_aggregate', -1,
"""Number of gradients to collect before """
......@@ -197,7 +197,6 @@ def train(target, dataset, cluster_spec):
opt = tf.train.SyncReplicasOptimizer(
opt,
replicas_to_aggregate=num_replicas_to_aggregate,
replica_id=FLAGS.task_id,
total_num_replicas=num_workers,
variable_averages=exp_moving_averager,
variables_to_average=variables_to_average)
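For reference, a minimal standalone sketch of this wrapper (the hyperparameter values here are placeholders, not the repo's settings): `SyncReplicasOptimizer` collects gradients from the worker replicas and applies one aggregated update.
```python
import tensorflow as tf

base_opt = tf.train.RMSPropOptimizer(learning_rate=0.045)  # placeholder values
sync_opt = tf.train.SyncReplicasOptimizer(
    base_opt,
    replicas_to_aggregate=4,  # gradients to collect before each update
    total_num_replicas=4)     # placeholder worker count
```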
......@@ -222,12 +221,10 @@ def train(target, dataset, cluster_spec):
with tf.control_dependencies([apply_gradients_op]):
train_op = tf.identity(total_loss, name='train_op')
# Get chief queue_runners, init_tokens and clean_up_op, which is used to
# synchronize replicas.
# More details can be found in sync_replicas_optimizer.
# Get chief queue_runners and init_tokens, which are used to synchronize
# replicas. More details can be found in SyncReplicasOptimizer.
chief_queue_runners = [opt.get_chief_queue_runner()]
init_tokens_op = opt.get_init_tokens_op()
clean_up_op = opt.get_clean_up_op()
# Create a saver.
saver = tf.train.Saver()
......@@ -301,8 +298,7 @@ def train(target, dataset, cluster_spec):
next_summary_time += FLAGS.save_summaries_secs
except:
if is_chief:
tf.logging.info('About to execute sync_clean_up_op!')
sess.run(clean_up_op)
tf.logging.info('Chief got exception while running!')
raise
# Stop the supervisor. This also waits for service threads to finish.
......
......@@ -41,23 +41,9 @@ prerequisite packages.
## Installing latest version of TF-slim
As of 8/28/16, the latest [stable release of TF](https://www.tensorflow.org/versions/r0.10/get_started/os_setup.html#pip-installation)
is r0.10, which contains most of TF-Slim but not some later additions. To obtain the
latest version, you must install the most recent nightly build of
TensorFlow. You can find the latest nightly binaries at
[TensorFlow Installation](https://github.com/tensorflow/tensorflow#installation)
in the section that reads "People who are a little more adventurous can
also try our nightly binaries". Copy the link address that corresponds to
the appropriate machine architecture and python version, and pip install
it. For example:
```shell
export TF_BINARY_URL=https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.10.0rc0-cp27-none-linux_x86_64.whl
sudo pip install --upgrade $TF_BINARY_URL
```
To test this has worked, execute the following command; it should run
without raising any errors.
TF-Slim is available as `tf.contrib.slim` via TensorFlow 1.0. To test that your
installation is working, execute the following command; it should run without
raising any errors.
```
python -c "import tensorflow.contrib.slim as slim; eval = slim.evaluation.evaluate_once"
......@@ -140,7 +126,7 @@ You can use the same script to create the mnist and cifar10 datasets.
However, for ImageNet, you have to follow the instructions
[here](https://github.com/tensorflow/models/blob/master/inception/README.md#getting-started).
Note that you first have to sign up for an account at image-net.org.
Also, the download can take several hours, and uses about 500MB.
Also, the download can take several hours, and could use up to 500GB.
## Creating a TF-Slim Dataset Descriptor.
......
......@@ -103,12 +103,21 @@ following commands:
bazel test --linkopt=-headerpad_max_install_names \
dragnn/... syntaxnet/... util/utf8/...
```
Bazel should finish by reporting that all tests passed.
Now you can install the SyntaxNet and DRAGNN Python modules with the following commands:
```shell
mkdir /tmp/syntaxnet_pkg
bazel-bin/dragnn/tools/build_pip_package --output-dir=/tmp/syntaxnet_pkg
# The filename of the .whl depends on your platform.
sudo pip install /tmp/syntaxnet_pkg/syntaxnet-x.xx-none-any.whl
```
To build SyntaxNet with GPU support please refer to the instructions in
[issues/248](https://github.com/tensorflow/models/issues/248).
**Note:** If you are running Docker on OSX, make sure that you have enough
memory allocated for your Docker VM.
......
......@@ -70,9 +70,7 @@ vocabulary size: Most frequent 200k words from dataset's article and summaries.
<b>How To Run</b>
Pre-requesite:
Install TensorFlow and Bazel.
Prerequisite: install TensorFlow and Bazel.
```shell
# cd to your workspace
......@@ -83,7 +81,7 @@ Install TensorFlow and Bazel.
# If your data files have different names, update the --data_path.
# If you don't have data but want to try out the model, copy the toy
# data from the textsum/data/data to the data/ directory in the workspace.
ls -R
$ ls -R
.:
data textsum WORKSPACE
......@@ -97,38 +95,38 @@ data.py seq2seq_attention_decode.py seq2seq_attention.py seq2seq_lib.py
./textsum/data:
data vocab
bazel build -c opt --config=cuda textsum/...
$ bazel build -c opt --config=cuda textsum/...
# Run the training.
bazel-bin/textsum/seq2seq_attention \
--mode=train \
--article_key=article \
--abstract_key=abstract \
--data_path=data/training-* \
--vocab_path=data/vocab \
--log_root=textsum/log_root \
--train_dir=textsum/log_root/train
$ bazel-bin/textsum/seq2seq_attention \
--mode=train \
--article_key=article \
--abstract_key=abstract \
--data_path=data/training-* \
--vocab_path=data/vocab \
--log_root=textsum/log_root \
--train_dir=textsum/log_root/train
# Run the eval. Try to avoid running on the same machine as training.
bazel-bin/textsum/seq2seq_attention \
--mode=eval \
--article_key=article \
--abstract_key=abstract \
--data_path=data/validation-* \
--vocab_path=data/vocab \
--log_root=textsum/log_root \
--eval_dir=textsum/log_root/eval
$ bazel-bin/textsum/seq2seq_attention \
--mode=eval \
--article_key=article \
--abstract_key=abstract \
--data_path=data/validation-* \
--vocab_path=data/vocab \
--log_root=textsum/log_root \
--eval_dir=textsum/log_root/eval
# Run the decode. Run it when the model is mostly converged.
bazel-bin/textsum/seq2seq_attention \
--mode=decode \
--article_key=article \
--abstract_key=abstract \
--data_path=data/test-* \
--vocab_path=data/vocab \
--log_root=textsum/log_root \
--decode_dir=textsum/log_root/decode \
--beam_size=8
$ bazel-bin/textsum/seq2seq_attention \
--mode=decode \
--article_key=article \
--abstract_key=abstract \
--data_path=data/test-* \
--vocab_path=data/vocab \
--log_root=textsum/log_root \
--decode_dir=textsum/log_root/decode \
--beam_size=8
```
......@@ -157,7 +155,7 @@ article: the european court of justice ( ecj ) recently ruled in lock v british
abstract: will british gas ecj ruling fuel holiday pay hike ?
decode: eu law requires worker 's statutory holiday pay
decode: eu law requires worker 's statutory holiday pay
======================================
......
......@@ -16,12 +16,13 @@
"""Batch reader to seq2seq attention model, with bucketing support."""
from collections import namedtuple
import Queue
from random import shuffle
from threading import Thread
import time
import numpy as np
from six.moves import queue as Queue
from six.moves import xrange
import tensorflow as tf
import data
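The import change above swaps the Python 2-only `Queue` module for the `six.moves` alias, so the same name resolves under both Python 2 and 3; a minimal sketch of the pattern:
```python
from six.moves import queue as Queue  # Py2 `Queue` module, Py3 `queue` module

q = Queue.Queue(maxsize=2)
q.put('bucketed batch')
assert q.get() == 'bucketed batch'
```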
......
......@@ -21,10 +21,11 @@ K*K results, and start over again until certain number of results are fully
decoded.
"""
from six.moves import xrange
import tensorflow as tf
FLAGS = tf.flags.FLAGS
tf.flags.DEFINE_bool('normalize_by_length', True, 'Whether normalize')
tf.flags.DEFINE_bool('normalize_by_length', True, 'Whether to normalize')
class Hypothesis(object):
......
......@@ -18,9 +18,10 @@
import os
import time
import tensorflow as tf
import beam_search
import data
from six.moves import xrange
import tensorflow as tf
FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_integer('max_decode_steps', 1000000,
......
......@@ -18,9 +18,9 @@
from collections import namedtuple
import numpy as np
import tensorflow as tf
import seq2seq_lib
from six.moves import xrange
import tensorflow as tf
HParams = namedtuple('HParams',
'mode, min_lr, lr, batch_size, '
......
......@@ -56,6 +56,7 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import inspect
import time
import numpy as np
......@@ -109,8 +110,18 @@ class PTBModel(object):
# initialized to 1 but the hyperparameters of the model would need to be
# different than reported in the paper.
def lstm_cell():
return tf.contrib.rnn.BasicLSTMCell(
size, forget_bias=0.0, state_is_tuple=True)
# With the latest TensorFlow source code (as of Mar 27, 2017),
# the BasicLSTMCell will need a reuse parameter which is unfortunately not
# defined in TensorFlow 1.0. To maintain backwards compatibility, we add
# an argument check here:
if 'reuse' in inspect.getargspec(
tf.contrib.rnn.BasicLSTMCell.__init__).args:
return tf.contrib.rnn.BasicLSTMCell(
size, forget_bias=0.0, state_is_tuple=True,
reuse=tf.get_variable_scope().reuse)
else:
return tf.contrib.rnn.BasicLSTMCell(
size, forget_bias=0.0, state_is_tuple=True)
attn_cell = lstm_cell
if is_training and config.keep_prob < 1:
def attn_cell():
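(The hunk above is truncated here; as context, a hedged sketch of the dropout-wrapping pattern that the `attn_cell` branch applies when `is_training and config.keep_prob < 1` — the body below is illustrative, not the repo's exact code:)
```python
import tensorflow as tf

def dropout_lstm_cell(size, keep_prob):
    # Illustrative: wrap a basic LSTM cell with output dropout.
    cell = tf.contrib.rnn.BasicLSTMCell(
        size, forget_bias=0.0, state_is_tuple=True)
    return tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)
```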
......@@ -136,8 +147,8 @@ class PTBModel(object):
# The alternative version of the code below is:
#
# inputs = tf.unstack(inputs, num=num_steps, axis=1)
# outputs, state = tf.nn.rnn(cell, inputs,
# initial_state=self._initial_state)
# outputs, state = tf.contrib.rnn.static_rnn(
# cell, inputs, initial_state=self._initial_state)
outputs = []
state = self._initial_state
with tf.variable_scope("RNN"):
......