Improve README.
PiperOrigin-RevId: 187375987
Lukasz Kaiser authored and Ryan Sepassi committed Mar 2, 2018
commit 2df7a71e69ac4e43b01e009ea0b814a7619fb06c
README.md: 11 changes (6 additions & 5 deletions)
@@ -12,10 +12,10 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
 
 [Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or
 [T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library
-of deep learning models and datasets designed to [accelerate deep learning
-research](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html) and make it more accessible.
-
-T2T is actively used and maintained by researchers and engineers within the
+of deep learning models and datasets designed to make deep learning more
+accessible and [accelerate ML
+research](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).
+is actively used and maintained by researchers and engineers within the
 [Google Brain team](https://research.google.com/teams/brain/) and a community
 of users. We're eager to collaborate with you too, so feel free to
 [open an issue on GitHub](https://github.com/tensorflow/tensor2tensor/issues)
@@ -368,6 +368,7 @@ T2T](https://research.googleblog.com/2017/06/accelerating-deep-learning-research
 * [Discrete Autoencoders for Sequence Models](https://arxiv.org/abs/1801.09797)
 * [Generating Wikipedia by Summarizing Long
   Sequences](https://arxiv.org/abs/1801.10198)
-* [Image Transformer](https://openreview.net/forum?id=r16Vyf-0-)
+* [Image Transformer](https://arxiv.org/abs/1802.05751)
+* [Training Tips for the Transformer Model](http://ufallab.ms.mff.cuni.cz/~popel/training-tips-transformer.pdf)
 
 *Note: This is not an official Google product.*
docs/index.md: 5 changes (3 additions & 2 deletions)
@@ -11,8 +11,9 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
 
 [Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or
 [T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library
-of deep learning models and datasets designed to [accelerate deep learning
-research](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html) and make it more accessible.
+of deep learning models and datasets designed to make deep learning more
+accessible and [accelerate ML
+research](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).
 
 
 ## Basics
docs/new_problem.md: 4 changes (4 additions & 0 deletions)
@@ -9,6 +9,10 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
 [![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/tensor2tensor/Lobby)
 [![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg)](https://opensource.org/licenses/Apache-2.0)
 
+Another good overview of this part together with training is given in
+[The Cloud ML Poetry Blog
+Post](https://cloud.google.com/blog/big-data/2018/02/cloud-poetry-training-and-hyperparameter-tuning-custom-text-models-on-cloud-ml-engine)
+
 Let's add a new dataset together and train the
 [Transformer](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/transformer.py)
 model on it. We'll give the model a line of poetry, and it will learn to
docs/walkthrough.md: 11 changes (6 additions & 5 deletions)
@@ -12,10 +12,10 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
 
 [Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or
 [T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library
-of deep learning models and datasets designed to [accelerate deep learning
-research](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html) and make it more accessible.
-
-T2T is actively used and maintained by researchers and engineers within the
+of deep learning models and datasets designed to make deep learning more
+accessible and [accelerate ML
+research](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).
+is actively used and maintained by researchers and engineers within the
 [Google Brain team](https://research.google.com/teams/brain/) and a community
 of users. We're eager to collaborate with you too, so feel free to
 [open an issue on GitHub](https://github.com/tensorflow/tensor2tensor/issues)
@@ -368,6 +368,7 @@ T2T](https://research.googleblog.com/2017/06/accelerating-deep-learning-research
 * [Discrete Autoencoders for Sequence Models](https://arxiv.org/abs/1801.09797)
 * [Generating Wikipedia by Summarizing Long
   Sequences](https://arxiv.org/abs/1801.10198)
-* [Image Transformer](https://openreview.net/forum?id=r16Vyf-0-)
+* [Image Transformer](https://arxiv.org/abs/1802.05751)
+* [Training Tips for the Transformer Model](http://ufallab.ms.mff.cuni.cz/~popel/training-tips-transformer.pdf)
 
 *Note: This is not an official Google product.*
tensor2tensor/data_generators/generator_utils.py: 4 changes (2 additions & 2 deletions)
@@ -493,14 +493,14 @@ def __init__(self, first_sequence, spacing=2):
     self._spacing = spacing
     self._ids = first_sequence[:]
     self._segmentation = [1] * len(first_sequence)
-    self._position = range(len(first_sequence))
+    self._position = list(range(len(first_sequence)))
 
   def add(self, ids):
     padding = [0] * self._spacing
     self._ids.extend(padding + ids)
     next_segment_num = self._segmentation[-1] + 1 if self._segmentation else 1
     self._segmentation.extend(padding + [next_segment_num] * len(ids))
-    self._position.extend(padding + range(len(ids)))
+    self._position.extend(padding + list(range(len(ids))))
 
   def can_fit(self, ids, packed_length):
     return len(self._ids) + self._spacing + len(ids) <= packed_length
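The two `list(range(...))` changes above are Python 3 compatibility fixes: under Python 3, `range` returns a lazy range object that cannot be concatenated to a list with `+`. A minimal sketch of the failure mode and the fix:

```python
padding = [0] * 2

# Python 2: range() returns a list, so `padding + range(3)` works.
# Python 3: range() returns a range object, and the same expression raises
#   TypeError: can only concatenate list (not "range") to list

# Wrapping in list() behaves identically under both versions:
positions = padding + list(range(3))
print(positions)  # [0, 0, 0, 1, 2]
```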
tensor2tensor/data_generators/problem.py: 17 changes (17 additions & 0 deletions)
Expand Up @@ -421,15 +421,32 @@ def get_hparams(self, model_hparams=None):
return self._hparams

def maybe_reverse_features(self, feature_map):
"""Reverse features between inputs and targets if the problem is '_rev'."""
if not self._was_reversed:
return
inputs, targets = feature_map["inputs"], feature_map["targets"]
feature_map["inputs"], feature_map["targets"] = targets, inputs
if "inputs_segmentation" in feature_map:
inputs_seg = feature_map["inputs_segmentation"]
targets_seg = feature_map["targets_segmentation"]
feature_map["inputs_segmentation"] = targets_seg
feature_map["targets_segmentation"] = inputs_seg
if "inputs_position" in feature_map:
inputs_pos = feature_map["inputs_position"]
targets_pos = feature_map["targets_position"]
feature_map["inputs_position"] = targets_pos
feature_map["targets_position"] = inputs_pos

def maybe_copy_features(self, feature_map):
if not self._was_copy:
return
feature_map["targets"] = feature_map["inputs"]
if ("inputs_segmentation" in feature_map and
"targets_segmentation" not in feature_map):
feature_map["targets_segmentation"] = feature_map["inputs_segmentation"]
if ("inputs_position" in feature_map and
"targets_position" not in feature_map):
feature_map["targets_position"] = feature_map["inputs_position"]

def dataset(self,
mode,
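To see what the new bookkeeping buys, here is a minimal standalone sketch (a toy dict of lists, not the actual `Problem` class) of the `'_rev'` swap; the point is that the packed-example features `inputs_segmentation`/`inputs_position` must travel with the tokens they annotate:

```python
def reverse_features(feature_map):
  """Standalone sketch of maybe_reverse_features for a '_rev' problem."""
  # Swap the token streams.
  feature_map["inputs"], feature_map["targets"] = (
      feature_map["targets"], feature_map["inputs"])
  # For packed examples, swap the bookkeeping features too; otherwise the
  # segmentation/position features no longer match the tokens they describe.
  for key in ("segmentation", "position"):
    in_key, tgt_key = "inputs_" + key, "targets_" + key
    if in_key in feature_map:
      feature_map[in_key], feature_map[tgt_key] = (
          feature_map[tgt_key], feature_map[in_key])

# Two packed sentence pairs: segmentation numbers the pairs, position restarts
# at 0 for each sentence, and 0 entries mark the spacing padding.
features = {
    "inputs": [3, 4, 0, 5, 6, 7],
    "targets": [8, 0, 9, 10],
    "inputs_segmentation": [1, 1, 0, 2, 2, 2],
    "targets_segmentation": [1, 0, 2, 2],
    "inputs_position": [0, 1, 0, 0, 1, 2],
    "targets_position": [0, 0, 0, 1],
}
reverse_features(features)
print(features["inputs"])           # [8, 0, 9, 10]
print(features["inputs_position"])  # [0, 0, 0, 1]
```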
tensor2tensor/data_generators/translate_enmk.py: 22 changes (18 additions & 4 deletions)
@@ -23,13 +23,10 @@
 
 from tensor2tensor.data_generators import problem
 from tensor2tensor.data_generators import text_encoder
+from tensor2tensor.data_generators import text_problems
 from tensor2tensor.data_generators import translate
 from tensor2tensor.utils import registry
 
-import tensorflow as tf
-
-FLAGS = tf.flags.FLAGS
-
 # End-of-sentence marker.
 EOS = text_encoder.EOS_ID
 
@@ -49,6 +46,10 @@
 ]]
 
 
+# See this PR on github for some results with Transformer on these Problems.
+# https://github.com/tensorflow/tensor2tensor/pull/626
+
+
 @registry.register_problem
 class TranslateEnmkSetimes32k(translate.TranslateProblem):
   """Problem spec for SETimes En-Mk translation."""
@@ -64,3 +65,16 @@ def vocab_filename(self):
   def source_data_files(self, dataset_split):
     train = dataset_split == problem.DatasetSplit.TRAIN
     return _ENMK_TRAIN_DATASETS if train else _ENMK_TEST_DATASETS
+
+
+@registry.register_problem
+class TranslateEnmkSetimesCharacters(translate.TranslateProblem):
+  """Problem spec for SETimes En-Mk translation."""
+
+  @property
+  def vocab_type(self):
+    return text_problems.VocabType.CHARACTER
+
+  def source_data_files(self, dataset_split):
+    train = dataset_split == problem.DatasetSplit.TRAIN
+    return _ENMK_TRAIN_DATASETS if train else _ENMK_TEST_DATASETS
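The registry derives problem names from class names (CamelCase to snake_case), so, assuming the usual convention, the two specs above register as `translate_enmk_setimes32k` and `translate_enmk_setimes_characters`; the `_rev` suffix handled by `maybe_reverse_features` above should then give the Mk-to-En direction for free. A hedged sketch:

```python
# Importing the module triggers the @registry.register_problem decorators.
from tensor2tensor.data_generators import translate_enmk  # pylint: disable=unused-import
from tensor2tensor.utils import registry

# Subword vs. character-level variants of the same SETimes data
# (names assumed from the registry's CamelCase -> snake_case convention).
subword = registry.problem("translate_enmk_setimes32k")
chars = registry.problem("translate_enmk_setimes_characters")

# The "_rev" suffix yields the reversed (Mk -> En) problem.
reversed_problem = registry.problem("translate_enmk_setimes32k_rev")
```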
tensor2tensor/data_generators/translate_envi.py (new file): 65 changes (65 additions & 0 deletions)
@@ -0,0 +1,65 @@
+# coding=utf-8
+# Copyright 2018 The Tensor2Tensor Authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Data generators for En-Vi translation."""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+# Dependency imports
+
+from tensor2tensor.data_generators import problem
+from tensor2tensor.data_generators import text_encoder
+from tensor2tensor.data_generators import translate
+from tensor2tensor.utils import registry
+
+# End-of-sentence marker.
+EOS = text_encoder.EOS_ID
+
+# For English-Vietnamese the IWSLT'15 corpus
+# from https://nlp.stanford.edu/projects/nmt/ is used.
+# The original dataset has 133K parallel sentences.
+_ENVI_TRAIN_DATASETS = [[
+    "https://github.com/stefan-it/nmt-en-vi/raw/master/data/train-en-vi.tgz",  # pylint: disable=line-too-long
+    ("train.en", "train.vi")
+]]
+
+# For development 1,553 parallel sentences are used.
+_ENVI_TEST_DATASETS = [[
+    "https://github.com/stefan-it/nmt-en-vi/raw/master/data/dev-2012-en-vi.tgz",  # pylint: disable=line-too-long
+    ("tst2012.en", "tst2012.vi")
+]]
+
+
+# See this PR on github for some results with Transformer on this Problem.
+# https://github.com/tensorflow/tensor2tensor/pull/611
+
+
+@registry.register_problem
+class TranslateEnviIwslt32k(translate.TranslateProblem):
+  """Problem spec for IWSLT'15 En-Vi translation."""
+
+  @property
+  def approx_vocab_size(self):
+    return 2**15  # 32768
+
+  @property
+  def vocab_filename(self):
+    return "vocab.envi.%d" % self.approx_vocab_size
+
+  def source_data_files(self, dataset_split):
+    train = dataset_split == problem.DatasetSplit.TRAIN
+    return _ENVI_TRAIN_DATASETS if train else _ENVI_TEST_DATASETS
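With the new file in place, the problem should be usable like any other registered T2T problem. A hedged sketch of generating the data and inspecting the spec (the problem name is assumed from the registry's CamelCase-to-snake_case convention, and `$DATA_DIR`/`$TMP_DIR` are placeholder paths):

```python
# Data generation would typically go through the t2t-datagen CLI, e.g.:
#   t2t-datagen --problem=translate_envi_iwslt32k \
#       --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR

# Importing the module triggers @registry.register_problem.
from tensor2tensor.data_generators import translate_envi  # pylint: disable=unused-import
from tensor2tensor.utils import registry

envi = registry.problem("translate_envi_iwslt32k")
print(envi.approx_vocab_size)  # 32768
print(envi.vocab_filename)     # vocab.envi.32768
```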