docs/ml-guide.md (68 additions & 1 deletion)
@@ -951,4 +951,71 @@ model.transform(test)
{% endhighlight %}
</div>

</div>
</div>
Member: What's the difference here? I'm confused why this diff is showing up (but I wanted to make sure it was not a weird character or something in the line).


## Example: Saving and Loading a Previously Created Model Pipeline

Oftentimes it is worth it to save a model to disk for usage later. In Spark 1.6, similar model import/export functionality was added to the Pipeline API. Most basic transformers are supported, as well as some of the more basic ML Models, such as:
Contributor:
"for later use" instead of "for usage later" sounds better to me.
"In Spark 1.6, a model import/export" instead of "similar".
"ML models".


* K-Means
* Naive Bayes
* ALS
* Linear Regression
* Logistic Regression
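
These models can also be saved and loaded individually, outside of a pipeline. A minimal sketch (the DataFrame `df` with `label` and `features` columns is a hypothetical stand-in):

{% highlight scala %}
import org.apache.spark.ml.classification.{LogisticRegression, LogisticRegressionModel}

// Fit a single model and persist it; `df` is an assumed DataFrame
// with "label" and "features" columns.
val lrOnly = new LogisticRegression().fit(df)
lrOnly.save("/tmp/lr-only-model")

// Load it back via the companion object's load method.
val restoredLr = LogisticRegressionModel.load("/tmp/lr-only-model")
{% endhighlight %}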

Below is an example of how a pipeline can be persisted and loaded. This example uses the same pipeline model that was trained above.
<div data-lang="scala" markdown="1">
{% highlight scala %}
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.sql.Row

// Prepare training documents from a list of (id, text, label) tuples.
val training = sqlContext.createDataFrame(Seq(
  (0L, "a b c d e spark", 1.0),
  (1L, "b d", 0.0),
  (2L, "spark f g h", 1.0),
  (3L, "hadoop mapreduce", 0.0)
)).toDF("id", "text", "label")

// Configure an ML pipeline, which consists of three stages: tokenizer, hashingTF, and lr.
val tokenizer = new Tokenizer()
  .setInputCol("text")
  .setOutputCol("words")
val hashingTF = new HashingTF()
  .setNumFeatures(1000)
  .setInputCol(tokenizer.getOutputCol)
  .setOutputCol("features")
val lr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.01)

val pipeline = new Pipeline()
  .setStages(Array(tokenizer, hashingTF, lr))

// Fit the pipeline to the training documents, then save the fitted model to disk.
val model = pipeline.fit(training)
model.save("/tmp/spark-logistic-regression-model")

// Load the fitted pipeline back in. Note that a fitted pipeline is loaded with
// PipelineModel.load; Pipeline.load would restore an unfit Pipeline instead.
val loadedModel = PipelineModel.load("/tmp/spark-logistic-regression-model")
// Equivalently: PipelineModel.read.load("/tmp/spark-logistic-regression-model")

// Prepare test documents, which are unlabeled (id, text) tuples.
val test = sqlContext.createDataFrame(Seq(
  (4L, "spark i j k"),
  (5L, "l m n"),
  (6L, "mapreduce spark"),
  (7L, "apache hadoop")
)).toDF("id", "text")

// Make predictions on test documents.
loadedModel.transform(test)
  .select("id", "text", "probability", "prediction")
  .collect()
  .foreach { case Row(id: Long, text: String, prob: Vector, prediction: Double) =>
    println(s"($id, $text) --> prob=$prob, prediction=$prediction")
  }
{% endhighlight %}
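
One practical note: `save` will fail if the target path already exists. A minimal sketch of working around this with the writer API, where `save(path)` is shorthand for `write.save(path)`:

{% highlight scala %}
// Overwrite an existing path instead of failing.
model.write.overwrite().save("/tmp/spark-logistic-regression-model")
{% endhighlight %}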
</div>