---
layout: global
title: Multilayer perceptron classifier - ML
displayTitle: <a href="ml-guide.html">ML</a> - Multilayer perceptron classifier
---


`\[
\newcommand{\R}{\mathbb{R}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\x}{\mathbf{x}}
\newcommand{\y}{\mathbf{y}}
\newcommand{\wv}{\mathbf{w}}
\newcommand{\av}{\mathbf{\alpha}}
\newcommand{\bv}{\mathbf{b}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\id}{\mathbf{I}}
\newcommand{\ind}{\mathbf{1}}
\newcommand{\0}{\mathbf{0}}
\newcommand{\unit}{\mathbf{e}}
\newcommand{\one}{\mathbf{1}}
\newcommand{\zero}{\mathbf{0}}
\]`


Multilayer perceptron classifier (MLPC) is a classifier based on the [feedforward artificial neural network](https://en.wikipedia.org/wiki/Feedforward_neural_network).
MLPC consists of multiple layers of nodes.
Each layer is fully connected to the next layer in the network. Nodes in the input layer represent the input data. All other nodes map inputs to outputs
by taking a linear combination of the inputs with the node's weights `$\wv$` and bias `$\bv$` and applying an activation function.
For an MLPC with `$K+1$` layers this can be written in matrix form as:
`\[
\mathrm{y}(\x) = \mathrm{f_K}(...\mathrm{f_2}(\wv_2^T\mathrm{f_1}(\wv_1^T \x+b_1)+b_2)...+b_K)
\]`
Nodes in the intermediate layers use the sigmoid (logistic) function:
`\[
\mathrm{f}(z_i) = \frac{1}{1 + e^{-z_i}}
\]`
Nodes in the output layer use the softmax function:
`\[
\mathrm{f}(z_i) = \frac{e^{z_i}}{\sum_{k=1}^N e^{z_k}}
\]`
The number of nodes `$N$` in the output layer corresponds to the number of classes.

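To make the layer computations above concrete, here is a standalone sketch of the forward pass in plain Java. The class and method names are illustrative only and are not part of the Spark API: each layer computes `$\wv^T \x + \bv$`, then applies the sigmoid in intermediate layers and the softmax in the output layer.

```java
// Illustrative sketch of the MLPC forward pass (not the Spark implementation).
public class MlpForward {
  // Element-wise sigmoid: f(z_i) = 1 / (1 + e^{-z_i})
  static double[] sigmoid(double[] z) {
    double[] out = new double[z.length];
    for (int i = 0; i < z.length; i++) out[i] = 1.0 / (1.0 + Math.exp(-z[i]));
    return out;
  }

  // Softmax: f(z_i) = e^{z_i} / sum_k e^{z_k}, shifted by max(z) for stability
  static double[] softmax(double[] z) {
    double max = Double.NEGATIVE_INFINITY;
    for (double v : z) max = Math.max(max, v);
    double sum = 0.0;
    double[] out = new double[z.length];
    for (int i = 0; i < z.length; i++) { out[i] = Math.exp(z[i] - max); sum += out[i]; }
    for (int i = 0; i < z.length; i++) out[i] /= sum;
    return out;
  }

  // weights[k][j] holds the incoming weights of node j in layer k+1,
  // biases[k][j] its bias; the last layer uses softmax, the others sigmoid.
  static double[] forward(double[] x, double[][][] weights, double[][] biases) {
    double[] a = x;
    for (int k = 0; k < weights.length; k++) {
      double[] z = new double[weights[k].length];
      for (int j = 0; j < z.length; j++) {
        double s = biases[k][j];
        for (int i = 0; i < a.length; i++) s += weights[k][j][i] * a[i];
        z[j] = s;
      }
      a = (k == weights.length - 1) ? softmax(z) : sigmoid(z);
    }
    return a;
  }
}
```

Because the output layer is a softmax, the returned vector is a probability distribution over the `$N$` classes.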
MLPC employs backpropagation for learning the model. We use the logistic loss function for optimization and L-BFGS as the optimization routine.
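The logistic loss mentioned above can be sketched as follows; this is an illustrative definition (the class name is hypothetical, not a Spark API): for one example with true class index `label` and predicted class probabilities `probs`, the loss is the negative log-probability assigned to the true class.

```java
// Illustrative per-example logistic (cross-entropy) loss over softmax outputs.
public class LogisticLoss {
  static double loss(double[] probs, int label) {
    return -Math.log(probs[label]);
  }
}
```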

**Examples**

<div class="codetabs">

<div data-lang="scala" markdown="1">

{% highlight scala %}
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.Row

// Load the training data
val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_multiclass_classification_data.txt").toDF()
// Split the data into train and test sets
val splits = data.randomSplit(Array(0.6, 0.4), seed = 1234L)
val train = splits(0)
val test = splits(1)
// Specify the layers for the neural network:
// input layer of size 4 (features), two intermediate layers of size 5 and 4,
// and output layer of size 3 (classes)
val layers = Array[Int](4, 5, 4, 3)
// Create the trainer and set its parameters
val trainer = new MultilayerPerceptronClassifier()
  .setLayers(layers)
  .setBlockSize(128)
  .setSeed(1234L)
  .setMaxIter(100)
// Train the model
val model = trainer.fit(train)
// Compute precision on the test set
val result = model.transform(test)
val predictionAndLabels = result.select("prediction", "label")
val evaluator = new MulticlassClassificationEvaluator()
  .setMetricName("precision")
println("Precision: " + evaluator.evaluate(predictionAndLabels))
{% endhighlight %}

</div>

<div data-lang="java" markdown="1">

{% highlight java %}
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel;
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier;
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.util.MLUtils;
import org.apache.spark.sql.DataFrame;

// Load the training data
String path = "data/mllib/sample_multiclass_classification_data.txt";
JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(sc, path).toJavaRDD();
DataFrame dataFrame = sqlContext.createDataFrame(data, LabeledPoint.class);
// Split the data into train and test sets
DataFrame[] splits = dataFrame.randomSplit(new double[]{0.6, 0.4}, 1234L);
DataFrame train = splits[0];
DataFrame test = splits[1];
// Specify the layers for the neural network:
// input layer of size 4 (features), two intermediate layers of size 5 and 4,
// and output layer of size 3 (classes)
int[] layers = new int[] {4, 5, 4, 3};
// Create the trainer and set its parameters
MultilayerPerceptronClassifier trainer = new MultilayerPerceptronClassifier()
  .setLayers(layers)
  .setBlockSize(128)
  .setSeed(1234L)
  .setMaxIter(100);
// Train the model
MultilayerPerceptronClassificationModel model = trainer.fit(train);
// Compute precision on the test set
DataFrame result = model.transform(test);
DataFrame predictionAndLabels = result.select("prediction", "label");
MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
  .setMetricName("precision");
System.out.println("Precision = " + evaluator.evaluate(predictionAndLabels));
{% endhighlight %}
</div>

</div>