Commit d9d3504

pkch authored and srowen committed
[SPARK-16831][PYTHON] Fixed bug in CrossValidator.avgMetrics
## What changes were proposed in this pull request?

avgMetrics was summed, not averaged, across folds

Author: =^_^= <maxmoroz@gmail.com>

Closes #14456 from pkch/pkch-patch-1.

(cherry picked from commit 639df04)
Signed-off-by: Sean Owen <sowen@cloudera.com>
1 parent 063a507 commit d9d3504

1 file changed, +3 -1 lines changed


python/pyspark/ml/tuning.py

Lines changed: 3 additions & 1 deletion
@@ -160,6 +160,8 @@ class CrossValidator(Estimator, ValidatorParams):
     >>> evaluator = BinaryClassificationEvaluator()
     >>> cv = CrossValidator(estimator=lr, estimatorParamMaps=grid, evaluator=evaluator)
     >>> cvModel = cv.fit(dataset)
+    >>> cvModel.avgMetrics[0]
+    0.5
     >>> evaluator.evaluate(cvModel.transform(dataset))
     0.8333...
 
@@ -228,7 +230,7 @@ def _fit(self, dataset):
                 model = est.fit(train, epm[j])
                 # TODO: duplicate evaluator to take extra params from input
                 metric = eva.evaluate(model.transform(validation, epm[j]))
-                metrics[j] += metric
+                metrics[j] += metric/nFolds
 
         if eva.isLargerBetter():
             bestIndex = np.argmax(metrics)
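
To make the effect of the one-line change concrete, here is a minimal sketch in plain Python (no Spark required) that runs the old and the new accumulation side by side. The per-fold scores in `fold_metrics` are invented for illustration, and the names `summed`, `averaged`, `n_folds`, and `num_models` are hypothetical stand-ins for the `metrics` array, `nFolds`, and the number of param maps in `_fit`.

```python
# Hypothetical per-fold scores: fold_metrics[i][j] is the evaluator's score
# for param map j on fold i. The values are made up for this sketch.
fold_metrics = [
    [0.50, 0.75],
    [0.25, 1.00],
    [0.75, 0.50],
]
n_folds = len(fold_metrics)
num_models = len(fold_metrics[0])

summed = [0.0] * num_models    # accumulation before the fix
averaged = [0.0] * num_models  # accumulation after the fix

for i in range(n_folds):
    for j in range(num_models):
        metric = fold_metrics[i][j]
        summed[j] += metric             # old: metrics[j] += metric
        averaged[j] += metric / n_folds  # new: metrics[j] += metric/nFolds

# Index of the best param map when a larger metric is better.
best_by_sum = max(range(num_models), key=lambda j: summed[j])
best_by_avg = max(range(num_models), key=lambda j: averaged[j])

print("summed:  ", summed)
print("averaged:", averaged)
print("best index (sum, avg):", best_by_sum, best_by_avg)
```

With these numbers the summed values come out to 1.5 and 2.25, which exceed the largest score any single fold produced, while the averaged values stay in the range of the per-fold scores. Dividing by a positive constant does not change the ordering, so both versions pick the same best index. In this sketch the fix affects only the reported values, not the model selection.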
