Commit 5eab030

DOC copyedit SGDClassifier docstring
Make it clear immediately that this beast fits SVMs by default, logistic regression if you ask for it.
1 parent e24a307 commit 5eab030

File tree

1 file changed, +16 -12 lines changed

sklearn/linear_model/stochastic_gradient.py

Lines changed: 16 additions & 12 deletions
@@ -520,11 +520,18 @@ def fit(self, X, y, coef_init=None, intercept_init=None,
 
 
 class SGDClassifier(BaseSGDClassifier, _LearntSelectorMixin):
-    """Linear model fitted by minimizing a regularized empirical loss with SGD.
+    """Linear classifiers (SVM, logistic regression, a.o.) with SGD training.
 
-    SGD stands for Stochastic Gradient Descent: the gradient of the loss is
-    estimated each sample at a time and the model is updated along the way with
-    a decreasing strength schedule (aka learning rate).
+    This estimator implements regularized linear models with stochastic
+    gradient descent (SGD) learning: the gradient of the loss is estimated
+    each sample at a time and the model is updated along the way with a
+    decreasing strength schedule (aka learning rate). SGD allows minibatch
+    (online/out-of-core) learning, see the partial_fit method.
+
+    This implementation works with data represented as dense or sparse arrays
+    of floating point values for the features. The model it fits can be
+    controlled with the loss parameter; by default, it fits a linear support
+    vector machine (SVM).
 
     The regularizer is a penalty added to the loss function that shrinks model
     parameters towards the zero vector using either the squared euclidean norm
@@ -533,19 +540,16 @@ class SGDClassifier(BaseSGDClassifier, _LearntSelectorMixin):
     update is truncated to 0.0 to allow for learning sparse models and achieve
     online feature selection.
 
-    This implementation works with data represented as dense or sparse arrays
-    of floating point values for the features.
-
     Parameters
     ----------
     loss : str, 'hinge', 'log', 'modified_huber', 'squared_hinge',\
            'perceptron', or a regression loss: 'squared_loss', 'huber',\
            'epsilon_insensitive', or 'squared_epsilon_insensitive'
-        The loss function to be used. Defaults to 'hinge'. The hinge loss is
-        a margin loss used by standard linear SVM models. The 'log' loss is
-        the loss of logistic regression models and can be used for
-        probability estimation in binary classifiers. 'modified_huber'
-        is another smooth loss that brings tolerance to outliers.
+        The loss function to be used. Defaults to 'hinge', which gives a
+        linear SVM.
+        The 'log' loss gives logistic regression, a probabilistic classifier.
+        'modified_huber' is another smooth loss that brings tolerance to
+        outliers as well as probability estimates.
         'squared_hinge' is like hinge but is quadratically penalized.
         'perceptron' is the linear loss used by the perceptron algorithm.
         The other losses are designed for regression but can be useful in

0 comments