
Commit 2357980

Delete xml doc files. (dotnet#3452)
1 parent 7aa5a45 commit 2357980


4 files changed: +0 -212 lines changed


src/Microsoft.ML.FastTree/doc.xml

Lines changed: 0 additions & 91 deletions
@@ -1,97 +1,6 @@
 <?xml version="1.0" encoding="utf-8"?>
 <doc>
   <members>
-    <!--
-    The following text describes the FastTree algorithm details.
-    It's used for the remarks section of all FastTree-based trainers (binary, regression, ranking)
-    -->
-    <member name="FastTree_remarks">
-      <remarks>
-        <para>
-          FastTree is an efficient implementation of the <a href='https://arxiv.org/abs/1505.01866'>MART</a> gradient boosting algorithm.
-          Gradient boosting is a machine learning technique for regression problems.
-          It builds each regression tree in a step-wise fashion, using a predefined loss function to measure the error at each step, and corrects for it in the next step.
-          This prediction model is therefore an ensemble of weaker prediction models. In regression problems, boosting builds a series of such trees in a step-wise fashion and then selects the optimal tree using an arbitrary differentiable loss function.
-        </para>
-        <para>
-          MART learns an ensemble of regression trees, each of which is a decision tree with scalar values in its leaves.
-          A decision (or regression) tree is a binary tree-like flow chart, where at each interior node one decides which of the two child nodes to continue to based on one of the feature values from the input.
-          At each leaf node, a value is returned. In the interior nodes, the decision is based on the test 'x &lt;= v', where x is the value of the feature in the input sample and v is one of the possible values of this feature.
-          The functions that a regression tree can produce are exactly the piece-wise constant functions.
-        </para>
-        <para>
-          The ensemble of trees is produced by computing, at each step, a regression tree that approximates the gradient of the loss function, and adding it to the previous ensemble with a coefficient that minimizes the loss of the new ensemble.
-          The output of the ensemble produced by MART on a given instance is the sum of the tree outputs.
-        </para>
-        <list type='bullet'>
-          <item><description>In the case of a binary classification problem, the output is converted to a probability by using some form of calibration.</description></item>
-          <item><description>In the case of a regression problem, the output is the predicted value of the function.</description></item>
-          <item><description>In the case of a ranking problem, the instances are ordered by the output value of the ensemble.</description></item>
-        </list>
-        <para>For more information see:</para>
-        <list type="bullet">
-          <item><description><a href='https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting'>Wikipedia: Gradient boosting (Gradient tree boosting).</a></description></item>
-          <item><description><a href='https://projecteuclid.org/DPubS?service=UI&amp;version=1.0&amp;verb=Display&amp;handle=euclid.aos/1013203451'>Greedy function approximation: A gradient boosting machine.</a></description></item>
-        </list>
-      </remarks>
-    </member>
-
-    <!--
-    The following text describes the FastForest algorithm details.
-    It's used for the remarks section of all FastForest-based trainers (regression)
-    -->
-    <member name="FastForest_remarks">
-      <remarks>
-        Decision trees are non-parametric models that perform a sequence of simple tests on inputs.
-        This decision procedure maps an input to outputs found in the training dataset whose inputs were similar to the instance being processed.
-        A decision is made at each node of the binary tree data structure based on a measure of similarity that maps each instance recursively through the branches of the tree until the appropriate leaf node is reached and the output decision is returned.
-        <para>Decision trees have several advantages:</para>
-        <list type='bullet'>
-          <item><description>They are efficient in both computation and memory usage during training and prediction.</description></item>
-          <item><description>They can represent non-linear decision boundaries.</description></item>
-          <item><description>They perform integrated feature selection and classification.</description></item>
-          <item><description>They are resilient in the presence of noisy features.</description></item>
-        </list>
-        <para>Fast forest is a random forest implementation.
-        The model consists of an ensemble of decision trees, each of which outputs a Gaussian distribution as its prediction.
-        An aggregation is performed over the ensemble of trees to find a Gaussian distribution closest to the combined distribution for all trees in the model.</para>
-        <para>Generally, ensemble models provide better coverage and accuracy than single decision trees.</para>
-        <para>For more information see:</para>
-        <list type='bullet'>
-          <item><description><a href='https://en.wikipedia.org/wiki/Random_forest'>Wikipedia: Random forest</a></description></item>
-          <item><description><a href='http://jmlr.org/papers/volume7/meinshausen06a/meinshausen06a.pdf'>Quantile regression forest</a></description></item>
-          <item><description><a href='https://blogs.technet.microsoft.com/machinelearning/2014/09/10/from-stumps-to-trees-to-forests/'>From Stumps to Trees to Forests</a></description></item>
-        </list>
-      </remarks>
-    </member>
-
-    <!--
-    The following text describes the GAM algorithm details.
-    It's used for the remarks section of all GAM-based trainers (regression, binary classification)
-    -->
-    <member name="GAM_remarks">
-      <remarks>
-        <para>
-          Generalized Additive Models, or GAMs, model the data as a set of linearly independent features
-          similar to a linear model. For each feature, the GAM trainer learns a non-linear function,
-          called a "shape function", that computes the response as a function of the feature's value.
-          (In contrast, a linear model fits a linear response (e.g. a line) to each feature.)
-          To score an example, the outputs of all the shape functions are summed and the score is the total value.
-        </para>
-        <para>
-          This GAM trainer is implemented using shallow gradient boosted trees (e.g. tree stumps) to learn nonparametric
-          shape functions, and is based on the method described in Lou, Caruana, and Gehrke.
-          <a href='http://www.cs.cornell.edu/~yinlou/papers/lou-kdd12.pdf'>&quot;Intelligible Models for Classification and Regression.&quot;</a> KDD&apos;12, Beijing, China. 2012.
-          After training, an intercept is added to represent the average prediction over the training set,
-          and the shape functions are normalized to represent the deviation from the average prediction. This results
-          in models that are easily interpreted simply by inspecting the intercept and the shape functions.
-          See the sample below for an example of how to train a GAM model and inspect and interpret the results.
-        </para>
-      </remarks>
-    </member>
-
     <member name="TreeEnsembleFeaturizerTransform">
       <summary>
         Trains a tree ensemble, or loads it from a file, then maps a numeric feature vector to outputs.
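
For context, the deleted FastTree_remarks above describe MART-style gradient boosting. The following is a minimal, illustrative sketch of training ML.NET's FastTree regression trainer; it is not part of this commit, and the synthetic data, column names, and hyperparameter values are assumptions rather than anything taken from the deleted documentation (requires the Microsoft.ML and Microsoft.ML.FastTree packages).

// Illustrative sketch only: FastTree is MART-style gradient boosting, so each
// step fits a small regression tree to the gradient of the loss and adds it to
// the ensemble with a shrinkage (learning-rate) factor.
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public class DataPoint
{
    public float Label { get; set; }
    [VectorType(3)]
    public float[] Features { get; set; }
}

public static class FastTreeSketch
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Tiny synthetic regression dataset; real callers would load an IDataView from a file.
        var rng = new Random(1);
        var samples = Enumerable.Range(0, 200).Select(_ =>
        {
            float x0 = (float)rng.NextDouble(), x1 = (float)rng.NextDouble(), x2 = (float)rng.NextDouble();
            return new DataPoint { Label = 3f * x0 + x1 * x1 + 0.1f * x2, Features = new[] { x0, x1, x2 } };
        }).ToArray();
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(samples);

        // Hyperparameter values below are illustrative assumptions.
        var trainer = mlContext.Regression.Trainers.FastTree(
            labelColumnName: "Label",
            featureColumnName: "Features",
            numberOfTrees: 50,     // size of the boosted ensemble
            numberOfLeaves: 8,     // leaves per regression tree
            learningRate: 0.2);    // shrinkage applied to each new tree's output

        var model = trainer.Fit(trainingData);
        Console.WriteLine("Trained FastTree ensemble.");
    }
}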
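
Similarly, a minimal sketch of the FastForest (random forest) trainer covered by the deleted FastForest_remarks; the column names and hyperparameter values are assumptions, and trainingData is any IDataView shaped like the one built in the FastTree sketch above.

// Illustrative sketch only: FastForest trains many decision trees independently
// and aggregates their per-tree output distributions, rather than boosting.
using Microsoft.ML;

public static class FastForestSketch
{
    // trainingData: any IDataView with a float "Label" column and a float-vector "Features" column.
    public static ITransformer Train(MLContext mlContext, IDataView trainingData)
    {
        var trainer = mlContext.Regression.Trainers.FastForest(
            labelColumnName: "Label",
            featureColumnName: "Features",
            numberOfTrees: 100,   // independent trees in the forest
            numberOfLeaves: 20);  // leaves per tree

        return trainer.Fit(trainingData);
    }
}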
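
And a minimal sketch of a GAM trainer as described in the deleted GAM_remarks, here the binary-classification variant; again, the column names and hyperparameter values are assumptions, and the label column is assumed to be boolean.

// Illustrative sketch only: the GAM trainer learns one shape function per feature
// with boosted tree stumps; the score is the intercept plus the summed
// shape-function outputs, which keeps the model easy to inspect.
using Microsoft.ML;

public static class GamSketch
{
    // trainingData: any IDataView with a bool "Label" column and a float-vector "Features" column.
    public static ITransformer Train(MLContext mlContext, IDataView trainingData)
    {
        var trainer = mlContext.BinaryClassification.Trainers.Gam(
            labelColumnName: "Label",
            featureColumnName: "Features",
            numberOfIterations: 9500,         // boosting passes used to build the shape functions
            maximumBinCountPerFeature: 255,   // how finely each feature is discretized
            learningRate: 0.002);

        // The fitted model parameters expose the intercept and the per-feature shape
        // functions, which is what makes the resulting model easy to interpret.
        return trainer.Fit(trainingData);
    }
}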

src/Microsoft.ML.LightGbm/doc.xml

Lines changed: 0 additions & 16 deletions
This file was deleted.

src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/doc.xml

Lines changed: 0 additions & 68 deletions
This file was deleted.

src/Microsoft.ML.StandardTrainers/Standard/doc.xml

Lines changed: 0 additions & 37 deletions
This file was deleted.

0 commit comments
