|
104 | 104 | "metadata": {}, |
105 | 105 | "source": [ |
106 | 106 | "# for class\n", |
107 | | - "do simple linearly separable case (hard margin) with y = 1/2x + 3 or something.\n", |
| 107 | + "Define hyperplane. Write equation = 0. Sides of hyperplanes determine classification. Not a probabilistic model but can use distance from hyperplane to be a proxy for certainty. Show that coefficients point orthogonal to hyperplane\n", |
| 108 | + "\n", |
| 109 | + "Dario - Maximum margin classifiers\n", |
| 110 | + "\n", |
| 111 | + "do simple linearly separable case (hard margin) with y = 1/2x + 1 with points (1,4) and (3, 0) as the support vectors\n", |
108 | 112 | "\n", |
109 | 113 | "Write data points (x1, x2), y where y is -1 or 1\n", |
110 | 114 | "\n", |
111 | | - "Make data points in a manner that one additional point of one class close to another class has tremendous influence on the line." |
| 115 | + "Make data points in a manner that one additional point of one class close to another class has tremendous influence on the line.\n", |
| 116 | + "\n", |
| 117 | + "Set up problem specification: Maximize margin subject to norm of weights = 1 and y(xb) >= M.\n", |
| 118 | + "\n", |
| 119 | + "When norm of weights =1 then y(xb) gives the distance from the point to the hyperplane. and xb = M give the equation to the support vector\n", |
| 120 | + "\n", |
| 121 | + "Support vector classifiers\n", |
| 122 | + "Non-separable case. allow for error. extremely sensitive to one data point. Soft margin classifier. Want robustness, generalization. \n", |
| 123 | + "\n", |
| 124 | + "Make specification: Maximize M, subject to norm of weights = 1 and y(xb) > M(1 - e) where sum(e) < C, errors are called slack variables. Hyperplane is still boundary for classification. \n", |
| 125 | + "\n", |
| 126 | + "Slack variables: if e = 0, on correct side of margin. if e between 0 and 1 then between margin and hyperplane. If e > 1 then misclassified.\n", |
| 127 | + "\n", |
| 128 | + "C: Budget, \"the bank\". If C = 0 then need linear separability. Chosen via cv. \n", |
| 129 | + "\n", |
| 130 | + "Only observations that lie on the margin or violate are the support vectors and the only observation that affect the model\n", |
| 131 | + "\n", |
| 132 | + "Gerardo - Support vector machines\n", |
| 133 | + "Needed for non-linear decision boundaries. Can enlarge feature space by using polynomial, interaction terms and a linear classifier can again be used. Kernel approach is very efficient computationally. The linear support vector classifier is just sum of inner product of X and each observation times a constant, but only non-zero constants are the support vectors.\n", |
| 134 | + "\n", |
| 135 | + "Instead of just the inner product, a kernel function can be used. The linear kernel is just the inner product. Kernels measure similarity." |
112 | 136 | ] |
| 137 | + }, |
| 138 | + { |
| 139 | + "cell_type": "code", |
| 140 | + "execution_count": null, |
| 141 | + "metadata": { |
| 142 | + "collapsed": true |
| 143 | + }, |
| 144 | + "outputs": [], |
| 145 | + "source": [] |
113 | 146 | } |
114 | 147 | ], |
115 | 148 | "metadata": { |
116 | 149 | "anaconda-cloud": {}, |
117 | 150 | "kernelspec": { |
118 | | - "display_name": "Python [Root]", |
| 151 | + "display_name": "Python 3", |
119 | 152 | "language": "python", |
120 | | - "name": "Python [Root]" |
| 153 | + "name": "python3" |
121 | 154 | }, |
122 | 155 | "language_info": { |
123 | 156 | "codemirror_mode": { |
|
129 | 162 | "name": "python", |
130 | 163 | "nbconvert_exporter": "python", |
131 | 164 | "pygments_lexer": "ipython3", |
132 | | - "version": "3.5.2" |
| 165 | + "version": "3.5.1" |
133 | 166 | } |
134 | 167 | }, |
135 | 168 | "nbformat": 4, |
|