Commit 357d82d
[SPARK-13629][ML] Add binary toggle Param to CountVectorizer
## What changes were proposed in this pull request?
Add a binary toggle Param to CountVectorizer, matching the `binary` option of the scikit-learn CountVectorizer: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html
If set, then all non-zero counts will be set to 1.
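The effect of the toggle can be illustrated with a minimal plain-Scala sketch. It is not the code from this patch and does not use Spark: `BinaryCountSketch`, `vectorize`, and the fixed `vocabulary` argument are hypothetical names chosen for illustration. It only shows the described behaviour, in which every non-zero term count becomes 1 when the flag is set.

```scala
// Hypothetical illustration only; not the Spark CountVectorizer implementation.
object BinaryCountSketch {
  /** Counts each vocabulary term in `tokens`.
    * When `binary` is true, every non-zero count is clipped to 1.0.
    */
  def vectorize(tokens: Seq[String], vocabulary: Seq[String], binary: Boolean): Seq[Double] = {
    // Term -> number of occurrences in this document.
    val counts: Map[String, Int] =
      tokens.groupBy(identity).map { case (term, occurrences) => term -> occurrences.size }

    // Emit one entry per vocabulary term, in vocabulary order.
    vocabulary.map { term =>
      val count = counts.getOrElse(term, 0).toDouble
      if (binary && count > 0.0) 1.0 else count
    }
  }
}
```

For the tokens `a b a c a` and the vocabulary `a, b, c, d`, the sketch yields `3.0, 1.0, 1.0, 0.0` with `binary = false` and `1.0, 1.0, 1.0, 0.0` with `binary = true`. Binary features of this kind suit models that reason about term presence rather than term frequency.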
## How was this patch tested?
unit tests
Author: Yuhao Yang <hhbyyh@gmail.com>
Closes apache#11536 from hhbyyh/cvToggle.
1 parent 204c9de, commit 357d82d
File tree
2 files changed, 46 additions and 2 deletions
- mllib/src
  - main/scala/org/apache/spark/ml/feature
  - test/scala/org/apache/spark/ml/feature
Lines changed: 28 additions & 1 deletion
Diff hunks: lines 209-229 added; original line 235 replaced by lines 256-262.
Lines changed: 18 additions & 1 deletion
Diff hunks: original line 160 replaced by line 160; lines 171-187 added.