[SPARK-10697][ML] Add lift to Association rules #22236
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -56,23 +56,24 @@ class AssociationRules private[fpm] ( | |
| /** | ||
| * Computes the association rules with confidence above `minConfidence`. | ||
| * @param freqItemsets frequent itemset model obtained from [[FPGrowth]] | ||
| - * @return a `Set[Rule[Item]]` containing the association rules. | ||
| + * @return a `RDD[Rule[Item]]` containing the association rules. | ||
| * | ||
| */ | ||
| @Since("1.5.0") | ||
| def run[Item: ClassTag](freqItemsets: RDD[FreqItemset[Item]]): RDD[Rule[Item]] = { | ||
| - run(freqItemsets, Map.empty[Item, Long]) | ||
| + run(freqItemsets, Map.empty[Item, Double]) | ||
| } | ||
| | ||
| /** | ||
| * Computes the association rules with confidence above `minConfidence`. | ||
| * @param freqItemsets frequent itemset model obtained from [[FPGrowth]] | ||
| - * @return a `Set[Rule[Item]]` containing the association rules. The rules will be able to | ||
| + * @param itemSupport map containing an item and its support | ||
| + * @return a `RDD[Rule[Item]]` containing the association rules. The rules will be able to | ||
| * compute also the lift metric. | ||
| */ | ||
| @Since("2.4.0") | ||
| def run[Item: ClassTag](freqItemsets: RDD[FreqItemset[Item]], | ||
| - itemSupport: Map[Item, Long]): RDD[Rule[Item]] = { | ||
| + itemSupport: scala.collection.Map[Item, Double]): RDD[Rule[Item]] = { | ||
| // For candidate rule X => Y, generate (X, (Y, freq(X union Y))) | ||
| val candidates = freqItemsets.flatMap { itemset => | ||
| val items = itemset.items | ||
@@ -125,7 +126,7 @@ object AssociationRules { | |
| @Since("1.5.0") val consequent: Array[Item], | ||
| freqUnion: Double, | ||
| freqAntecedent: Double, | ||
Member
Ideally these frequencies would have been Longs, I think, but it's too late. Yes, stay consistent.
| - freqConsequent: Option[Long]) extends Serializable { | ||
| + freqConsequent: Option[Double]) extends Serializable { | ||
| | ||
| /** | ||
| * Returns the confidence of the rule. | ||
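
For reviewers following the `freqConsequent` change, here is a minimal sketch of how confidence and lift relate to the three frequency fields in the diff above. `SketchRule` and `LiftSketch` are hypothetical names, not Spark's classes, and the sketch assumes that `freqConsequent` holds the relative support of the consequent (a fraction of all transactions), which is consistent with the move to `Double` but is not confirmed by the hunks shown here.

```scala
// Hypothetical stand-in for the Rule class in the diff; not Spark's implementation.
final case class SketchRule(
    antecedent: Seq[String],
    consequent: Seq[String],
    freqUnion: Double,              // freq(X union Y)
    freqAntecedent: Double,         // freq(X)
    freqConsequent: Option[Double]  // assumed: relative support of Y, if known
) {
  // confidence(X => Y) = freq(X union Y) / freq(X)
  def confidence: Double = freqUnion / freqAntecedent

  // lift(X => Y) = confidence(X => Y) / support(Y); None when support(Y) is unknown
  def lift: Option[Double] = freqConsequent.map(support => confidence / support)
}

object LiftSketch {
  val withSupport: SketchRule = SketchRule(Seq("a"), Seq("b"), 3.0, 4.0, Some(0.5))
  val withoutSupport: SketchRule = SketchRule(Seq("a"), Seq("b"), 3.0, 4.0, None)

  def main(args: Array[String]): Unit = {
    println(withSupport.confidence)  // 0.75
    println(withSupport.lift)        // Some(1.5)
    println(withoutSupport.lift)     // None
  }
}
```

Keeping the consequent support as an `Option` means a rule built without item supports, as in the single-argument `run` overload that passes an empty map, can still report confidence while leaving lift undefined.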
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -37,6 +37,7 @@ object MimaExcludes { | |
| // Exclude rules for 2.4.x | ||
| lazy val v24excludes = v23excludes ++ Seq( | ||
| // [SPARK-10697][ML] Add lift to Association rules | ||
| ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.ml.fpm.FPGrowthModel.this"), | ||
Member
These are for the `private[ml]` constructors, right? OK to suppress, yes.
Contributor
Author
Yes, they are the private ones.
| ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.fpm.AssociationRules#Rule.this"), | ||
Contributor
Author
Note for reviewers and myself: this method is private (
| // [SPARK-24296][CORE] Replicate large blocks as a stream. | ||
| ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.network.netty.NettyBlockRpcServer.this"), | ||
I slightly prefer subclasses like `IllegalArgumentException` or `IllegalStateException`, but it's just a matter of taste. You can interpolate the second argument and probably get it on one line if you break before the message starts.
Sure, I'll do that, thanks.
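
The code this last comment refers to is not shown in the hunks above, so the following is only a hypothetical sketch of the suggested style: throw a specific exception subclass, interpolate the value into the message with an `s"..."` string, and break the line before the message starts. The helper `supportOf`, its parameters, and the message text are assumptions made for illustration.

```scala
object ExceptionStyleSketch {
  // Hypothetical lookup helper; name, signature, and message are illustrative only.
  def supportOf(itemSupport: scala.collection.Map[String, Double], item: String): Double =
    itemSupport.getOrElse(item,
      throw new IllegalArgumentException(s"No support found for item '$item'"))

  def main(args: Array[String]): Unit = {
    println(supportOf(Map("a" -> 0.5), "a"))  // 0.5
  }
}
```

Because `getOrElse` takes its default by name, the `throw` is evaluated only when the key is missing.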