
Commit e557c53

shahidki31 authored and srowen committed
[SPARK-26006][MLLIB] unpersist 'dataInternalRepr' in the PrefixSpan
## What changes were proposed in this pull request?

MLlib's PrefixSpan caches an RDD inside its run method, and that RDD stays in the cache after run has completed. We need to unpersist the cached RDD at the end of the run method.

## How was this patch tested?

Existing tests.

Closes #23016 from shahidki31/SPARK-26006.

Authored-by: Shahid <shahidki31@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
1 parent ed46ac9 commit e557c53

1 file changed, +7 -0 lines changed


mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala

Lines changed: 7 additions & 0 deletions
@@ -174,6 +174,13 @@ class PrefixSpan private (
     val freqSequences = results.map { case (seq: Array[Int], count: Long) =>
       new FreqSequence(toPublicRepr(seq), count)
     }
+    // Cache the final RDD to the same storage level as input
+    if (data.getStorageLevel != StorageLevel.NONE) {
+      freqSequences.persist(data.getStorageLevel)
+      freqSequences.count()
+    }
+    dataInternalRepr.unpersist(false)
+
     new PrefixSpanModel(freqSequences)
   }
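The pattern in this hunk can be tried outside PrefixSpan. The following is a minimal sketch, not the PrefixSpan code: the object `UnpersistSketch`, the helper `centeredEvens`, and the toy data are hypothetical and exist only for illustration. It assumes Spark core is on the classpath and runs in local mode. An intermediate RDD is cached because several jobs reuse it; if the caller cached the input, the result is persisted at the same level and materialized with `count()` while the intermediate cache is still available; the intermediate RDD is then unpersisted without blocking.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object UnpersistSketch {
  // Hypothetical helper (not part of Spark) that mirrors the shape of the patched `run`.
  def centeredEvens(data: RDD[Int]): RDD[Double] = {
    // Intermediate RDD reused by two jobs below, so it is cached internally.
    val internal = data.filter(_ % 2 == 0).persist(StorageLevel.MEMORY_AND_DISK)
    val n = internal.count()                          // job 1 over the cached RDD
    val total = internal.map(_.toLong).fold(0L)(_ + _) // job 2 over the cached RDD
    val mean = total.toDouble / n
    val result = internal.map(_ - mean)               // lazy: nothing computed yet

    // Same idea as the diff: follow the caller's storage level and
    // materialize the result before the intermediate cache is dropped.
    if (data.getStorageLevel != StorageLevel.NONE) {
      result.persist(data.getStorageLevel)
      result.count()
    }
    // Release the internal cache without waiting for block removal.
    internal.unpersist(blocking = false)
    result
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("unpersist-sketch")
    val sc = new SparkContext(conf)
    try {
      val input = sc.parallelize(1 to 10).persist(StorageLevel.MEMORY_ONLY)
      val out = centeredEvens(input)
      println(out.collect().sorted.mkString(", "))
      // Lists the RDDs still marked as persistent in this context.
      sc.getPersistentRDDs.values.foreach(rdd => println(s"still cached: $rdd"))
    } finally {
      sc.stop()
    }
  }
}
```

When the input is not cached, the result in this sketch stays lazy, so a later action recomputes the intermediate RDD from the input instead of reading it from the cache. That is likely the reason the diff only calls `count()` inside the storage-level check.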
