Commit 9fef280

adapt to itemsCol
1 parent d4828b7 commit 9fef280

3 files changed: +6 -5 lines changed
examples/src/main/java/org/apache/spark/examples/ml/JavaFPGrowthExample.java

Lines changed: 2 additions & 1 deletion
@@ -44,11 +44,12 @@ public static void main(String[] args) {
       RowFactory.create(Arrays.asList("1 2".split(" ")))
     );
     StructType schema = new StructType(new StructField[]{ new StructField(
-      "features", new ArrayType(DataTypes.StringType, true), false, Metadata.empty())
+      "items", new ArrayType(DataTypes.StringType, true), false, Metadata.empty())
     });
     Dataset<Row> itemsDF = spark.createDataFrame(data, schema);
 
     FPGrowthModel model = new FPGrowth()
+      .setItemsCol("items")
       .setMinSupport(0.5)
       .setMinConfidence(0.6)
       .fit(itemsDF);

examples/src/main/scala/org/apache/spark/examples/ml/FPGrowthExample.scala

Lines changed: 2 additions & 2 deletions
@@ -45,9 +45,9 @@ object FPGrowthExample {
       "1 2 5",
       "1 2 3 5",
       "1 2")
-    ).map(t => t.split(" ")).toDF("features")
+    ).map(t => t.split(" ")).toDF("items")
 
-    val fpgrowth = new FPGrowth().setMinSupport(0.5).setMinConfidence(0.6)
+    val fpgrowth = new FPGrowth().setItemsCol("items").setMinSupport(0.5).setMinConfidence(0.6)
     val model = fpgrowth.fit(dataset)
 
     // Display frequent itemsets.

mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala

Lines changed: 2 additions & 2 deletions
@@ -117,7 +117,7 @@ private[fpm] trait FPGrowthParams extends Params with HasPredictionCol {
  * Recommendation</a>. PFP distributes computation in such a way that each worker executes an
  * independent group of mining tasks. The FP-Growth algorithm is described in
  * <a href="http://dx.doi.org/10.1145/335191.335372">Han et al., Mining frequent patterns without
- * candidate generation</a>. Note null values in the feature column are ignored during fit().
+ * candidate generation</a>. Note null values in the itemsCol column are ignored during fit().
  *
  * @see <a href="http://en.wikipedia.org/wiki/Association_rule_learning">
  * Association rule learning (Wikipedia)</a>
@@ -230,7 +230,7 @@ class FPGrowthModel private[ml] (
    * Then for each association rule, it will examine the input items against antecedents and
    * summarize the consequents as prediction. The prediction column has the same data type as the
    * input column(Array[T]) and will not contain existing items in the input column. The null
-   * values in the feature columns are treated as empty sets.
+   * values in the itemsCol columns are treated as empty sets.
    * WARNING: internally it collects association rules to the driver and uses broadcast for
    * efficiency. This may bring pressure to driver memory for large set of association rules.
    */
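
Note on the transform() doc comment in the last hunk: the prediction rule it describes can be sketched in plain Java without Spark. This is only an illustration of the documented behavior, not the code in FPGrowth.scala. The class PredictSketch, the record Rule, and the method predict are hypothetical names introduced here, and the three rules in main are made-up sample data rather than output of a fitted model. Records need Java 16 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PredictSketch {
    // Hypothetical stand-in for one association rule: antecedent => consequent.
    record Rule(List<String> antecedent, List<String> consequent) {}

    // Collects the consequents of every rule whose antecedent is contained in
    // the input items, leaving out items the input already has.
    static List<String> predict(List<String> items, List<Rule> rules) {
        // A null input is treated as an empty item set, as the doc comment states.
        List<String> safeItems = (items == null) ? List.<String>of() : items;
        Set<String> input = new LinkedHashSet<>(safeItems);
        Set<String> prediction = new LinkedHashSet<>(); // a set drops duplicates
        for (Rule rule : rules) {
            if (input.containsAll(rule.antecedent())) {
                for (String item : rule.consequent()) {
                    if (!input.contains(item)) {
                        prediction.add(item);
                    }
                }
            }
        }
        return new ArrayList<>(prediction);
    }

    public static void main(String[] args) {
        // Made-up rules for illustration only.
        List<Rule> rules = List.of(
            new Rule(List.of("1"), List.of("2")),
            new Rule(List.of("1", "2"), List.of("5")),
            new Rule(List.of("5"), List.of("1")));
        System.out.println(predict(List.of("1", "2"), rules)); // prints [5]
        System.out.println(predict(null, rules));              // prints []
    }
}
```

With the input ["1", "2"], the first rule fires but its consequent "2" is already present, the second rule contributes "5", and the third does not apply, so the prediction is ["5"]. A null input matches no rule with a non-empty antecedent and yields an empty prediction.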
