SPARK-1308: Add getNumPartitions() method to PySpark RDDs


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 1.1.0
    • Component/s: PySpark
    • Labels: None

Description

      In Spark, you can do this:

      // Scala
      val a = sc.parallelize(List(1, 2, 3, 4), 4)
      a.partitions.size  // returns 4
      

      Please make this possible in PySpark too.
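
      In the new API, the call would presumably look like this (a
      hypothetical sketch; the method name is the one proposed in the
      summary above):

      # Python (proposed API; not available as of 0.9.0)
      a = sc.parallelize([1, 2, 3, 4], 4)
      a.getNumPartitions()  # would return 4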

      In the meantime, the available work-around is quite simple:

      # Python
      a = sc.parallelize([1, 2, 3, 4], 4)
      a._jrdd.splits().size()  # reach into the underlying Java RDD
      
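      For reference, a minimal implementation sketch for such a method on
      pyspark.rdd.RDD, assuming it can simply delegate to the underlying
      Java RDD the same way the work-around does (via the Java-side
      partitions() accessor, the newer name for splits()):

      # Python (illustrative sketch only, not the committed patch)
      def getNumPartitions(self):
          """Return the number of partitions in this RDD."""
          return self._jrdd.partitions().size()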


People

    Assignee: Syed A. Hashmi (shashmi)
    Reporter: Nicholas Chammas (nchammas)
    Votes: 0
    Watchers: 2
