Description
There are several APIs missing in PySpark:
RDD.collectPartitions()
RDD.histogram()
RDD.zipWithIndex()
RDD.zipWithUniqueId()
RDD.min(comp)
RDD.max(comp)
Several APIs related to approximate jobs.
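For reference, the expected behaviour of two of the missing methods can be sketched in plain Python. This is only an illustration of the semantics documented for the Scala RDD API (zipWithIndex numbers elements by partition order and then by position within each partition; zipWithUniqueId gives the elements of partition k the ids k, n+k, 2n+k, ... where n is the number of partitions). It is not a proposed PySpark implementation. The helper names `zip_with_index` and `zip_with_unique_id`, and the use of a list of lists to stand in for partitions, are assumptions made for this example.

```python
from typing import List, Tuple, TypeVar

T = TypeVar("T")


def zip_with_index(partitions: List[List[T]]) -> List[List[Tuple[T, int]]]:
    """Pair each element with a contiguous global index (hypothetical helper).

    Indices follow partition order first, then element order inside each
    partition, so the sizes of all earlier partitions must be known.
    """
    result = []
    offset = 0  # number of elements in all preceding partitions
    for part in partitions:
        result.append([(item, offset + i) for i, item in enumerate(part)])
        offset += len(part)
    return result


def zip_with_unique_id(partitions: List[List[T]]) -> List[List[Tuple[T, int]]]:
    """Pair each element with a unique, possibly non-contiguous id (hypothetical helper).

    The i-th element of partition k gets id i * n + k, where n is the
    number of partitions, so no partition sizes need to be computed first.
    """
    n = len(partitions)
    return [
        [(item, i * n + k) for i, item in enumerate(part)]
        for k, part in enumerate(partitions)
    ]


# Three stand-in "partitions" of unequal size.
partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
indexed = zip_with_index(partitions)
unique = zip_with_unique_id(partitions)
```

The difference matters for cost: computing contiguous indices needs the size of every preceding partition, while the unique-id scheme can be computed independently inside each partition.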