Description
The DataFrame describe method should also report quartiles (the 25th, 50th, and 75th percentiles), as the pandas describe method does.
Example pandas output:
In [4]: df.describe()
Out[4]:
       Unnamed: 0       displ         year         cyl         cty         hwy
count  234.000000  234.000000   234.000000  234.000000  234.000000  234.000000
mean   117.500000    3.471795  2003.500000    5.888889   16.858974   23.440171
std     67.694165    1.291959     4.509646    1.611534    4.255946    5.954643
min      1.000000    1.600000  1999.000000    4.000000    9.000000   12.000000
25%     59.250000    2.400000  1999.000000    4.000000   14.000000   18.000000
50%    117.500000    3.300000  2003.500000    6.000000   17.000000   24.000000
75%    175.750000    4.600000  2008.000000    8.000000   19.000000   27.000000
max    234.000000    7.000000  2008.000000    8.000000   35.000000   44.000000
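For reference, the rows requested above can be reproduced for a single numeric column with the Python standard library alone. This is only a minimal sketch of the desired output, not the Spark implementation: the helper name describe_with_quartiles is hypothetical, and the input range(1, 235) is an assumption chosen because it matches the "Unnamed: 0" column in the pandas output (count 234, min 1, max 234). The "inclusive" method of statistics.quantiles interpolates linearly between neighbouring values, which agrees with the pandas figures shown for that column.

```python
import statistics


def describe_with_quartiles(values):
    """Hypothetical helper: summary rows in the order pandas prints them."""
    data = sorted(values)
    # n=4 yields the three cut points between quarters: 25%, 50%, 75%.
    # method="inclusive" treats min and max as the 0th and 100th percentiles
    # and interpolates linearly in between.
    q25, q50, q75 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": float(len(data)),
        "mean": statistics.fmean(data),
        "std": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": float(data[0]),
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": float(data[-1]),
    }


# Assumed input: the row index 1..234, matching the "Unnamed: 0" column.
summary = describe_with_quartiles(range(1, 235))
for name, value in summary.items():
    print(f"{name:<5} {value:>12.6f}")
```

This sketch sorts the whole column to get exact quartiles. On a large distributed DataFrame that may be too expensive, so an approximate quantile computation could be a reasonable alternative.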
Issue Links
- is depended upon by
  - SPARK-21566 Python method for summary (Resolved)
  - SPARK-21584 Update R method for summary to call new implementation (Resolved)