[SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py file. #20089
ueshin wants to merge 5 commits into apache:master from
Conversation
Btw, should we add
Test build #85424 has finished for PR 20089 at commit
Yea, I think we could. I added the support and tested it before - SPARK-19019. I think it's okay to add it; they are just metadata AFAIK.

@HyukjinKwon Thanks! I'll add it soon.
HyukjinKwon left a comment:

LGTM
Not a big deal but I know one more place we might also update - https://github.com/apache/spark/blob/master/python/README.md#python-requirements
@HyukjinKwon I'll update it as well.
Test build #85425 has finished for PR 20089 at commit
Test build #85426 has finished for PR 20089 at commit
python/README.md (Outdated)

```diff
 ## Python Requirements

-At its core PySpark depends on Py4J (currently version 0.10.6), but additional sub-packages have their own requirements (including numpy and pandas).
+At its core PySpark depends on Py4J (currently version 0.10.6), but additional sub-packages have their own requirements (including numpy, pandas, and pyarrow).
```
This sounds mandatory, but I think pyarrow is still an optional choice. Right?
Yea, Pandas and PyArrow are optional. Maybe it's nicer if we have some more details here too.
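Since Pandas and PyArrow are optional, libraries typically import them lazily and only fail when the optional feature is actually used. Below is a minimal sketch of that common pattern, assuming a hypothetical helper name `require_arrow`; it is an illustration, not Spark's actual code:

```python
# Sketch of the optional-dependency pattern (illustration only, not Spark's
# actual code; the helper name `require_arrow` is hypothetical).
def require_arrow():
    """Return the pyarrow module, or raise a clear error if it is missing."""
    try:
        import pyarrow  # optional: only needed for Arrow-based features
    except ImportError as e:
        raise ImportError(
            "pyarrow is required for Arrow-based features; "
            "install it, e.g. via the package's 'sql' extra") from e
    return pyarrow
```

This way, users who never touch Arrow-based features are unaffected, while users who do get an actionable error message instead of a bare ImportError deep inside library code.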
I added some more details. WDYT?
```diff
     'ml': ['numpy>=1.7'],
     'mllib': ['numpy>=1.7'],
-    'sql': ['pandas>=0.19.2']
+    'sql': ['pandas>=0.19.2', 'pyarrow>=0.8.0']
```
If no pyarrow is installed, will setup force users to install it?
Nope, extras_require does not do anything in normal cases, but they can be installed together with an extras option via pip IIRC.
Test build #85432 has finished for PR 20089 at commit
python/README.md (Outdated)

```diff
 ## Python Requirements

-At its core PySpark depends on Py4J (currently version 0.10.6), but additional sub-packages have their own requirements (including numpy and pandas).
+At its core PySpark depends on Py4J (currently version 0.10.6), but additional sub-packages might have their own requirements declared as "Extras" (including numpy, pandas, and pyarrow). You can install the requirements by specifying their extra names.
```
Ah, I see. How about simply something like:

> At its core PySpark depends on Py4J (currently version 0.10.6), but some additional sub-packages have their own extra requirements for some features (including numpy, pandas, and pyarrow).

for now? I just noticed we are a bit unclear on this (e.g., actually I have been under the impression that NumPy is required for ML/MLlib so far), but I think this roughly describes it correctly and is good enough.
Will maybe try to make a PR to fully describe the dependencies and related features later. This PR targets PyArrow anyway.
Not a big deal anyway. I am actually fine as is too if you prefer @ueshin.
Let's use the simple one you suggested and leave the detailed description for future PRs.
Still LGTM
Test build #85434 has finished for PR 20089 at commit
Merged to master.
What changes were proposed in this pull request?
This is a follow-up PR of #19884, updating the setup.py file to add the pyarrow dependency.
How was this patch tested?
Existing tests.