ZEPPELIN-1115: Python - interpreter for SQL over DataFrame #1164
bzz wants to merge 9 commits into apache:master
docs/interpreter/python.md
## Pandas integration
[Zeppelin Display System]({{BASE_PATH}}/displaysystem/basicdisplaysystem.html#table) provides simple API to visualize data in Pandas DataFrames, same as in Matplotlib.
Apace Zeppelin [Table Display System]({{BASE_PATH}}/displaysystem/basicdisplaysystem.html#table) provides build-in data visualization capabilities. Python interpreter leverages it to visualize Pandas DataFrames though similar `z.show()` API, same as with [Matplotlib integration](#matplotlib-integration).
Thank you for proof-reading! Late night commits are bad...
You mean built-in?
And how about adding this link http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html to Pandas DataFrames? It would be helpful to users, I think :)
(Great work indeed! 👍 )
d20c678 to 886949b
docs/interpreter/python.md
## Technical description
For in-depth technical details on current implementation plese reffer [python/README.md](https://github.com/apache/zeppelin/blob/master/python/README.md).
There is a typo. plese reffer -> please refer to
Documentation review addressed in e432961
Feedback on graceful failure addressed in a378226
Thanks for the improvement, LGTM
11da87c to a378226
Thank you guys for the prompt reviews! Have added one minor TODO item to clean up test profiles on CI, will merge after #747
a378226 to 0f2f852
Done, merging after CI ♻️ if there is no further discussion
### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] make Python-dependant tests, excluded from CI
  * PythonInterpreterWithPythonInstalledTest
  * PythonPandasSqlInterpreterTest
  * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after apache#747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass or manually run
 - Given the DataFrame i.e
 ```
 %python
 import pandas as pd
 rates = pd.read_csv("bank.csv", sep=";")
 ```
 - SQL query it like
 ```
 %python.sql
 SELECT * FROM rates LIMIT 10
 ```

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <[email protected]>

Closes apache#1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs ⚡
158ba6a [Alexander Bezzubov] Remove third-party dependant test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menue
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames
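For readers wondering what the "fail gracefully" TODO item could look like on the Python side, here is a minimal sketch. It is not taken from this PR: the helper names `missing_modules` and `dependency_error`, the `REQUIRED_MODULES` tuple, and the message wording are all hypothetical, and the module names `pandas` and `pandasql` are assumed from the TODO text. It only checks whether the modules can be found, without importing them.

```python
import importlib.util

# Hypothetical names for illustration only; the actual check in the PR may differ.
# Module names are assumed from the "no Pandas or PandaSQL installed" TODO item.
REQUIRED_MODULES = ("pandas", "pandasql")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def dependency_error(names=REQUIRED_MODULES):
    """Return a readable error message, or None if every module is available."""
    missing = missing_modules(names)
    if not missing:
        return None
    return "%python.sql is unavailable, missing Python modules: " + ", ".join(missing)
```

Calling `dependency_error()` before running a query would let the interpreter report a short message instead of a raw `ImportError` traceback; whether the merged code does it this way is not shown in this conversation.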