Notebook: (Web) HDFS as a backend storage (Read & Write Mode) #2333
hayssams wants to merge 26 commits into apache:master
I don't think it's a good idea to include some interpreters as dependencies of zeppelin-zengine.

@jongyoul moved HDFSCommand to zeppelin-interpreter

it might be important to call this

@felixcheung

Is this HDFS or WebHDFS protocol?

@felixcheung Do you want me to update both the docs and the code, or the docs only?

both of them if it makes sense?

# Conflicts:
#   .travis.yml

# Conflicts:
#   docs/setup/storage/storage.md

@felixcheung

Hello @jongyoul

I see perhaps value in both web hdfs and hdfs (jar client)?

maybe add one property to allow user to choose which method to use. And

# Conflicts:
#   file/src/main/java/org/apache/zeppelin/file/HDFSCommand.java
#   file/src/main/java/org/apache/zeppelin/file/WebHDFSFileInterpreter.java
#   file/src/test/java/org/apache/zeppelin/file/WebHDFSFileInterpreterTest.java
What is this PR for?
This PR replaces PR-1479. It removes any Hadoop dependency by using WebHDFS as the communication protocol (code borrowed from PR1600).
Zeppelin currently supports many backends for storing notes through Apache Commons VFS.
Apache Commons VFS supports HDFS in read-only mode only.
This PR makes HDFS a first-class citizen by allowing users to load notes from and save notes to HDFS.
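To illustrate what "WebHDFS as a communication protocol" means in practice, here is a minimal Python sketch of how requests against the WebHDFS REST API are addressed. It is not the code from this PR. The host, the port 50070 (a common NameNode HTTP port on Hadoop 2.x; yours may differ), the file name note.json, and the helper name webhdfs_url are all assumptions made for illustration. Only the /webhdfs/v1 path prefix and the LISTSTATUS, OPEN, and CREATE operations come from the WebHDFS REST API itself.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL: http://<host>:<port>/webhdfs/v1<path>?op=<OP>&...

    Hypothetical helper for illustration; not part of Zeppelin.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute: %r" % path)
    # 'op' goes first, followed by any extra query parameters.
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Listing the notebook directory is an HTTP GET on this URL.
list_url = webhdfs_url("localhost", 50070, "/tmp/notebook", "LISTSTATUS")

# Reading a note is an HTTP GET with op=OPEN.
read_url = webhdfs_url("localhost", 50070, "/tmp/notebook/note.json", "OPEN")

# Writing a note is an HTTP PUT with op=CREATE.
write_url = webhdfs_url(
    "localhost", 50070, "/tmp/notebook/note.json", "CREATE", overwrite="true"
)

print(list_url)
print(read_url)
print(write_url)
```

Because every operation is a plain HTTP request, a client written this way needs no Hadoop jars on its classpath, which is the dependency reduction the description refers to.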
What type of PR is it?
Improvement
Todos
Task
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1515
How should this be tested?
Set the zeppelin.notebook.dir property to a value like hdfs://localhost:9000/tmp/notebook and the zeppelin.notebook.storage property to org.apache.zeppelin.notebook.repo.HdfsNotebookRepo.
Check that your notes are loaded from and stored to HDFS by listing them with the command:
hdfs dfs -ls /tmp/notebook
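The relationship between the two properties and the listing command can be sketched as follows. This is only an illustrative Python check, assuming the example values given above; the settings dictionary and the listing_command helper are hypothetical and say nothing about how Zeppelin reads its configuration.

```python
from urllib.parse import urlsplit

# Example values taken from the test instructions above.
settings = {
    "zeppelin.notebook.dir": "hdfs://localhost:9000/tmp/notebook",
    "zeppelin.notebook.storage": "org.apache.zeppelin.notebook.repo.HdfsNotebookRepo",
}


def listing_command(settings):
    """Derive the 'hdfs dfs -ls' command that lists the configured notebook dir.

    Hypothetical helper for illustration; not part of Zeppelin.
    """
    parts = urlsplit(settings["zeppelin.notebook.dir"])
    if parts.scheme != "hdfs" or not parts.path.startswith("/"):
        raise ValueError("zeppelin.notebook.dir should look like hdfs://host:port/path")
    # The path component of the URI is what 'hdfs dfs -ls' expects.
    return ["hdfs", "dfs", "-ls", parts.path]


command = listing_command(settings)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a machine where the hdfs client is installed, or simply compared with the command typed by hand.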
Screenshots (if appropriate)
Questions: