Conversation
|
There seems to be more extensive Docker support in PR #18 |
|
@jaychia I did a review of #18 and this PR, and I have to say I like the small and simple approach of making the quick start work easily in this form. However, we do need a solution for making the prepopulated datasets + quick start work. Could you extend this PR to make the changes you are proposing with |
|
(sorry, my fat fingers accidentally closed the PR) |
|
Yeah I think it should be doable. Let me take a stab over the weekend. |
|
Where will we be publishing the Docker images? Docker Hub? Can you help me understand the end-user experience we want to give? |
Fokko
left a comment
I think this is a great first step. We can iterate on this in a subsequent PR. For example, we can make the image as small as possible (it is probably quite big right now) and publish the image in a separate PR.
@@ -0,0 +1,9 @@
FROM openjdk:11
Optional: Ideally we want a multi-stage Docker build here, with a separate JDK stage for building the fat-jar. The fat-jar is then copied into a JRE image for runtime.
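For illustration, the multi-stage layout suggested here could look roughly like the sketch below. It is not taken from this PR. The build command, the jar path, the `/app` directory, and the `openjdk:11-jre-slim` runtime tag are all assumptions, and would need to be replaced with whatever the project's build actually uses and produces.

```dockerfile
# Stage 1: build the fat-jar with a full JDK (same base image as this PR).
FROM openjdk:11 AS build
WORKDIR /src
COPY . .
# Assumption: a hypothetical build command that produces a single fat-jar.
RUN build/sbt assembly

# Stage 2: runtime image that carries only a JRE and the built jar.
# Assumption: the JRE image tag is illustrative, not a project choice.
FROM openjdk:11-jre-slim
WORKDIR /app
# Assumption: the jar path below is hypothetical; use the real build output.
COPY --from=build /src/target/unitycatalog-fat.jar /app/unitycatalog.jar
ENTRYPOINT ["java", "-jar", "/app/unitycatalog.jar"]
```

The point of the split is that the JDK, the build tool, and the source tree stay in the `build` stage, so the final image should be smaller than a single-stage build on `openjdk:11`.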
|
Sorry, I haven't had a chance to work on this yet! @Fokko let's hop on a call; my JVM skills aren't the best :) |
|
Closing in favor of #116 which looks way more fully featured :) |
# Description of changes

For this PR I took an alternative approach to #18 and #22 and created a Dockerfile with build and start scripts that require minimal intervention and interaction with the codebase. The only change to the codebase is in `.gitignore`, where I added `.DS_Store`, which can be helpful in the future for contributors using macOS.

PR #22 is a great start but maybe oversimplified. PR #18 has good thought put into it, but I wanted to stay close to the recommended way of running Unity Catalog as outlined in the project's README. I tried not to fiddle directly with the jars and instead use the provided `/bin/start-uc-server` to run the catalog. With this approach the Dockerfile remains focused on building the environment, and any future changes to how the environment should run can be made inside the `start-uc-server` script rather than the Dockerfile.

# Rationale of the PR

This pull request introduces a way to run Unity Catalog using Docker containers. It provides a Dockerfile that builds the necessary environment and separate bash scripts for building and starting the catalog. This simplifies the process for users by requiring minimal interaction with the codebase itself. The included README provides detailed instructions on how to use these scripts to build and run the Unity Catalog container.

> [!NOTE]
> The `README.md` contains two API calls that create an external and a managed table.
> These APIs are not working yet because they are not supported by the catalog yet.

Signed-off-by: Jean Boutros <[email protected]>

---------

Signed-off-by: Jean Boutros <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: Denny Lee <[email protected]>
- `.gitignore` to ignore build artifacts
- `.dockerignore` to ignore build artifacts
- `Dockerfile` to build a Docker container

One problem for which I haven't yet found a workaround is that the test data for UC will reside inside the container instead of on the host filesystem. This does make the quickstart for the `unity.default.marksheet` default table a little confusing, because the data cannot be found at the storage_location on the host 😬

I think it could be cool to maybe add an additional Docker stage called `quickstart` which installs DuckDB, and we can hook it up to stdin so that users can quickly run both unitycatalog and DuckDB in the same Docker container and get started really quickly with a `bin/quickstart.sh` or something.