Add a Dockerfile #22

Closed
jaychia wants to merge 2 commits into unitycatalog:main from Eventual-Inc:jay/daft-tutorial

jaychia commented Jun 14, 2024

  1. Updates the .gitignore to ignore build artifacts
  2. Adds a .dockerignore to ignore build artifacts
  3. Adds a Dockerfile to build a Docker container

One problem for which I haven't yet found a workaround is that the test data for UC will reside inside the container instead of on the host filesystem. This does make the quickstart for the unity.default.marksheet default table a little confusing, because the data cannot be found at the storage_location on the host 😬

I think it could be cool to maybe add an additional Docker stage called quickstart which installs DuckDB and we can hook it up to stdin so that users can quickly run both unitycatalog and DuckDB in the same Docker container and get started really quickly with a bin/quickstart.sh or something.
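
A minimal sketch of what such a `quickstart` stage might look like, not taken from this PR. Everything here is an assumption: the `build/sbt package` build step (following the project's README), the Debian-based `openjdk:11` image (so `apt-get` is available), the DuckDB release version and asset name, and the hypothetical `bin/quickstart.sh` script, which would need to start the server in the background and then launch the DuckDB CLI in the foreground.

```dockerfile
# Stage 1: build Unity Catalog from source.
FROM openjdk:11 AS server
WORKDIR /opt/unitycatalog
COPY . .
# Assumption: the project builds with its bundled sbt launcher, as in the README.
RUN build/sbt package

# Stage 2: add the DuckDB CLI on top of the server stage.
FROM server AS quickstart
# Assumption: version and asset name are illustrative and amd64-only.
ADD https://github.com/duckdb/duckdb/releases/download/v1.0.0/duckdb_cli-linux-amd64.zip /tmp/duckdb.zip
RUN apt-get update \
    && apt-get install -y --no-install-recommends unzip \
    && unzip /tmp/duckdb.zip -d /usr/local/bin \
    && rm -rf /tmp/duckdb.zip /var/lib/apt/lists/*
# Hypothetical script (not part of this PR): starts the UC server in the
# background, then runs the DuckDB CLI attached to the container's stdin.
ENTRYPOINT ["bin/quickstart.sh"]
```

With a layout like this, `docker build --target quickstart` would select the stage, and `docker run -it` would attach the terminal to whatever the entrypoint script runs in the foreground.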

tdas (Contributor) commented Jun 15, 2024

There seems to be more extensive Docker support in PR #18.

dennyglee mentioned this pull request Jun 18, 2024

tdas (Contributor) commented Jun 20, 2024

@jaychia I did a review of #18 and this PR, and I have to say I like the small and simple approach of making the quick start work easily in this form. However, we do need a solution for making the prepopulated datasets + quick start work. Could you extend this PR with the change you are proposing: an additional Docker stage called quickstart which installs DuckDB, hooked up to stdin so that users can quickly run both unitycatalog and DuckDB?

tdas closed this Jun 20, 2024
tdas reopened this Jun 20, 2024

tdas (Contributor) commented Jun 20, 2024

(sorry, my fat fingers accidentally closed the PR)

jaychia (Author) commented Jun 22, 2024

#18

Yeah I think it should be doable. Let me take a stab over the weekend.

MrPowers (Contributor) commented

Where will we be publishing the Docker images? Dockerhub?

Can you help me understand the end-user experience we want to give?

Fokko (Contributor) left a comment

I think this is a great first step. We can iterate on this in a subsequent PR, for example by making the image as small as possible (currently it is probably quite big) and by publishing the image in a separate PR.

@@ -0,0 +1,9 @@
FROM openjdk:11
Contributor

Optional: Ideally we want a multi-stage Docker build here, with a separate JDK stage for building the fat jar. Then copy the fat jar into a JRE image for runtime.
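
The multi-stage layout suggested in this review comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the Dockerfile from this PR: the `eclipse-temurin` base images, the `build/sbt assembly` command, and the jar path are all hypothetical placeholders for whatever the project's build actually produces.

```dockerfile
# Build stage: a full JDK is only needed to compile and package the fat jar.
FROM eclipse-temurin:11-jdk AS build
WORKDIR /src
COPY . .
# Assumption: a fat jar is produced by an sbt assembly task; adjust to the real build.
RUN build/sbt assembly

# Runtime stage: a smaller JRE-only image that receives just the built jar.
FROM eclipse-temurin:11-jre
WORKDIR /app
# Hypothetical path: replace with the actual location of the assembled jar.
COPY --from=build /src/target/unitycatalog-server-assembly.jar /app/server.jar
ENTRYPOINT ["java", "-jar", "/app/server.jar"]
```

The design point is that the compiler, sbt caches, and source tree stay in the discarded `build` stage, so the final image carries only the JRE and the jar.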

jaychia (Author) commented Jun 28, 2024

Sorry, I haven't yet had a chance to work on this! @Fokko let's hop on a call, my JVM skills aren't the best :)

jaychia (Author) commented Jul 10, 2024

Closing in favor of #116 which looks way more fully featured :)

jaychia closed this Jul 10, 2024
haogang pushed a commit that referenced this pull request Aug 1, 2024
# Description of changes

For this PR I took an alternative approach to #18 and #22 and created a
Dockerfile with build and start scripts that require minimal
intervention and interaction with the codebase.

The only change to the codebase is in the .gitignore, where I added
.DS_Store, which can be helpful in the future for contributors using
macOS.

PR #22 is a great start but maybe oversimplified.

PR #18 has good thought put into it, but I wanted to stay close to the
recommended way of running Unity Catalog as outlined in the project's
README. I tried not to fiddle directly with the jars and use the
provided `/bin/start-uc-server` to run the catalog. With this approach
the Dockerfile remains focused on building the environment and any
changes to how the environment should run can be made in the future
inside the `start-uc-server` script rather than the Dockerfile.

# Rationale of the PR

This pull request introduces a way to run Unity Catalog using Docker
containers. It provides a Dockerfile that builds the necessary
environment and separate bash scripts for building and starting the
catalog. This simplifies the process for users by requiring minimal
interaction with the codebase itself. The included README provides
detailed instructions on how to use these scripts to build and run the
Unity Catalog container.

> [!NOTE]
> The `README.md` contains two API calls that create an external and a
> managed table.
> These APIs are not working yet because they are not supported by the
> catalog yet.

Signed-off-by: Jean Boutros <[email protected]>

---------

Signed-off-by: Jean Boutros <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: Denny Lee <[email protected]>
dennyglee pushed a commit that referenced this pull request Sep 2, 2024
tdas pushed a commit that referenced this pull request Sep 5, 2024
rtyler pushed a commit to rtyler/unitycatalog that referenced this pull request Sep 5, 2024
kevinzwang pushed a commit to kevinzwang/unitycatalog that referenced this pull request Oct 10, 2024
Signed-off-by: Kevin Wang <[email protected]>