Run Unity Catalog in a Docker container #116
…e hallucinations of the beloved LLM
PR looks cool, but I'm not qualified to give a technical review. Can you just confirm that this PR moves us in the direction of the ultimate objective of having an image on Dockerhub that users can easily pull to run UC locally?
@MrPowers I just checked and there isn't an official account for unitycatalog on hub.docker.com. Setting one up would require some GitHub Actions configuration, such as secrets, to authenticate with Dockerhub and be able to push the image. This PR is one step closer to the ultimate objective 😄
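For illustration only, the authenticate-and-push step described in this comment might look roughly like the sketch below. The variable names `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` and the function `push_image` are hypothetical; in a GitHub Actions workflow they would typically be filled from repository secrets. No official image name exists yet, so the tags are left to the caller.

```shell
# Sketch: log in to Docker Hub and push a locally built image.
# DOCKERHUB_USERNAME / DOCKERHUB_TOKEN are hypothetical names, assumed to be
# provided by the CI environment (for example from repository secrets).
push_image() {
  local local_tag="$1"    # tag of the image built locally
  local remote_tag="$2"   # tag to publish under, chosen by the caller

  if [ -z "${DOCKERHUB_USERNAME:-}" ] || [ -z "${DOCKERHUB_TOKEN:-}" ]; then
    echo "DOCKERHUB_USERNAME and DOCKERHUB_TOKEN must be set" >&2
    return 1
  fi

  # Pass the token on stdin so it does not show up in the process list.
  printf '%s' "$DOCKERHUB_TOKEN" \
    | docker login --username "$DOCKERHUB_USERNAME" --password-stdin || return 1

  docker tag "$local_tag" "$remote_tag" || return 1
  docker push "$remote_tag"
}
```

Reading the token from stdin rather than passing it as a command-line argument keeps it out of shell history and process listings.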
@Fokko Check out how it has turned out.
@jeanboutros I think it makes sense to move the Dockerfile to the parent directory. Instead of an uber-jar, I think #96 is also a good option.
…R file generated by the assembly task.
…ild and run scripts. Also moved the build and run scripts to the bin folder
@Fokko thanks for the example Dockerfile. I took it and made small changes, such as adding args at the beginning so that the build process can override the default values. I think we're in a good place now. Check the build and run scripts. I still think we need a Docker Compose file, but only once we have stabilised the Dockerfiles. One thing I am concerned about is that I am no longer able to make an API call to the container. Do you think this is related to the way we are building the JAR, or to the flavour of Docker image we have chosen (alpine)?
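As a rough sketch of the override mechanism mentioned here: a build script can forward a value to `docker build --build-arg` only when the caller has set it, so the `ARG` defaults in the Dockerfile apply otherwise. The names `UC_VERSION`, `JAVA_IMAGE`, `build_image` and the default tag `unitycatalog:local` are hypothetical and may not match the ones used in this PR.

```shell
# Sketch: build the image, overriding Dockerfile ARG defaults only on request.
# UC_VERSION and JAVA_IMAGE are hypothetical ARG names used for illustration.
build_image() {
  local image_tag="${1:-unitycatalog:local}"

  # Reuse the positional parameters as the argument list for `docker`.
  set -- build -t "$image_tag"

  if [ -n "${UC_VERSION:-}" ]; then
    set -- "$@" --build-arg "UC_VERSION=${UC_VERSION}"
  fi
  if [ -n "${JAVA_IMAGE:-}" ]; then
    set -- "$@" --build-arg "JAVA_IMAGE=${JAVA_IMAGE}"
  fi

  # Build context is the current directory.
  docker "$@" .
}
```

Called as `UC_VERSION=... build_image my-tag`, this passes a single `--build-arg`; with nothing set, it runs a plain `docker build -t my-tag .`.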
Co-authored-by: Fokko Driesprong <[email protected]> Signed-off-by: Jean Boutros <[email protected]>
… scripts accordingly.
… scripts accordingly.
…check if the container already exists or is running already. Thanks @creechy for the suggestion.
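A check of this kind could be sketched as follows. This is not the script from the PR; the function names `container_state` and `start_container` are hypothetical, and the `docker run` line is kept minimal (a real run script would probably also publish the server port).

```shell
# Sketch: report whether a named container is running, stopped, or missing.
container_state() {
  local name="$1"
  # `docker ps` lists running containers; `docker ps -a` includes stopped ones.
  # grep -Fx matches the whole line literally, avoiding partial name matches.
  if docker ps --format '{{.Names}}' | grep -Fxq -- "$name"; then
    echo "running"
  elif docker ps -a --format '{{.Names}}' | grep -Fxq -- "$name"; then
    echo "stopped"
  else
    echo "missing"
  fi
}

# Sketch: act on the state instead of failing on a name conflict.
start_container() {
  local name="$1" image="$2"
  case "$(container_state "$name")" in
    running) echo "Container ${name} is already running." ;;
    stopped) docker start "$name" ;;
    missing) docker run -d --name "$name" "$image" ;;
  esac
}
```

Checking first avoids the name-conflict error that `docker run --name` raises when a container with that name already exists.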
On it now, sorry it took a while for me to respond.
… was throwing an error when trying to read a table from the catalog
@farkas93 the error in the CLI is fixed. Can you try now?
I will be back from holidays on Wednesday next week. Will try then :)
# Description of changes

For this PR I took an alternative approach to unitycatalog#18 and unitycatalog#22 and created a Dockerfile with build and start scripts that require minimal intervention and interaction with the codebase.

The only change to the codebase is in .gitignore, where I added .DS_Store, which can be helpful in the future for contributors using macOS.

PR unitycatalog#22 is a great start but maybe oversimplified. PR unitycatalog#18 has good thought put into it, but I wanted to stay close to the recommended way of running Unity Catalog as outlined in the project's README. I tried not to fiddle directly with the jars and instead use the provided `/bin/start-uc-server` to run the catalog. With this approach the Dockerfile remains focused on building the environment, and any changes to how the environment should run can be made in the future inside the `start-uc-server` script rather than the Dockerfile.

# Rationale of the PR

This pull request introduces a way to run Unity Catalog using Docker containers. It provides a Dockerfile that builds the necessary environment and separate bash scripts for building and starting the catalog. This simplifies the process for users by requiring minimal interaction with the codebase itself. The included README provides detailed instructions on how to use these scripts to build and run the Unity Catalog container.

> [!NOTE]
> The `README.md` contains two API calls that create an external and a managed table.
> These APIs are not working yet because they are not supported by the catalog yet.

Signed-off-by: Jean Boutros <[email protected]>

---------

Signed-off-by: Jean Boutros <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: Denny Lee <[email protected]>
Signed-off-by: Kevin Wang <[email protected]>
hey, I just faced the same problem when running the
I think this is related to the compression of parquet files and decompressing them on the fly when reading a table from the catalog.
Description of changes
For this PR I took an alternative approach to #18 and #22 and created a Dockerfile with build and start scripts that require minimal intervention and interaction with the codebase.
The only change to the codebase is in .gitignore, where I added .DS_Store, which can be helpful in the future for contributors using macOS.
PR #22 is a great start but maybe oversimplified.
PR #18 has good thought put into it, but I wanted to stay close to the recommended way of running Unity Catalog as outlined in the project's README. I tried not to fiddle directly with the jars and instead use the provided `/bin/start-uc-server` to run the catalog. With this approach the Dockerfile remains focused on building the environment, and any changes to how the environment should run can be made in the future inside the `start-uc-server` script rather than the Dockerfile.
Rationale of the PR
This pull request introduces a way to run Unity Catalog using Docker containers. It provides a Dockerfile that builds the necessary environment and separate bash scripts for building and starting the catalog. This simplifies the process for users by requiring minimal interaction with the codebase itself. The included README provides detailed instructions on how to use these scripts to build and run the Unity Catalog container.
Note
The `README.md` contains two API calls that create an external and a managed table.
These APIs are not working yet because they are not supported by the catalog yet.
Signed-off-by: Jean Boutros [email protected]